Re: [I] Improve error message for wrong argument type in operators [datafusion]

2026-01-26 Thread via GitHub
alamb commented on issue #11250: URL: https://github.com/apache/datafusion/issues/11250#issuecomment-3800806875 I think that would be a good idea -- the key thing to make this happen is find a maintainer that is willing to work with you on it, as well as breaking up the work into smaller ch

Re: [I] Parallelize `list_files_for_scan` [datafusion]

2026-01-26 Thread via GitHub
BlakeOrth commented on issue #19971: URL: https://github.com/apache/datafusion/issues/19971#issuecomment-3800841802 I'm sure there are performance gains to be had during the file listing phase of a cold query. I'm skeptical (read: actual evidence of performance improvement should be require

Re: [PR] chore: Start 0.14.0 development [datafusion-comet]

2026-01-26 Thread via GitHub
comphead commented on code in PR #3288: URL: https://github.com/apache/datafusion-comet/pull/3288#discussion_r2728640407 ## dev/diffs/iceberg-rust/1.10.0.diff: ## @@ -25,7 +25,7 @@ index eeabe54f5..867018058 100644 caffeine = "2.9.3" calcite = "1.40.0" -comet = "0.8.1" Rev

Re: [PR] Blog post about CASE optimization [datafusion-site]

2026-01-26 Thread via GitHub
rluvaton commented on code in PR #122: URL: https://github.com/apache/datafusion-site/pull/122#discussion_r2728685836 ## content/blog/2026-01-26-datafusion_case.md: ## @@ -0,0 +1,456 @@ +--- +layout: post +title: Optimizing SQL CASE Expression Evaluation +date: 2026-01-26 +autho

Re: [PR] Physical-level placeholders [datafusion]

2026-01-26 Thread via GitHub
askalt commented on PR #20009: URL: https://github.com/apache/datafusion/pull/20009#issuecomment-3800971416 cc @Omega359 (I remember you were also interested in placeholders within physical plans). -- This is an automated message from the Apache Git Service. To respond to the message, ple

Re: [PR] perf: Optimize ArrowBytesViewMap with direct view access [datafusion]

2026-01-26 Thread via GitHub
Tushar7012 commented on code in PR #19975: URL: https://github.com/apache/datafusion/pull/19975#discussion_r272632 ## datafusion/catalog-listing/src/table.rs: ## @@ -712,20 +712,21 @@ impl ListingTable { }); }; // list files (with partitions) -

[PR] feat: window RANGE calculation optimization [datafusion]

2026-01-26 Thread via GitHub
akoshchiy opened a new pull request, #20014: URL: https://github.com/apache/datafusion/pull/20014 ## Which issue does this PR close? - Related #15607 . ## Rationale for this change As discussed in #15607, range calculation is one of the bottlenecks in window processi

Re: [PR] fix: Make `generate_series` return an empty set with invalid ranges [datafusion]

2026-01-26 Thread via GitHub
nuno-faria commented on code in PR #1: URL: https://github.com/apache/datafusion/pull/1#discussion_r2729021932 ## docs/source/library-user-guide/upgrading.md: ## @@ -154,6 +154,29 @@ The builder pattern is more efficient as it computes properties once during `bui Not

Re: [PR] feat: Add V2 scan support for `native_iceberg_compat` [datafusion-comet]

2026-01-26 Thread via GitHub
andygrove closed pull request #3272: feat: Add V2 scan support for `native_iceberg_compat` URL: https://github.com/apache/datafusion-comet/pull/3272

Re: [PR] feat: Add V2 scan support for native_datafusion [datafusion-comet]

2026-01-26 Thread via GitHub
andygrove closed pull request #3276: feat: Add V2 scan support for native_datafusion URL: https://github.com/apache/datafusion-comet/pull/3276

Re: [PR] feat: add support for `width_bucket` expression [datafusion-comet]

2026-01-26 Thread via GitHub
davidlghellin commented on PR #3273: URL: https://github.com/apache/datafusion-comet/pull/3273#issuecomment-3801327868 ```sh # Spark 3.4 (width_bucket NO) ./mvnw test -Pspark-3.4 -Dsuites="org.apache.comet.CometMathExpressionSuite" -Dtests="width_bucket" # Spark 3.5 ./mvnw t

Re: [PR] docs: fix bug in placement of prettier-ignore-end in generated docs [datafusion-comet]

2026-01-26 Thread via GitHub
andygrove merged PR #3287: URL: https://github.com/apache/datafusion-comet/pull/3287

Re: [PR] chore: Start 0.14.0 development [datafusion-comet]

2026-01-26 Thread via GitHub
andygrove commented on code in PR #3288: URL: https://github.com/apache/datafusion-comet/pull/3288#discussion_r2728648180 ## dev/diffs/iceberg-rust/1.10.0.diff: ## @@ -25,7 +25,7 @@ index eeabe54f5..867018058 100644 caffeine = "2.9.3" calcite = "1.40.0" -comet = "0.8.1" Re

Re: [PR] chore: Start 0.14.0 development [datafusion-comet]

2026-01-26 Thread via GitHub
andygrove merged PR #3288: URL: https://github.com/apache/datafusion-comet/pull/3288

Re: [PR] Improve documentation for ScalarUDFImpl::preimage [datafusion]

2026-01-26 Thread via GitHub
alamb commented on code in PR #20008: URL: https://github.com/apache/datafusion/pull/20008#discussion_r2728566147 ## datafusion/expr/src/udf.rs: ## @@ -709,20 +709,49 @@ pub trait ScalarUDFImpl: Debug + DynEq + DynHash + Send + Sync { Ok(ExprSimplifyResult::Original(ar

Re: [PR] chore: Add take/untake workflow for issue self-assignment [datafusion-comet]

2026-01-26 Thread via GitHub
comphead merged PR #3270: URL: https://github.com/apache/datafusion-comet/pull/3270

Re: [PR] bug: Fix string decimal type throw right exception [datafusion-comet]

2026-01-26 Thread via GitHub
coderfender commented on code in PR #3248: URL: https://github.com/apache/datafusion-comet/pull/3248#discussion_r2729367411 ## native/spark-expr/src/conversion_funcs/cast.rs: ## @@ -2351,73 +2347,98 @@ fn parse_string_to_decimal(s: &str, precision: u8, scale: i8) -> SparkResult

[PR] chore(deps): bump org.assertj:assertj-core from 3.23.1 to 3.27.7 [datafusion-comet]

2026-01-26 Thread via GitHub
dependabot[bot] opened a new pull request, #3293: URL: https://github.com/apache/datafusion-comet/pull/3293 Bumps [org.assertj:assertj-core](https://github.com/assertj/assertj) from 3.23.1 to 3.27.7. Release notes Sourced from https://github.com/assertj/assertj/releases org.asser

Re: [PR] Various performance improvements [datafusion]

2026-01-26 Thread via GitHub
Dandandan commented on PR #20013: URL: https://github.com/apache/datafusion/pull/20013#issuecomment-3801914569 run benchmarks

Re: [PR] Improve documentation for ScalarUDFImpl::preimage [datafusion]

2026-01-26 Thread via GitHub
drin commented on code in PR #20008: URL: https://github.com/apache/datafusion/pull/20008#discussion_r2729394863 ## datafusion/expr/src/udf.rs: ## @@ -709,20 +709,49 @@ pub trait ScalarUDFImpl: Debug + DynEq + DynHash + Send + Sync { Ok(ExprSimplifyResult::Original(arg

Re: [I] Parallelize `list_files_for_scan` [datafusion]

2026-01-26 Thread via GitHub
Tushar7012 commented on issue #19971: URL: https://github.com/apache/datafusion/issues/19971#issuecomment-3801934957 ## Rationale As discussed in #19971, `list_files_for_scan` can b

Re: [PR] ci: Consolidate Spark SQL test jobs to reduce CI time [datafusion-comet]

2026-01-26 Thread via GitHub
andygrove commented on PR #3271: URL: https://github.com/apache/datafusion-comet/pull/3271#issuecomment-3801929903 > Thanks @andygrove I def like the clean up, however I'm not sure how 15% fewer jobs are calculated, this PR has 118 checks completed vs #3266 (122 checks) 15% fewer Sp

Re: [PR] Improve documentation for ScalarUDFImpl::preimage [datafusion]

2026-01-26 Thread via GitHub
drin commented on code in PR #20008: URL: https://github.com/apache/datafusion/pull/20008#discussion_r2729411211 ## datafusion/expr/src/udf.rs: ## @@ -709,20 +709,49 @@ pub trait ScalarUDFImpl: Debug + DynEq + DynHash + Send + Sync { Ok(ExprSimplifyResult::Original(arg

Re: [PR] Improve documentation for ScalarUDFImpl::preimage [datafusion]

2026-01-26 Thread via GitHub
drin commented on code in PR #20008: URL: https://github.com/apache/datafusion/pull/20008#discussion_r2729423939 ## datafusion/expr/src/udf.rs: ## @@ -709,20 +709,49 @@ pub trait ScalarUDFImpl: Debug + DynEq + DynHash + Send + Sync { Ok(ExprSimplifyResult::Original(arg

Re: [PR] Various performance improvements [datafusion]

2026-01-26 Thread via GitHub
alamb-ghbot commented on PR #20013: URL: https://github.com/apache/datafusion/pull/20013#issuecomment-3801983889 🤖 `./gh_compare_branch.sh` [gh_compare_branch.sh](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/gh_compare_branch.sh) Running Linux aal-dev 6.14.0-1018-gc

Re: [PR] Various performance improvements [datafusion]

2026-01-26 Thread via GitHub
alamb-ghbot commented on PR #20013: URL: https://github.com/apache/datafusion/pull/20013#issuecomment-3801983648 🤖: Benchmark completed Details ``` Comparing HEAD and various_groupby_perf Benchmark clickbench_extended.json ---

Re: [PR] Support JSON arrays reader/parse for datafusion [datafusion]

2026-01-26 Thread via GitHub
martin-g commented on code in PR #19924: URL: https://github.com/apache/datafusion/pull/19924#discussion_r2727377529 ## datafusion/proto-common/src/from_proto/mod.rs: ## @@ -1105,6 +1105,7 @@ impl TryFrom<&protobuf::JsonOptions> for JsonOptions { compression: compre

Re: [PR] Support JSON arrays reader/parse for datafusion [datafusion]

2026-01-26 Thread via GitHub
martin-g commented on code in PR #19924: URL: https://github.com/apache/datafusion/pull/19924#discussion_r2727432142 ## datafusion/datasource-json/src/source.rs: ## @@ -222,27 +257,69 @@ impl FileOpener for JsonOpener { } }; -

[PR] refactor: extract pushdown test utilities to shared module [datafusion]

2026-01-26 Thread via GitHub
adriangb opened a new pull request, #20010: URL: https://github.com/apache/datafusion/pull/20010 ## Summary Move TestSource, TestOpener, TestScanBuilder, OptimizationTest and related utilities from `filter_pushdown/util.rs` to a new shared `pushdown_utils.rs` module. This allows thes

Re: [PR] perf: improve performance of `array_remove`, `array_remove_n` and `array_remove_all` functions [datafusion]

2026-01-26 Thread via GitHub
lyne7-sc commented on PR #19996: URL: https://github.com/apache/datafusion/pull/19996#issuecomment-3799740500 Thanks for the review. Added a TODO, will validate performance for nested datatypes later on.

Re: [PR] fix: Make `generate_series` return an empty set with invalid ranges [datafusion]

2026-01-26 Thread via GitHub
martin-g commented on code in PR #1: URL: https://github.com/apache/datafusion/pull/1#discussion_r2727724842 ## docs/source/library-user-guide/upgrading.md: ## @@ -154,6 +154,29 @@ The builder pattern is more efficient as it computes properties once during `bui Note:

Re: [PR] build: Fix docs workflow dependency resolution failure [datafusion-comet]

2026-01-26 Thread via GitHub
andygrove commented on PR #3275: URL: https://github.com/apache/datafusion-comet/pull/3275#issuecomment-3799769798 I need to test this more

Re: [PR] Automatically generate examples documentation adv (#19294) [datafusion]

2026-01-26 Thread via GitHub
cj-zhukov commented on PR #19750: URL: https://github.com/apache/datafusion/pull/19750#issuecomment-3799793562 @Jefffrey Thanks for the review! I’ve fixed the capitalization issues for group titles (e.g. Data IO, SQL Ops, UDF, etc.) by introducing explicit handling for common abbrevia

[PR] chore: [branch-0.13] Prepare 0.13.0 release [datafusion-comet]

2026-01-26 Thread via GitHub
andygrove opened a new pull request, #3285: URL: https://github.com/apache/datafusion-comet/pull/3285 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## How are these changes

Re: [PR] docs: Fix some broken / missing links in the DataFusion documentation [datafusion]

2026-01-26 Thread via GitHub
alamb merged PR #19958: URL: https://github.com/apache/datafusion/pull/19958

Re: [PR] perf: Optimize ArrowBytesViewMap with direct view access [datafusion]

2026-01-26 Thread via GitHub
Dandandan commented on code in PR #19975: URL: https://github.com/apache/datafusion/pull/19975#discussion_r2727452902 ## datafusion/physical-expr-common/src/binary_view_map.rs: ## @@ -250,52 +265,92 @@ where // step 2: insert each value into the set, if not already pres

Re: [PR] docs: Fix some broken / missing links in the DataFusion documentation [datafusion]

2026-01-26 Thread via GitHub
alamb commented on PR #19958: URL: https://github.com/apache/datafusion/pull/19958#issuecomment-3799398828 Thanks @Jefffrey and @martin-g

Re: [PR] perf: Optimize ArrowBytesViewMap with direct view access [datafusion]

2026-01-26 Thread via GitHub
Dandandan commented on code in PR #19975: URL: https://github.com/apache/datafusion/pull/19975#discussion_r2727451991 ## datafusion/physical-expr-common/src/binary_view_map.rs: ## @@ -250,52 +265,92 @@ where // step 2: insert each value into the set, if not already pres

Re: [I] Building release JARs fails locally on macOS [datafusion-comet]

2026-01-26 Thread via GitHub
andygrove closed issue #3261: Building release JARs fails locally on macOS URL: https://github.com/apache/datafusion-comet/issues/3261

Re: [PR] fix: Add JDK to Docker image for release build [datafusion-comet]

2026-01-26 Thread via GitHub
andygrove merged PR #3262: URL: https://github.com/apache/datafusion-comet/pull/3262

[I] Physical plan proto roundtrip for null-valued scalar of a `Struct(Dict)` data type [datafusion]

2026-01-26 Thread via GitHub
dispanser opened a new issue, #20011: URL: https://github.com/apache/datafusion/issues/20011 ### Describe the bug We're currently upgrading from datafusion 45 to datafusion 46 :see-no-evil: and came across an issue related to dictionary ids during protobuf serde. Here's a small rep

Re: [I] Make PhysicalExprAdapterFactory::create fallible [datafusion]

2026-01-26 Thread via GitHub
adriangb commented on issue #19956: URL: https://github.com/apache/datafusion/issues/19956#issuecomment-3799641052 Ok let's see a PR diff and go from there!

Re: [I] Improve error message for wrong argument type in operators [datafusion]

2026-01-26 Thread via GitHub
Acfboy commented on issue #11250: URL: https://github.com/apache/datafusion/issues/11250#issuecomment-3799642470 @alamb Hi, I noticed that operator error handling doesn't share the same framework as scalar functions. The scattered type inference logic makes operator error messages less user

Re: [PR] fix: Normalize multi-line notes in GenerateDocs [datafusion-comet]

2026-01-26 Thread via GitHub
codecov-commenter commented on PR #3286: URL: https://github.com/apache/datafusion-comet/pull/3286#issuecomment-3800049806 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/3286?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [PR] feat: optimize CASE WHEN for divide-by-zero protection pattern [datafusion]

2026-01-26 Thread via GitHub
alamb commented on PR #19994: URL: https://github.com/apache/datafusion/pull/19994#issuecomment-3800035709 Also FYI @pepijnve

Re: [PR] feat: optimize CASE WHEN for divide-by-zero protection pattern [datafusion]

2026-01-26 Thread via GitHub
alamb-ghbot commented on PR #19994: URL: https://github.com/apache/datafusion/pull/19994#issuecomment-3800035876 🤖 `./gh_compare_branch.sh` [gh_compare_branch.sh](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/gh_compare_branch.sh) Running Linux aal-dev 6.14.0-1018-gc

Re: [PR] feat: optimize CASE WHEN for divide-by-zero protection pattern [datafusion]

2026-01-26 Thread via GitHub
alamb commented on PR #19994: URL: https://github.com/apache/datafusion/pull/19994#issuecomment-3800035365 run benchmark tpcds

Re: [PR] fix: Normalize multi-line notes in GenerateDocs [datafusion-comet]

2026-01-26 Thread via GitHub
andygrove closed pull request #3286: fix: Normalize multi-line notes in GenerateDocs URL: https://github.com/apache/datafusion-comet/pull/3286

[PR] docs: fix bug in placement of prettier-ignore-end in generated docs [datafusion-comet]

2026-01-26 Thread via GitHub
andygrove opened a new pull request, #3287: URL: https://github.com/apache/datafusion-comet/pull/3287 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## How are these changes

[PR] Start 0.14.0 development [datafusion-comet]

2026-01-26 Thread via GitHub
andygrove opened a new pull request, #3288: URL: https://github.com/apache/datafusion-comet/pull/3288 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## How are these changes

Re: [PR] feat: optimize CASE WHEN for divide-by-zero protection pattern [datafusion]

2026-01-26 Thread via GitHub
pepijnve commented on PR #19994: URL: https://github.com/apache/datafusion/pull/19994#issuecomment-3800082400 It might be useful to add a microbenchmark to https://github.com/apache/datafusion/blob/main/datafusion/physical-expr/benches/case_when.rs so we can compare before/after. I don't th

Re: [PR] Fix parsing of :: cast after parenthesized DEFAULT expression [datafusion-sqlparser-rs]

2026-01-26 Thread via GitHub
isaacparker0 commented on PR #2168: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/2168#issuecomment-3800095077 @iffyio looks like you reviewed https://github.com/apache/datafusion-sqlparser-rs/pull/1927, perhaps you have the most context to review this PR?

Re: [PR] chore: [branch-0.13] Prepare 0.13.0 release [datafusion-comet]

2026-01-26 Thread via GitHub
codecov-commenter commented on PR #3285: URL: https://github.com/apache/datafusion-comet/pull/3285#issuecomment-3800118754 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/3285?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [PR] Improve documentation for ScalarUDFImpl::preimage [datafusion]

2026-01-26 Thread via GitHub
sdf-jkl commented on code in PR #20008: URL: https://github.com/apache/datafusion/pull/20008#discussion_r2728255986 ## datafusion/expr/src/udf.rs: ## @@ -709,20 +709,49 @@ pub trait ScalarUDFImpl: Debug + DynEq + DynHash + Send + Sync { Ok(ExprSimplifyResult::Original(

Re: [PR] perf: optimize shuffle array element iteration with slice-based append [datafusion-comet]

2026-01-26 Thread via GitHub
andygrove closed pull request #3222: perf: optimize shuffle array element iteration with slice-based append URL: https://github.com/apache/datafusion-comet/pull/3222

[PR] perf: Improve shuffle performance with complex types [WIP] [datafusion-comet]

2026-01-26 Thread via GitHub
andygrove opened a new pull request, #3289: URL: https://github.com/apache/datafusion-comet/pull/3289 ## Summary This PR optimizes native shuffle performance for complex types (arrays and nested structs). These optimizations reduce type dispatch overhead and improve cache locality du

Re: [PR] perf: optimize shuffle array element iteration with slice-based append [datafusion-comet]

2026-01-26 Thread via GitHub
andygrove commented on PR #3222: URL: https://github.com/apache/datafusion-comet/pull/3222#issuecomment-3800397851 replaced with https://github.com/apache/datafusion-comet/pull/3289 so I can work on all optimizations in a single PR for now

Re: [PR] Blog post about CASE optimization [datafusion-site]

2026-01-26 Thread via GitHub
pepijnve commented on code in PR #122: URL: https://github.com/apache/datafusion-site/pull/122#discussion_r2726657227 ## content/blog/2025-11-11-datafusion_case.md: ## @@ -0,0 +1,377 @@ +--- +layout: post +title: Optimizing CASE Expression Evaluation +date: 2025-11-11 +author: P

[PR] chore(deps): bump actions/upload-artifact from 4 to 6 [datafusion-comet]

2026-01-26 Thread via GitHub
dependabot[bot] opened a new pull request, #3280: URL: https://github.com/apache/datafusion-comet/pull/3280 Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4 to 6. Release notes Sourced from https://github.com/actions/upload-artifact/releases acti

Re: [PR] chore(deps): bump object_store from 0.12.4 to 0.13.0 in /native [datafusion-comet]

2026-01-26 Thread via GitHub
dependabot[bot] commented on PR #3008: URL: https://github.com/apache/datafusion-comet/pull/3008#issuecomment-3798610760 Superseded by #3283.

[PR] chore(deps): bump actions/cache from 4 to 5 [datafusion-comet]

2026-01-26 Thread via GitHub
dependabot[bot] opened a new pull request, #3279: URL: https://github.com/apache/datafusion-comet/pull/3279 Bumps [actions/cache](https://github.com/actions/cache) from 4 to 5. Release notes Sourced from https://github.com/actions/cache/releases actions/cache's releases. v

[PR] chore(deps): bump actions/download-artifact from 4 to 7 [datafusion-comet]

2026-01-26 Thread via GitHub
dependabot[bot] opened a new pull request, #3281: URL: https://github.com/apache/datafusion-comet/pull/3281 Bumps [actions/download-artifact](https://github.com/actions/download-artifact) from 4 to 7. Release notes Sourced from https://github.com/actions/download-artifact/releases

[PR] chore(deps): bump cc from 1.2.53 to 1.2.54 in /native [datafusion-comet]

2026-01-26 Thread via GitHub
dependabot[bot] opened a new pull request, #3284: URL: https://github.com/apache/datafusion-comet/pull/3284 Bumps [cc](https://github.com/rust-lang/cc-rs) from 1.2.53 to 1.2.54. Release notes Sourced from https://github.com/rust-lang/cc-rs/releases cc's releases. cc-v1.2.54

Re: [PR] chore(deps): bump object_store from 0.12.4 to 0.13.0 in /native [datafusion-comet]

2026-01-26 Thread via GitHub
dependabot[bot] closed pull request #3008: chore(deps): bump object_store from 0.12.4 to 0.13.0 in /native URL: https://github.com/apache/datafusion-comet/pull/3008

[PR] chore(deps): bump object_store from 0.12.5 to 0.13.1 in /native [datafusion-comet]

2026-01-26 Thread via GitHub
dependabot[bot] opened a new pull request, #3283: URL: https://github.com/apache/datafusion-comet/pull/3283 Bumps [object_store](https://github.com/apache/arrow-rs-object-store) from 0.12.5 to 0.13.1. Changelog Sourced from https://github.com/apache/arrow-rs-object-store/blob/main/

Re: [PR] perf: Optimize ArrowBytesViewMap with direct view access [datafusion]

2026-01-26 Thread via GitHub
alamb-ghbot commented on PR #19975: URL: https://github.com/apache/datafusion/pull/19975#issuecomment-3798425933 🤖: Benchmark completed Details ``` Comparing HEAD and optimize-arrow-bytes-view-map Benchmark clickbench_extended.json --

Re: [PR] Do not require mut in memory reservation methods [datafusion]

2026-01-26 Thread via GitHub
gabotechs commented on PR #19759: URL: https://github.com/apache/datafusion/pull/19759#issuecomment-3798469783 benchmarks run tpch

Re: [PR] [MySQL, Oracle] Parse optimizer hints [datafusion-sqlparser-rs]

2026-01-26 Thread via GitHub
xitep commented on code in PR #2162: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/2162#discussion_r2727019526 ## src/parser/mod.rs: ## @@ -14005,6 +14015,59 @@ impl<'a> Parser<'a> { }) } +/// Parses an optional optimizer hint at the current to

Re: [PR] Improve documentation for ScalarUDFImpl::preimage [datafusion]

2026-01-26 Thread via GitHub
alamb commented on code in PR #20008: URL: https://github.com/apache/datafusion/pull/20008#discussion_r2727547648 ## datafusion/expr/src/udf.rs: ## @@ -709,20 +709,49 @@ pub trait ScalarUDFImpl: Debug + DynEq + DynHash + Send + Sync { Ok(ExprSimplifyResult::Original(ar

[PR] Physical-level placeholders [datafusion]

2026-01-26 Thread via GitHub
LLDay opened a new pull request, #20009: URL: https://github.com/apache/datafusion/pull/20009 ## Rationale for this change This PR covers a part of #14342. Currently, DataFusion's support for placeholders is handled at the logical plan level. This requires re-planning every tim

Re: [I] Optimize the evaluation of `DATE_TRUNC() == )` when pushed down [datafusion]

2026-01-26 Thread via GitHub
alamb commented on issue #18319: URL: https://github.com/apache/datafusion/issues/18319#issuecomment-3799533539 > That being said, if the preimage API can be easily extended to other functions then I think preimage is a great name. For example, if a function can register its inverse (its "p

Re: [PR] Support JSON arrays reader/parse for datafusion [datafusion]

2026-01-26 Thread via GitHub
zhuqi-lucas commented on code in PR #19924: URL: https://github.com/apache/datafusion/pull/19924#discussion_r2727580997 ## datafusion/datasource-json/src/source.rs: ## @@ -222,27 +257,69 @@ impl FileOpener for JsonOpener { } }; -

Re: [PR] Support JSON arrays reader/parse for datafusion [datafusion]

2026-01-26 Thread via GitHub
martin-g commented on code in PR #19924: URL: https://github.com/apache/datafusion/pull/19924#discussion_r2727372394 ## datafusion/datasource-json/src/mod.rs: ## @@ -23,5 +23,7 @@ pub mod file_format; pub mod source; +pub mod utils; pub use file_format::*; +pub use utils:

Re: [PR] Snowflake: add Snowflake multi table insert support & add support for sample in subquery [datafusion-sqlparser-rs]

2026-01-26 Thread via GitHub
finchxxia commented on code in PR #2148: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/2148#discussion_r2727465130 ## src/ast/dml.rs: ## @@ -90,92 +90,174 @@ pub struct Insert { /// /// [ClickHouse formats JSON insert](https://clickhouse.com/docs/en/int

Re: [PR] Support JSON arrays reader/parse for datafusion [datafusion]

2026-01-26 Thread via GitHub
zhuqi-lucas commented on code in PR #19924: URL: https://github.com/apache/datafusion/pull/19924#discussion_r2727585302 ## datafusion/proto-common/proto/datafusion_common.proto: ## @@ -469,6 +469,7 @@ message JsonOptions { CompressionTypeVariant compression = 1; // Compressio

Re: [PR] Support JSON arrays reader/parse for datafusion [datafusion]

2026-01-26 Thread via GitHub
zhuqi-lucas commented on PR #19924: URL: https://github.com/apache/datafusion/pull/19924#issuecomment-3799579050 Thank you @martin-g for the review; I addressed your comments in the latest commit.

Re: [PR] Physical-level placeholders [datafusion]

2026-01-26 Thread via GitHub
LLDay commented on PR #20009: URL: https://github.com/apache/datafusion/pull/20009#issuecomment-3799577370 @askalt, could you review the PR as well?

Re: [PR] Support JSON arrays reader/parse for datafusion [datafusion]

2026-01-26 Thread via GitHub
zhuqi-lucas commented on code in PR #19924: URL: https://github.com/apache/datafusion/pull/19924#discussion_r2727584418 ## datafusion/datasource-json/src/mod.rs: ## @@ -23,5 +23,7 @@ pub mod file_format; pub mod source; +pub mod utils; pub use file_format::*; +pub use uti

Re: [PR] Support JSON arrays reader/parse for datafusion [datafusion]

2026-01-26 Thread via GitHub
zhuqi-lucas commented on code in PR #19924: URL: https://github.com/apache/datafusion/pull/19924#discussion_r2727586033 ## datafusion/datasource-json/src/utils.rs: ## @@ -0,0 +1,389 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license

Re: [PR] Support JSON arrays reader/parse for datafusion [datafusion]

2026-01-26 Thread via GitHub
zhuqi-lucas commented on code in PR #19924: URL: https://github.com/apache/datafusion/pull/19924#discussion_r2727587680 ## datafusion/datasource-json/src/utils.rs: ## @@ -0,0 +1,389 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license

Re: [PR] Support JSON arrays reader/parse for datafusion [datafusion]

2026-01-26 Thread via GitHub
zhuqi-lucas commented on code in PR #19924: URL: https://github.com/apache/datafusion/pull/19924#discussion_r2727588899 ## datafusion/core/src/datasource/file_format/options.rs: ## @@ -442,7 +442,9 @@ impl<'a> AvroReadOptions<'a> { } } -/// Options that control the readi

Re: [PR] feat: Extract NDV (distinct_count) statistics from Parquet metadata [datafusion]

2026-01-26 Thread via GitHub
asolimando commented on code in PR #19957: URL: https://github.com/apache/datafusion/pull/19957#discussion_r2727672710 ## datafusion/datasource-parquet/src/metadata.rs: ## @@ -411,53 +415,6 @@ fn create_max_min_accs( (max_values, min_values) } -fn get_col_stats( Review

Re: [PR] feat: optimize CASE WHEN for divide-by-zero protection pattern [datafusion]

2026-01-26 Thread via GitHub
alamb-ghbot commented on PR #19994: URL: https://github.com/apache/datafusion/pull/19994#issuecomment-3800169538 🤖: Benchmark completed Details ``` Comparing HEAD and raymond_11570-optimize-case-when Benchmark tpcds_sf1.json -

Re: [PR] chore: [branch-0.13] Prepare 0.13.0 release [datafusion-comet]

2026-01-26 Thread via GitHub
andygrove commented on PR #3285: URL: https://github.com/apache/datafusion-comet/pull/3285#issuecomment-3800166868 > I'm comparing to the 0.12 counterpart PR https://github.com/apache/datafusion-comet/pull/2808/changes > > It looks like the difference in line count here is due to the

Re: [PR] docs: fix bug in placement of prettier-ignore-end in generated docs [datafusion-comet]

2026-01-26 Thread via GitHub
codecov-commenter commented on PR #3287: URL: https://github.com/apache/datafusion-comet/pull/3287#issuecomment-3800202584 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/3287?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [I] spark `make_interval` need to have custom nullability [datafusion]

2026-01-26 Thread via GitHub
CuteChuanChuan commented on issue #19155: URL: https://github.com/apache/datafusion/issues/19155#issuecomment-3800252424 take

Re: [PR] chore: Start 0.14.0 development [datafusion-comet]

2026-01-26 Thread via GitHub
codecov-commenter commented on PR #3288: URL: https://github.com/apache/datafusion-comet/pull/3288#issuecomment-3800259434 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/3288?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [PR] Blog post about CASE optimization [datafusion-site]

2026-01-26 Thread via GitHub
pepijnve commented on PR #122: URL: https://github.com/apache/datafusion-site/pull/122#issuecomment-3799441326 @alamb I've gone through the article entirely again. Would be good to get a fresh pair of eyes to review it again; I've been looking at this thing for too long.

Re: [PR] Fix datediff array length mismatch for dictionary-backed timestamps [datafusion-comet]

2026-01-26 Thread via GitHub
codecov-commenter commented on PR #3278: URL: https://github.com/apache/datafusion-comet/pull/3278#issuecomment-3799446613 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/3278?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [I] Speedup statistics_from_parquet_metadata (DataFusion side) [datafusion]

2026-01-26 Thread via GitHub
Dandandan closed issue #20005: Speedup statistics_from_parquet_metadata (DataFusion side) URL: https://github.com/apache/datafusion/issues/20005

Re: [PR] Speedup statistics_from_parquet_metadata [datafusion]

2026-01-26 Thread via GitHub
Dandandan merged PR #20004: URL: https://github.com/apache/datafusion/pull/20004

Re: [PR] Speedup statistics_from_parquet_metadata [datafusion]

2026-01-26 Thread via GitHub
Dandandan commented on PR #20004: URL: https://github.com/apache/datafusion/pull/20004#issuecomment-3799505091 🚀

Re: [PR] Speedup statistics_from_parquet_metadata [datafusion]

2026-01-26 Thread via GitHub
Dandandan commented on PR #20004: URL: https://github.com/apache/datafusion/pull/20004#issuecomment-3799664277 > Thanks @Dandandan -- this looks good to me > > It also makes me wonder if there is more performance to be had in our statistics management -- for example, I notice that the

Re: [I] Parallelize `list_files_for_scan` [datafusion]

2026-01-26 Thread via GitHub
Tushar7012 commented on issue #19971: URL: https://github.com/apache/datafusion/issues/19971#issuecomment-3799881475 Thanks for the guidance. I’ll implement the simpler approach: spawn tasks directly inside list_files_for_scan (following #19969), collect with JoinSet, and keep the API uncha

Re: [PR] build: Fix docs workflow dependency resolution failure [datafusion-comet]

2026-01-26 Thread via GitHub
andygrove commented on PR #3275: URL: https://github.com/apache/datafusion-comet/pull/3275#issuecomment-3799897434 > I need to test this more I reverted to using maven and added a specific profile for generating docs. This is ready for review now.

Re: [PR] docs: Mark native_comet scan as deprecated [datafusion-comet]

2026-01-26 Thread via GitHub
andygrove merged PR #3274: URL: https://github.com/apache/datafusion-comet/pull/3274

[PR] fix(expr): coerce literal arguments in return_field_from_args for UDFs [datafusion]

2026-01-26 Thread via GitHub
Trikooo opened a new pull request, #20012: URL: https://github.com/apache/datafusion/pull/20012 ## Which issue does this PR close? - Closes #19982. ## What changes are included in this PR? - Coercion of `ScalarValue`s: `scalar_arguments` now match the types of `arg_field
