Re: [I] Convert `BuiltInScalarFunction::{Rank, PercentRank, DenseRank}` to a user defined functions [datafusion]

2024-09-28 Thread via GitHub
jatin510 commented on issue #12648: URL: https://github.com/apache/datafusion/issues/12648#issuecomment-2380639008 If you don't mind, can I work on this issue, @hailelagi? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and u

Re: [I] Bug with csv type inference [datafusion]

2024-09-28 Thread via GitHub
alamb commented on issue #3174: URL: https://github.com/apache/datafusion/issues/3174#issuecomment-2380641588 > Perhaps we can use a function for match strings here instead of direct regex here, which seems more flexible for me. If you agree on that, I'll try to submit a pr to `arrow-rs` fo

[PR] Fill in missing `Debug` fields for `SessionState` [datafusion]

2024-09-28 Thread via GitHub
AnthonyZhOon opened a new pull request, #12663: URL: https://github.com/apache/datafusion/pull/12663 ## Which issue does this PR close? Uses #12555 to improve the `Debug` impl of `SessionState` ## Rationale for this change Now that the `Debug` for `SessionStateBuilder` co

[I] `REGEX_LIKE` and `REGEX_MATCH` don't support `LargeUtf8` type [datafusion]

2024-09-28 Thread via GitHub
goldmedal opened a new issue, #12664: URL: https://github.com/apache/datafusion/issues/12664 ### Describe the bug While working on #12415, I found `REGEX_LIKE` and `REGEX_MATCH` don't support `LargeUtf8` type. ### To Reproduce It can be reproduced by the following SQL

Re: [I] Implement `Debug` for `SessionStateBuilder` [datafusion]

2024-09-28 Thread via GitHub
AnthonyZhOon commented on issue #12555: URL: https://github.com/apache/datafusion/issues/12555#issuecomment-2380685655 I think a pretty-print debug works okay using a `println!("{:#?}", state)`, most of the noise in the default was from printing long Vecs of scalar functions which have been
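The point about noisy `Debug` output can be illustrated with a minimal sketch. `DemoState` and `demo_state` below are hypothetical stand-ins, not DataFusion's `SessionState`; the sketch only assumes the standard-library `fmt::Formatter::debug_struct` builder and shows one way to print a count in place of a long `Vec`, so that `{:#?}` stays readable.

```rust
use std::fmt;

/// Hypothetical stand-in for a state object that holds many registered functions.
struct DemoState {
    name: String,
    scalar_functions: Vec<String>,
}

impl fmt::Debug for DemoState {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("DemoState")
            .field("name", &self.name)
            // Summarize the long Vec as a count instead of printing every element.
            .field(
                "scalar_functions",
                &format_args!("<{} functions>", self.scalar_functions.len()),
            )
            .finish()
    }
}

/// Builds a small example value (illustrative data only).
fn demo_state() -> DemoState {
    DemoState {
        name: "demo".to_string(),
        scalar_functions: vec!["abs".to_string(), "ceil".to_string(), "floor".to_string()],
    }
}
```

Calling `println!("{:#?}", demo_state())` then prints the struct across several lines with a single summary line for the function list.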

[PR] Add more functions for string sqllogictests [datafusion]

2024-09-28 Thread via GitHub
goldmedal opened a new pull request, #12665: URL: https://github.com/apache/datafusion/pull/12665 ## Which issue does this PR close? Part of #12415 ## Rationale for this change Finished the todo list at https://github.com/apache/datafusion/issues/12415#issuecomment-

Re: [I] Support DictionaryString for Regex matching operators [datafusion]

2024-09-28 Thread via GitHub
goldmedal commented on issue #12618: URL: https://github.com/apache/datafusion/issues/12618#issuecomment-2380708272 Related TODO item: - https://github.com/apache/datafusion/blob/c21d025df463ce623f9193c4b24d86141fce81ca/datafusion/sqllogictest/test_files/string/string_query.slt.part#L645

Re: [I] Simple Functions [datafusion]

2024-09-28 Thread via GitHub
alamb commented on issue #12635: URL: https://github.com/apache/datafusion/issues/12635#issuecomment-2380708925 > . It was an attempt to summarize why we need both: simpler types (https://github.com/apache/datafusion/issues/11513), more types (https://github.com/apache/datafusion/issues/126

Re: [I] [Proposal] Decouple logical from physical types [datafusion]

2024-09-28 Thread via GitHub
alamb commented on issue #11513: URL: https://github.com/apache/datafusion/issues/11513#issuecomment-2380710741 > I’d love to hear if this design is sound, or if there are any potential pitfalls in how I’ve approached type mapping. One thing I found a bit confusing was `ExtensionType:

Re: [I] Convert `BuiltInScalarFunction::{Rank, PercentRank, DenseRank}` to a user defined functions [datafusion]

2024-09-28 Thread via GitHub
jatin510 commented on issue #12648: URL: https://github.com/apache/datafusion/issues/12648#issuecomment-2380834677 take Thanks @hailelagi

Re: [PR] Add unhandled hook to PruningPredicate [datafusion]

2024-09-28 Thread via GitHub
alamb commented on PR #12606: URL: https://github.com/apache/datafusion/pull/12606#issuecomment-2380723388 > If I do the rewrite before PruningPredicate then I end up with just true. I don't see a rewrite in your example. I would have expected that you wrote something that substituted

[PR] docs: Update DataFusion introduction to clarify that DataFusion does provide an "out of the box" query engine [datafusion]

2024-09-28 Thread via GitHub
andygrove opened a new pull request, #12666: URL: https://github.com/apache/datafusion/pull/12666 ## Which issue does this PR close? N/A ## Rationale for this change I have recently seen confusion online about whether DataFusion is an end user tool or

Re: [PR] docs: Update DataFusion introduction to clarify that DataFusion does provide an "out of the box" query engine [datafusion]

2024-09-28 Thread via GitHub
andygrove commented on code in PR #12666: URL: https://github.com/apache/datafusion/pull/12666#discussion_r1779711265 ## README.md: ## @@ -42,14 +42,25 @@ DataFusion is an extensible query engine written in [Rust] that -uses [Apache Arrow] as its in-memory format. DataFusio

Re: [I] Convert `BuiltInScalarFunction::{Rank, PercentRank, DenseRank}` to a user defined functions [datafusion]

2024-09-28 Thread via GitHub
jatin510 commented on issue #12648: URL: https://github.com/apache/datafusion/issues/12648#issuecomment-2380837243 take

Re: [PR] Adds `WindowUDFImpl::reverse_expr` trait method + Support for `IGNORE NULLS` [datafusion]

2024-09-28 Thread via GitHub
jcsherin commented on PR #12662: URL: https://github.com/apache/datafusion/pull/12662#issuecomment-2380851014 > One thing I noticed is that there are no tests for this functionality (as in if we removed / broke the code no tests would fail). Is the idea that it will be test coverage added w

Re: [PR] chore: Fix jdk documentation for spark [datafusion-comet]

2024-09-28 Thread via GitHub
viirya commented on code in PR #979: URL: https://github.com/apache/datafusion-comet/pull/979#discussion_r1779728509 ## docs/source/user-guide/installation.md: ## @@ -28,9 +28,12 @@ Make sure the following requirements are met and software installed on your mach ## Requireme

[PR] build(deps): bump async-trait from 0.1.82 to 0.1.83 [datafusion-python]

2024-09-28 Thread via GitHub
dependabot[bot] opened a new pull request, #887: URL: https://github.com/apache/datafusion-python/pull/887 Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.82 to 0.1.83. Release notes sourced from https://github.com/dtolnay/async-trait/releases: async-trait's

[PR] build(deps): bump syn from 2.0.77 to 2.0.79 [datafusion-python]

2024-09-28 Thread via GitHub
dependabot[bot] opened a new pull request, #886: URL: https://github.com/apache/datafusion-python/pull/886 Bumps [syn](https://github.com/dtolnay/syn) from 2.0.77 to 2.0.79. Release notes sourced from [syn's releases](https://github.com/dtolnay/syn/releases). 2.0.79 Fi

Re: [I] Don't error on unknown column when pruning if predicate can still be proven false [datafusion]

2024-09-28 Thread via GitHub
adriangb commented on issue #7869: URL: https://github.com/apache/datafusion/issues/7869#issuecomment-2380875028 I think I've gotten this working. I pass in a schema that only has the columns I have stats for. Other columns seem to be ignored. I also wrap the predicate with `() IS NOT FALSE
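The `IS NOT FALSE` wrapping mentioned above relies on SQL's three-valued logic: a predicate that evaluates to NULL (unknown) is treated the same as TRUE, so only a definite FALSE allows pruning. The following is a minimal sketch of that semantics only; `Tri`, `is_not_false`, and `can_prune` are hypothetical names, and this is not the `PruningPredicate` implementation.

```rust
/// SQL-style three-valued boolean: `None` models NULL / unknown.
type Tri = Option<bool>;

/// Mirrors SQL `p IS NOT FALSE`: true for TRUE and for NULL, false only for FALSE.
fn is_not_false(p: Tri) -> bool {
    p != Some(false)
}

/// A container can be skipped only when the predicate is provably FALSE.
fn can_prune(p: Tri) -> bool {
    !is_not_false(p)
}
```

Under this reading, a column without statistics yields an unknown result, which keeps the container rather than raising an error.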

Re: [I] Convert `BuiltInScalarFunction::{Rank, PercentRank, DenseRank}` to a user defined functions [datafusion]

2024-09-28 Thread via GitHub
Omega359 commented on issue #12648: URL: https://github.com/apache/datafusion/issues/12648#issuecomment-2380892245 Convert BuiltInScalarFunction:: -> Convert BuiltInWindowFunction::{NthValue}

Re: [I] Convert `BuiltInScalarFunction::{NthValue}` to a user defined functions [datafusion]

2024-09-28 Thread via GitHub
Omega359 commented on issue #12649: URL: https://github.com/apache/datafusion/issues/12649#issuecomment-2380892146 Convert BuiltInScalarFunction::{NthValue} -> Convert BuiltInWindowFunction::{NthValue}

[PR] generate udf docs from embedded code documentation [datafusion]

2024-09-28 Thread via GitHub
Omega359 opened a new pull request, #12668: URL: https://github.com/apache/datafusion/pull/12668 ## Which issue does this PR close? Closes #12432 ## Rationale for this change Improving documentation for UDF's to allow the documentation for a UDF to be used in multipl

[PR] [WIP] Aggregation fuzzer [datafusion]

2024-09-28 Thread via GitHub
Rachelint opened a new pull request, #12667: URL: https://github.com/apache/datafusion/pull/12667 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## Are these changes tested?

Re: [PR] Add unhandled hook to PruningPredicate [datafusion]

2024-09-28 Thread via GitHub
adriangb commented on PR #12606: URL: https://github.com/apache/datafusion/pull/12606#issuecomment-2380876520 The point is the rewrite I want to do is `col = 'a'` or `col in ('a', 'b')` into `col_distinct = ANY(['a'])` or `col_distinct = ANY(['a', 'b'])` respectively (or maybe `exists (sele

Re: [PR] Minor: add partial assertion for skip aggregation probe [datafusion]

2024-09-28 Thread via GitHub
Rachelint commented on code in PR #12640: URL: https://github.com/apache/datafusion/pull/12640#discussion_r1779751387 ## datafusion/physical-plan/src/aggregates/row_hash.rs: ## @@ -1004,9 +1004,13 @@ impl GroupedHashAggregateStream { /// Updates skip aggregation probe state

Re: [PR] docs: Update DataFusion introduction to clarify that DataFusion does provide an "out of the box" query engine [datafusion]

2024-09-28 Thread via GitHub
timsaucer commented on PR #12666: URL: https://github.com/apache/datafusion/pull/12666#issuecomment-2380878461 In addition to the readme, should we update the site documentation?

Re: [PR] Possible reproducer of schema metadata bug. [datafusion]

2024-09-28 Thread via GitHub
wiedld commented on code in PR #12658: URL: https://github.com/apache/datafusion/pull/12658#discussion_r1779746980 ## datafusion/common/src/dfschema.rs: ## @@ -861,7 +861,10 @@ impl TryFrom for DFSchema { impl From for SchemaRef { fn from(df_schema: DFSchema) -> Self { -

Re: [PR] Add additional regexp function regexp_count() [datafusion]

2024-09-28 Thread via GitHub
Omega359 commented on code in PR #12080: URL: https://github.com/apache/datafusion/pull/12080#discussion_r1779806550 ## datafusion/functions/src/regex/regexpcount.rs: ## @@ -0,0 +1,766 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor lice

Re: [PR] Add additional regexp function regexp_count() [datafusion]

2024-09-28 Thread via GitHub
Omega359 commented on code in PR #12080: URL: https://github.com/apache/datafusion/pull/12080#discussion_r1779806870 ## datafusion/functions/src/regex/regexpcount.rs: ## @@ -0,0 +1,766 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor lice

Re: [I] Misleading query results due to likely internal parsing differences [datafusion]

2024-09-28 Thread via GitHub
Eason0729 commented on issue #12655: URL: https://github.com/apache/datafusion/issues/12655#issuecomment-2380643801 @doupache I am not planning to work on this, feel free to assign yourself.

Re: [PR] Update introduction.md [datafusion]

2024-09-28 Thread via GitHub
liyuance commented on code in PR #12577: URL: https://github.com/apache/datafusion/pull/12577#discussion_r1779499966 ## docs/source/user-guide/introduction.md: ## @@ -96,6 +96,7 @@ Here are some active projects using DataFusion: - [Arroyo](https://github.com/ArroyoSystems/arr

Re: [I] [Proposal] Decouple logical from physical types [datafusion]

2024-09-28 Thread via GitHub
findepi commented on issue #11513: URL: https://github.com/apache/datafusion/issues/11513#issuecomment-2380660792 FYI, I touched upon the topic of types at the DataFusion meetup in Belgrade yesterday. The slides are here if anyone is interested: https://docs.google.com/presentation/d/1VW_JCGb

Re: [I] Extension Types [datafusion]

2024-09-28 Thread via GitHub
findepi commented on issue #12644: URL: https://github.com/apache/datafusion/issues/12644#issuecomment-2380660813 FYI, I touched upon the topic of types at the DataFusion meetup in Belgrade yesterday. The slides are here if anyone is interested: https://docs.google.com/presentation/d/1VW_JCGb

Re: [I] Simple Functions [datafusion]

2024-09-28 Thread via GitHub
findepi commented on issue #12635: URL: https://github.com/apache/datafusion/issues/12635#issuecomment-2380660829 FYI, I touched upon the topic of types at the DataFusion meetup in Belgrade yesterday. The slides are here if anyone is interested: https://docs.google.com/presentation/d/1VW_JCGb

Re: [I] Misleading query results due to likely internal parsing differences [datafusion]

2024-09-28 Thread via GitHub
doupache commented on issue #12655: URL: https://github.com/apache/datafusion/issues/12655#issuecomment-2380660512 take

Re: [PR] Fill in missing `Debug` fields for `SessionState` [datafusion]

2024-09-28 Thread via GitHub
alamb commented on code in PR #12663: URL: https://github.com/apache/datafusion/pull/12663#discussion_r1779658893 ## datafusion/core/src/execution/session_state.rs: ## @@ -174,27 +174,30 @@ pub struct SessionState { } impl Debug for SessionState { +/// Prefer having shor

Re: [PR] Adds `WindowUDFImpl::reverse_expr` trait method + Support for `IGNORE NULLS` [datafusion]

2024-09-28 Thread via GitHub
alamb commented on code in PR #12662: URL: https://github.com/apache/datafusion/pull/12662#discussion_r1779659320 ## datafusion/expr/src/udwf.rs: ## @@ -351,6 +359,24 @@ pub trait WindowUDFImpl: Debug + Send + Sync { fn coerce_types(&self, _arg_types: &[DataType]) -> Result

Re: [PR] Minor: Improve documentation on execution error handling [datafusion]

2024-09-28 Thread via GitHub
alamb commented on PR #12651: URL: https://github.com/apache/datafusion/pull/12651#issuecomment-2380715091 Thanks @comphead

Re: [PR] [Minor] Improve error message when bitwise_* operator takes wrong unsupported type [datafusion]

2024-09-28 Thread via GitHub
alamb merged PR #12646: URL: https://github.com/apache/datafusion/pull/12646

Re: [PR] [Minor] Improve error message when bitwise_* operator takes wrong unsupported type [datafusion]

2024-09-28 Thread via GitHub
alamb commented on code in PR #12646: URL: https://github.com/apache/datafusion/pull/12646#discussion_r1779658251 ## datafusion/physical-expr/src/expressions/binary/kernels.rs: ## @@ -24,7 +24,7 @@ use arrow::compute::kernels::bitwise::{ bitwise_xor, bitwise_xor_scalar, };

Re: [PR] [Minor] Improve error message when bitwise_* operator takes wrong unsupported type [datafusion]

2024-09-28 Thread via GitHub
alamb commented on PR #12646: URL: https://github.com/apache/datafusion/pull/12646#issuecomment-2380713825 A plan error is certainly better than an internal error. We can consider switching to a not-implemented error as a follow-on PR. Thanks @dharanad

Re: [PR] Refactor PrimitiveGroupValueBuilder to use BooleanBuilder [datafusion]

2024-09-28 Thread via GitHub
alamb commented on code in PR #12623: URL: https://github.com/apache/datafusion/pull/12623#discussion_r1779666579 ## datafusion/physical-plan/src/aggregates/group_values/group_value_row.rs: ## @@ -121,37 +145,64 @@ impl ArrayRowEq for PrimitiveGroupValueBuilder { }

Re: [PR] Refactor PrimitiveGroupValueBuilder to use BooleanBuilder [datafusion]

2024-09-28 Thread via GitHub
alamb commented on code in PR #12623: URL: https://github.com/apache/datafusion/pull/12623#discussion_r1779665405 ## datafusion/physical-plan/src/aggregates/group_values/group_column.rs: ## @@ -62,57 +64,60 @@ pub trait GroupColumn: Send + Sync { pub struct PrimitiveGroupValu

Re: [PR] Update introduction.md for `blaze` project [datafusion]

2024-09-28 Thread via GitHub
alamb merged PR #12577: URL: https://github.com/apache/datafusion/pull/12577

Re: [PR] Update introduction.md for `blaze` project [datafusion]

2024-09-28 Thread via GitHub
alamb commented on PR #12577: URL: https://github.com/apache/datafusion/pull/12577#issuecomment-2380721735 Thanks again!

Re: [PR] Support REPLACE INTO for INSERT statements [datafusion]

2024-09-28 Thread via GitHub
alamb commented on PR #12516: URL: https://github.com/apache/datafusion/pull/12516#issuecomment-2380721810 🚀

Re: [I] Support REPLACE INTO for INSERT statements [datafusion]

2024-09-28 Thread via GitHub
alamb closed issue #12515: Support REPLACE INTO for INSERT statements URL: https://github.com/apache/datafusion/issues/12515

Re: [PR] Support REPLACE INTO for INSERT statements [datafusion]

2024-09-28 Thread via GitHub
alamb merged PR #12516: URL: https://github.com/apache/datafusion/pull/12516

Re: [I] DataFrame parse_sql_expr does not handle aliases [datafusion]

2024-09-28 Thread via GitHub
alamb commented on issue #12518: URL: https://github.com/apache/datafusion/issues/12518#issuecomment-2380722776 Thank you @Eason0729 -- we'll try and release a sqlparser release shortly

Re: [I] Any plan to support JSON or JSONB? [datafusion]

2024-09-28 Thread via GitHub
alamb commented on issue #7845: URL: https://github.com/apache/datafusion/issues/7845#issuecomment-2380722664 > When the row filtering was initially implemented we discussed keeping the decoded data cached but ended up deciding against it because it can potentially consume a lot of memory

Re: [I] Add min_by and max_by aggregate functions [datafusion]

2024-09-28 Thread via GitHub
alamb commented on issue #12075: URL: https://github.com/apache/datafusion/issues/12075#issuecomment-2380722241 Thanks everyone

Re: [PR] Add additional regexp function regexp_count() [datafusion]

2024-09-28 Thread via GitHub
Omega359 commented on code in PR #12080: URL: https://github.com/apache/datafusion/pull/12080#discussion_r1779807201 ## docs/source/user-guide/sql/scalar_functions.md: ## @@ -1302,13 +1302,38 @@ Apache DataFusion uses a [PCRE-like] regular expression [syntax] (minus support fo

Re: [PR] Add additional regexp function regexp_count() [datafusion]

2024-09-28 Thread via GitHub
Omega359 commented on PR #12080: URL: https://github.com/apache/datafusion/pull/12080#issuecomment-2381006980 Thanks for the very nice PR @xinlifoobar ! I'll try and take the time to do a full review of this PR next week if no one beats me to it.

Re: [PR] chore: remove XFAIL from passing tests [datafusion-python]

2024-09-28 Thread via GitHub
andygrove merged PR #884: URL: https://github.com/apache/datafusion-python/pull/884

[PR] Minor: Change LiteralGuarantee try_new to new [datafusion]

2024-09-28 Thread via GitHub
pgwhalen opened a new pull request, #12669: URL: https://github.com/apache/datafusion/pull/12669 ## Which issue does this PR close? Minor change so no associated issue - hopefully that's okay, I see comparable "minor" PRs get merged. ## Rationale for this change With htt

Re: [PR] Minor: Change LiteralGuarantee try_new to new [datafusion]

2024-09-28 Thread via GitHub
pgwhalen commented on PR #12669: URL: https://github.com/apache/datafusion/pull/12669#issuecomment-2381018881 Looks like @alamb might be best to review - thanks!

Re: [PR] [MINOR]: Use take_arrays in repartition [datafusion]

2024-09-28 Thread via GitHub
jayzhan211 merged PR #12657: URL: https://github.com/apache/datafusion/pull/12657

Re: [PR] `take_arrays` in repartition is not renamed [datafusion]

2024-09-28 Thread via GitHub
jayzhan211 closed pull request #12659: `take_arrays` in repartition is not renamed URL: https://github.com/apache/datafusion/pull/12659

Re: [PR] Possible reproducer of schema metadata bug. [datafusion]

2024-09-28 Thread via GitHub
alamb commented on code in PR #12658: URL: https://github.com/apache/datafusion/pull/12658#discussion_r1779436348 ## datafusion/common/src/dfschema.rs: ## @@ -861,7 +861,10 @@ impl TryFrom for DFSchema { impl From for SchemaRef { fn from(df_schema: DFSchema) -> Self { -

Re: [I] Bug with csv type inference [datafusion]

2024-09-28 Thread via GitHub
alamb commented on issue #3174: URL: https://github.com/apache/datafusion/issues/3174#issuecomment-2380570078 I wonder if you could use a length check on the characters too (it wouldn't be perfect, but it might catch more) As in you could bound the number of characters matched by the

[PR] Minor: Add github link to code that was upstreamed [datafusion]

2024-09-28 Thread via GitHub
alamb opened a new pull request, #12660: URL: https://github.com/apache/datafusion/pull/12660 ## Which issue does this PR close? Follow on to https://github.com/apache/datafusion/pull/12654 ## Rationale for this change @mustafasrepo ported `take_arrays` upstream to a

[PR] Fix jdk documentation for spark [datafusion-comet]

2024-09-28 Thread via GitHub
adi-kmt opened a new pull request, #979: URL: https://github.com/apache/datafusion-comet/pull/979 ## Which issue does this PR close? Closes #742 ## Rationale for this change ## What changes are included in this PR? Changed the `installation.md

Re: [I] Bug with csv type inference [datafusion]

2024-09-28 Thread via GitHub
CookiePieWw commented on issue #3174: URL: https://github.com/apache/datafusion/issues/3174#issuecomment-2380588637 > I wonder if you could use a length check on the characters too (it wouldn't be perfect, but it might catch more) Thanks for your suggestion! But if we want to precisel

[PR] Implements `WindowUDFImpl::reverse_expr` + Support for `IGNORE NULLS` [datafusion]

2024-09-28 Thread via GitHub
jcsherin opened a new pull request, #12662: URL: https://github.com/apache/datafusion/pull/12662 ## Which issue does this PR close? Closes #12661. ## Rationale for this change 1. To allow a user-defined window function to customize its implementation for

Re: [PR] [Minor] Improve error message when bitwise_* operator takes wrong unsupported type [datafusion]

2024-09-28 Thread via GitHub
dharanad commented on code in PR #12646: URL: https://github.com/apache/datafusion/pull/12646#discussion_r1779463677 ## datafusion/physical-expr/src/expressions/binary/kernels.rs: ## @@ -24,7 +24,7 @@ use arrow::compute::kernels::bitwise::{ bitwise_xor, bitwise_xor_scalar,

Re: [PR] [MINOR]: Use take_arrays in repartition [datafusion]

2024-09-28 Thread via GitHub
jayzhan211 commented on PR #12657: URL: https://github.com/apache/datafusion/pull/12657#issuecomment-2380549100 Thanks @doupache

Re: [PR] Possible reproducer of schema metadata bug. [datafusion]

2024-09-28 Thread via GitHub
alamb commented on PR #12658: URL: https://github.com/apache/datafusion/pull/12658#issuecomment-2380551118 > Note: CI is failing due to [this change](https://github.com/apache/datafusion/commit/6553fafc5c320e13fb177977b5a78a91c2ac2fc1) merged today. @akurmustafa I believe this was fi

Re: [I] Move `kurtosis_pop` to `datafusion-functions-extra` and out of core [datafusion]

2024-09-28 Thread via GitHub
alamb commented on issue #12625: URL: https://github.com/apache/datafusion/issues/12625#issuecomment-2380568837 Thanks @dharanad and @Weijun-H ❤️

Re: [I] Implement physical optimizer rule for common subexpression elimination [datafusion]

2024-09-28 Thread via GitHub
alamb commented on issue #12599: URL: https://github.com/apache/datafusion/issues/12599#issuecomment-2380591388 @petertoth and several others have invested significant time making the CSE pass in the logical optimizer fast and efficient The logical rewrite code is here https://github

Re: [I] The rule `common_sub_expression_eliminate` removes non-duplicate expressions [datafusion]

2024-09-28 Thread via GitHub
alamb commented on issue #4887: URL: https://github.com/apache/datafusion/issues/4887#issuecomment-2380591703 I believe we have fixed this as part of the extensive common subexpr refactoring. Can someone check if this is still an issue?

Re: [I] Fix the test failures related to the logical optimizer. [datafusion]

2024-09-28 Thread via GitHub
alamb commented on issue #4685: URL: https://github.com/apache/datafusion/issues/4685#issuecomment-2380591873 We have enabled checking by default in https://github.com/apache/datafusion/pull/6265 so I think we can claim this ticket is complete

Re: [I] Fix the test failures related to the logical optimizer. [datafusion]

2024-09-28 Thread via GitHub
alamb closed issue #4685: Fix the test failures related to the logical optimizer. URL: https://github.com/apache/datafusion/issues/4685

Re: [I] The rule `common_sub_expression_eliminate` removes non-duplicate expressions [datafusion]

2024-09-28 Thread via GitHub
alamb closed issue #4887: The rule `common_sub_expression_eliminate` removes non-duplicate expressions URL: https://github.com/apache/datafusion/issues/4887

Re: [PR] Add binary_view to string_view coercion [datafusion]

2024-09-28 Thread via GitHub
alamb merged PR #12643: URL: https://github.com/apache/datafusion/pull/12643

Re: [I] Support Binary --> String coercion for StringView/BinaryView in `LIKE` [datafusion]

2024-09-28 Thread via GitHub
alamb closed issue #12500: Support Binary --> String coercion for StringView/BinaryView in `LIKE` URL: https://github.com/apache/datafusion/issues/12500

Re: [I] Implement physical optimizer rule for common subexpression elimination [datafusion]

2024-09-28 Thread via GitHub
peter-toth commented on issue #12599: URL: https://github.com/apache/datafusion/issues/12599#issuecomment-2380597798 Thanks @alamb for pinging me! I was a bit busy lately, but I'm happy to look into this issue next week or so.

Re: [PR] Support REPLACE INTO for INSERT statements [datafusion]

2024-09-28 Thread via GitHub
alamb commented on PR #12516: URL: https://github.com/apache/datafusion/pull/12516#issuecomment-2380597852 Merged up to get CI fix / hopefully pass

Re: [PR] Add binary_view to string_view coercion [datafusion]

2024-09-28 Thread via GitHub
alamb commented on PR #12643: URL: https://github.com/apache/datafusion/pull/12643#issuecomment-2380597645 Thanks again @doupache

Re: [I] Row groups are read out of order or with completely different values [datafusion]

2024-09-28 Thread via GitHub
twitu commented on issue #10572: URL: https://github.com/apache/datafusion/issues/10572#issuecomment-2380598250 # Sorted row groups not read in-order when FILTER clause is used I'm seeing another issue with datafusion streaming sorted records from disk with a filter query in the SQL.

[I] Add `WindowUDFImpl::reverse_expr` + support for `IGNORE NULLS` [datafusion]

2024-09-28 Thread via GitHub
jcsherin opened a new issue, #12661: URL: https://github.com/apache/datafusion/issues/12661 ### Is your feature request related to a problem or challenge? 1. Reverse a user-defined window function which computes the same result but in the reverse order. 2. Add support for `IGNORE N

Re: [I] Misleading query results due to likely internal parsing differences [datafusion]

2024-09-28 Thread via GitHub
doupache commented on issue #12655: URL: https://github.com/apache/datafusion/issues/12655#issuecomment-2380614455 @Eason0729 Are you working on this issue? If you are not planning on working on this, I would like to work on it 😊

Re: [PR] `take_arrays` in repartition is not renamed [datafusion]

2024-09-28 Thread via GitHub
doupache commented on PR #12659: URL: https://github.com/apache/datafusion/pull/12659#issuecomment-2380498998 Hi @jayzhan211, I also tried to fix the same issue in https://github.com/apache/datafusion/pull/12657

Re: [PR] Refactor PrimitiveGroupValueBuilder to use BooleanBuilder [datafusion]

2024-09-28 Thread via GitHub
jayzhan211 commented on code in PR #12623: URL: https://github.com/apache/datafusion/pull/12623#discussion_r1779845711 ## datafusion/physical-plan/src/aggregates/group_values/group_column.rs: ## @@ -62,57 +64,60 @@ pub trait GroupColumn: Send + Sync { pub struct PrimitiveGrou

Re: [I] [Proposal] Decouple logical from physical types [datafusion]

2024-09-28 Thread via GitHub
jayzhan211 commented on issue #11513: URL: https://github.com/apache/datafusion/issues/11513#issuecomment-2381051491 > Just because I know JSON columns are represented as Strings internally, I think they should never be treated as Strings unless the user explicitly requests such a conversion

Re: [PR] Simplify associative expressions with references [datafusion]

2024-09-28 Thread via GitHub
github-actions[bot] commented on PR #11733: URL: https://github.com/apache/datafusion/pull/11733#issuecomment-2381063766 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or

Re: [PR] Change `FileSinkConfig.object_store_url` from `ObjectStoreUrl` to `Url` [datafusion]

2024-09-28 Thread via GitHub
github-actions[bot] commented on PR #11705: URL: https://github.com/apache/datafusion/pull/11705#issuecomment-2381063783 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or