Jefffrey commented on code in PR #19628:
URL: https://github.com/apache/datafusion/pull/19628#discussion_r2659502642
##
datafusion/spark/src/function/math/decimal_div.rs:
##
@@ -0,0 +1,434 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor
viirya opened a new pull request, #19635:
URL: https://github.com/apache/datafusion/pull/19635
## Which issue does this PR close?
- Closes #10583.
## Rationale for this change
## What changes are included in this PR?
## Are these changes tes
viirya commented on issue #10583:
URL: https://github.com/apache/datafusion/issues/10583#issuecomment-3707882918
@comphead I opened #19635 to fix this.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to g
finchxxia opened a new pull request, #2148:
URL: https://github.com/apache/datafusion-sqlparser-rs/pull/2148
1. I found that datafusion-sqlparser-rs does not currently support
[multi-table insert](https://docs.snowflake.com/en/sql-reference/sql/insert-multi-table).
```
-- Unconditional
wForget commented on PR #2916:
URL:
https://github.com/apache/datafusion-comet/pull/2916#issuecomment-3707850222
@manuzhang The `cast_decimal_to_int32_up` function also has a similar issue.
Could you fix it as well?
Reproduce test case:
```
castTest(
generateD
Nachiket-Roy commented on PR #19633:
URL: https://github.com/apache/datafusion/pull/19633#issuecomment-3707853176
@ethan-tyler, please review this PR. Thank you!
Dandandan commented on PR #19625:
URL: https://github.com/apache/datafusion/pull/19625#issuecomment-3707934127
run benchmarks
alamb-ghbot commented on PR #19625:
URL: https://github.com/apache/datafusion/pull/19625#issuecomment-3707934201
🤖 `./gh_compare_branch.sh`
[gh_compare_branch.sh](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/gh_compare_branch.sh)
Running
Linux aal-dev 6.14.0-1018-gc
UBarney commented on code in PR #19602:
URL: https://github.com/apache/datafusion/pull/19602#discussion_r2659556702
##
datafusion/physical-plan/src/joins/hash_join/partitioned_hash_eval.rs:
##
@@ -327,12 +329,24 @@ impl PhysicalExpr for HashTableLookupExpr {
Ok(false)
UBarney commented on code in PR #19602:
URL: https://github.com/apache/datafusion/pull/19602#discussion_r2659557626
##
datafusion/physical-plan/src/joins/hash_join/partitioned_hash_eval.rs:
##
@@ -327,12 +329,24 @@ impl PhysicalExpr for HashTableLookupExpr {
Ok(false)
GaneshPatil7517 commented on PR #19619:
URL: https://github.com/apache/datafusion/pull/19619#issuecomment-3708094406
@nuno-faria, please review this.
milenkovicm opened a new pull request, #1363:
URL: https://github.com/apache/datafusion-ballista/pull/1363
# Which issue does this PR close?
As part of the DataFusion upgrade, we missed upgrading the Ballista versions as well.
Closes #.
# Rationale for this change
# What ch
xudong963 commented on PR #18868:
URL: https://github.com/apache/datafusion/pull/18868#issuecomment-3708738865
Hey @alamb @adriangb, do you have time to review the PR? It would be sweet
to have it in 52.0.0
Jefffrey merged PR #19515:
URL: https://github.com/apache/datafusion/pull/19515
Jefffrey commented on PR #19515:
URL: https://github.com/apache/datafusion/pull/19515#issuecomment-3708575538
Thanks @AlyAbdelmoneim, @xudong963 & @alamb
Jefffrey closed issue #9706: `array_union` and `array_intersect` cannot handle
NULL columnar data
URL: https://github.com/apache/datafusion/issues/9706
Jefffrey merged PR #19415:
URL: https://github.com/apache/datafusion/pull/19415
Jefffrey commented on PR #19415:
URL: https://github.com/apache/datafusion/pull/19415#issuecomment-3708582155
Thanks for debugging and fixing this @feniljain
mattcuento commented on PR #1360:
URL:
https://github.com/apache/datafusion-ballista/pull/1360#issuecomment-3708756589
> there is one issue with docker build and looks like issue with disk space
failing other (not sure how to fix)
Thanks, looks like `substrait` doesn't run a high enough
coderfender commented on PR #2964:
URL:
https://github.com/apache/datafusion-comet/pull/2964#issuecomment-3708612474
```
String expressions
Jefffrey closed issue #7110: Split built in functions into "packages"
URL: https://github.com/apache/datafusion/issues/7110
Jefffrey commented on issue #7110:
URL: https://github.com/apache/datafusion/issues/7110#issuecomment-3708645005
Functions are now split into separate crates:
- https://github.com/apache/datafusion/tree/main/datafusion/functions
- https://github.com/apache/datafusion/tree/main/dataf
GaneshPatil7517 commented on PR #19619:
URL: https://github.com/apache/datafusion/pull/19619#issuecomment-3708692079
There are 2 failing and 30 successful checks; let me fix this.
github-actions[bot] commented on PR #17815:
URL: https://github.com/apache/datafusion/pull/17815#issuecomment-3708711176
Thank you for your contribution. Unfortunately, this pull request is stale
because it has been open 60 days with no activity. Please remove the stale
label or comment or
alamb-ghbot commented on PR #19625:
URL: https://github.com/apache/datafusion/pull/19625#issuecomment-3707953409
🤖: Benchmark completed
Details
```
Comparing HEAD and speedup_accumulate2
Benchmark clickbench_extended.json
Dandandan opened a new issue, #19636:
URL: https://github.com/apache/datafusion/issues/19636
### Is your feature request related to a problem or challenge?
Currently, NullState allocates a boolean buffer for (group) accumulators
that potentially has null values.
### Describ
alamb-ghbot commented on PR #19625:
URL: https://github.com/apache/datafusion/pull/19625#issuecomment-3707957595
🤖 `./gh_compare_branch.sh`
[gh_compare_branch.sh](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/gh_compare_branch.sh)
Running
Linux aal-dev 6.14.0-1018-gc
Dandandan commented on PR #19625:
URL: https://github.com/apache/datafusion/pull/19625#issuecomment-3707957474
run benchmark tpch
Dandandan commented on PR #19625:
URL: https://github.com/apache/datafusion/pull/19625#issuecomment-3707955700
Query 1 is consistently 15%-20% faster with this change.
Dandandan opened a new issue, #19637:
URL: https://github.com/apache/datafusion/issues/19637
### Is your feature request related to a problem or challenge?
_No response_
### Describe the solution you'd like
_No response_
### Describe alternatives you've considered
Dandandan commented on issue #19637:
URL: https://github.com/apache/datafusion/issues/19637#issuecomment-3707964081
Here it contains an AI-assisted PoC:
https://github.com/apache/datafusion/pull/19624 (need to iron out a type bug)
alamb-ghbot commented on PR #19625:
URL: https://github.com/apache/datafusion/pull/19625#issuecomment-3707967583
🤖: Benchmark completed
Details
```
Comparing HEAD and speedup_accumulate2
Benchmark tpch_sf1.json
kumarUjjawal commented on issue #19527:
URL: https://github.com/apache/datafusion/issues/19527#issuecomment-3707844141
I was looking into this issue and attempted to make `Date + Interval`
return `Timestamp` instead of `Date`.
What I did:
1. **Type Coercion** (`expr-common/src
coderfender commented on issue #2986:
URL:
https://github.com/apache/datafusion-comet/issues/2986#issuecomment-3707862395
@raushanprabhakar1 , you can run a local benchmark using a command like
below :
``` SPARK_GENERATE_BENCHMARK_FILES=1 make
benchmark-org.apache.spark.sql.benchmark.C
Jefffrey commented on code in PR #18137:
URL: https://github.com/apache/datafusion/pull/18137#discussion_r265967
##
datafusion/functions/src/string/concat.rs:
##
@@ -501,4 +645,120 @@ mod tests {
}
Ok(())
}
+
+#[test]
+fn test_concat_with_integ
milenkovicm commented on code in PR #1360:
URL:
https://github.com/apache/datafusion-ballista/pull/1360#discussion_r2659679232
##
ballista/scheduler/src/scheduler_server/grpc.rs:
##
@@ -873,4 +873,77 @@ mod test {
assert!(active_executors.is_empty());
Ok(())
milenkovicm commented on PR #1363:
URL:
https://github.com/apache/datafusion-ballista/pull/1363#issuecomment-3708105715
Perhaps @martin-g or @danielhumanmod could help with review; we missed
incrementing the Ballista version as part of #1345.
Kimahriman opened a new pull request, #3035:
URL: https://github.com/apache/datafusion-comet/pull/3035
## Which issue does this PR close?
Related to #174, not full support so probably should keep that open (or open
new tickets specifically for column mapping and deletion vecto
Kimahriman commented on code in PR #3035:
URL: https://github.com/apache/datafusion-comet/pull/3035#discussion_r2659688631
##
native/core/Cargo.toml:
##
@@ -76,7 +76,7 @@ parking_lot = "0.12.5"
datafusion-comet-objectstore-hdfs = { path = "../hdfs", optional = true,
default-fe
Kimahriman commented on code in PR #3035:
URL: https://github.com/apache/datafusion-comet/pull/3035#discussion_r2659689056
##
spark/pom.xml:
##
@@ -112,6 +112,12 @@ under the License.
+
+ com.google.guava
+ failureaccess
+ 1.0.3
+
haohuaijin opened a new issue, #19638:
URL: https://github.com/apache/datafusion/issues/19638
### Is your feature request related to a problem or challenge?
Currently, `GroupedTopKAggregateStream` supports two types of query:
```sql
select id, max(time) from t group by id order by
haohuaijin commented on issue #19638:
URL: https://github.com/apache/datafusion/issues/19638#issuecomment-3708117313
take
GaneshPatil7517 commented on issue #19638:
URL: https://github.com/apache/datafusion/issues/19638#issuecomment-3708907086
take
GaneshPatil7517 commented on PR #19639:
URL: https://github.com/apache/datafusion/pull/19639#issuecomment-3708922989
Hey @adriangb, can I work on this?
adriangb commented on PR #19639:
URL: https://github.com/apache/datafusion/pull/19639#issuecomment-3708924515
This is probably not a good issue to pick up. This is a draft PR for an
unproven idea.
GaneshPatil7517 commented on PR #19619:
URL: https://github.com/apache/datafusion/pull/19619#issuecomment-3708932402
Hi @nuno-faria & @adriangb ,
All checks have passed successfully on this PR. I'm really excited to see it
get merged.
Kindly request you to review and approve when
adriangb opened a new issue, #19641:
URL: https://github.com/apache/datafusion/issues/19641
### Describe the bug
Because Hash and Eq take out separate locks, it's possible the underlying
expression changes in between calls. Thus you get the same hash but no match
for equality. I th
kumarUjjawal opened a new pull request, #19640:
URL: https://github.com/apache/datafusion/pull/19640
## Which issue does this PR close?
- Part of #19025.
## Rationale for this change
## What changes are included in this PR?
- Added Time64/Time32 sig
GaneshPatil7517 commented on issue #19638:
URL: https://github.com/apache/datafusion/issues/19638#issuecomment-3708936000
Hey @nuno-faria, can I work on this?
alamb-ghbot commented on PR #19639:
URL: https://github.com/apache/datafusion/pull/19639#issuecomment-3708960220
🤖 `./gh_compare_branch.sh`
[gh_compare_branch.sh](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/gh_compare_branch.sh)
Running
Linux aal-dev 6.14.0-1018-gc
adriangb commented on PR #19639:
URL: https://github.com/apache/datafusion/pull/19639#issuecomment-3708960071
run benchmark tpch
alamb-ghbot commented on PR #19639:
URL: https://github.com/apache/datafusion/pull/19639#issuecomment-3708977753
🤖: Benchmark completed
Details
```
Comparing HEAD and filter-pushdown-dynamic
Benchmark tpch_sf1.json
GaneshPatil7517 commented on PR #19619:
URL: https://github.com/apache/datafusion/pull/19619#issuecomment-3708978142
OK, I'll work on that.
GaneshPatil7517 commented on PR #19619:
URL: https://github.com/apache/datafusion/pull/19619#issuecomment-3708986247
Hey @adriangb, could you please review it? I have updated it.
coderfender commented on PR #3017:
URL:
https://github.com/apache/datafusion-comet/pull/3017#issuecomment-3709013471
```
| Type | Before (main) | After (feature) | Improvement |
|--|---|-|-|
| i8 | 26.5 µs | 19.8 µs |
wForget merged PR #2916:
URL: https://github.com/apache/datafusion-comet/pull/2916
wForget commented on PR #2916:
URL:
https://github.com/apache/datafusion-comet/pull/2916#issuecomment-3709003717
Thanks @manuzhang, merged to main
wForget closed issue #2914: `cast_decimal_to_int16_down` formats decimal value
incorrectly
URL: https://github.com/apache/datafusion-comet/issues/2914
devanshu0987 commented on PR #19630:
URL: https://github.com/apache/datafusion/pull/19630#issuecomment-3709017147
Hi @Jefffrey, is there anything more I have to do here? What is the process
to merge it into the main?
Jefffrey commented on PR #19630:
URL: https://github.com/apache/datafusion/pull/19630#issuecomment-3709027867
> Hi @Jefffrey, is there anything more I have to do here? What is the
process to merge it into the main?
We generally like to leave PRs up for a while after approval in case a
kazantsev-maksim commented on code in PR #19610:
URL: https://github.com/apache/datafusion/pull/19610#discussion_r2659596647
##
datafusion/spark/src/function/string/space.rs:
##
@@ -0,0 +1,245 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contribu
Brijesh-Thakkar commented on PR #19587:
URL: https://github.com/apache/datafusion/pull/19587#issuecomment-3708070847
@2010YOUY01 @Jefffrey Sorry for wasting your time and effort (I am new to
this repo and to open source); this won't be repeated.
I will raise a new PR.
Brijesh-Thakkar commented on PR #19587:
URL: https://github.com/apache/datafusion/pull/19587#issuecomment-3708070138
@2010YOUY01 I wanted to ask: can I close this PR and raise a new one? Until you
assign the issue to me, I will work on it and raise a new PR, because I think I have
messed up in this
Brijesh-Thakkar closed pull request #19587: Fix NULL handling in
ScalarValue::partial_cmp (closes #19579)
URL: https://github.com/apache/datafusion/pull/19587
Jefffrey commented on PR #19587:
URL: https://github.com/apache/datafusion/pull/19587#issuecomment-3708080953
If you do intend to continue work on this, it would be preferable to keep
this PR open (even if just in draft mode) so we don't lose discussion context.
It's a commendable eff
Brijesh-Thakkar commented on issue #2986:
URL:
https://github.com/apache/datafusion-comet/issues/2986#issuecomment-3708080954
@coderfender How do I run benchmarks if I am working on a PR in the datafusion repo?
Will this work there as well?
Brijesh-Thakkar commented on PR #19581:
URL: https://github.com/apache/datafusion/pull/19581#issuecomment-3708081881
@Jefffrey How can I run benchmarks locally??
Brijesh-Thakkar commented on PR #19587:
URL: https://github.com/apache/datafusion/pull/19587#issuecomment-3708083626
@Jefffrey OK, I have reopened this PR and will work on it.
milenkovicm merged PR #1362:
URL: https://github.com/apache/datafusion-ballista/pull/1362
adriangb commented on issue #3463:
URL: https://github.com/apache/datafusion/issues/3463#issuecomment-3708209559
Just throwing ideas at the wall just in case it helps.
I feel like the fundamental problem (and I may be wrong about this) is that
filter pushdown has a rather large I/O an
adriangb commented on issue #3463:
URL: https://github.com/apache/datafusion/issues/3463#issuecomment-3708272050
> The new parquet pushdown sort of does this IIUC, but at the physical
execution level - i.e. after the IO strategy is somewhat baked in
AFAIK the only thing along these li
tustvold commented on issue #3463:
URL: https://github.com/apache/datafusion/issues/3463#issuecomment-3708264117
> Has there been any attempts to keep track of filter selectivity and use
that to our advantage? For example we could track filter selectivity for each
filter and use that to:
milenkovicm commented on PR #1360:
URL:
https://github.com/apache/datafusion-ballista/pull/1360#issuecomment-3708276082
Also, could we gate substrait with config option, which could be on by
default?
Users not needing it could disable it at compile time.
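For illustration, one way this kind of gating is commonly done in Rust is a Cargo feature that is part of `default`. The sketch below is only that: the feature name `substrait` and the function names are assumptions for the example, not the actual Ballista layout.

```rust
// Hypothetical sketch. Assumes a Cargo feature named `substrait` declared as:
//
//   [features]
//   default = ["substrait"]   # on by default
//   substrait = []            # would also enable the optional dependency
//
// Users who do not need it would build with `--no-default-features`.

/// Reports whether the crate was compiled with the assumed `substrait` feature.
pub fn substrait_enabled() -> bool {
    cfg!(feature = "substrait")
}

/// Compiled only when the feature is on.
#[cfg(feature = "substrait")]
pub fn plan_format() -> &'static str {
    "substrait"
}

/// Fallback compiled when the feature is off.
#[cfg(not(feature = "substrait"))]
pub fn plan_format() -> &'static str {
    "native"
}
```

With this shape the Substrait code path is excluded from the build entirely when the feature is disabled, rather than being switched off at run time.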
adriangb commented on issue #3463:
URL: https://github.com/apache/datafusion/issues/3463#issuecomment-3708278842
I thought that changed how the row selection was represented / evaluated but
did not actually move the filters out of the filter pushdown phase into the
apply after scan w/ proje
tustvold commented on issue #3463:
URL: https://github.com/apache/datafusion/issues/3463#issuecomment-3708276942
> After each filter for each RecordBatch is evaluated we re-order them and
possibly toss the ones with poor selectivity back into the scan phase.
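A rough sketch of the idea in the quoted comment, under stated assumptions: `FilterStats`, `reorder`, and the threshold are hypothetical names for illustration, not arrow-rs or DataFusion APIs, and what happens to the demoted filters is left abstract.

```rust
/// Running row counts for one filter (hypothetical type, for illustration only).
#[derive(Debug, Clone, PartialEq)]
pub struct FilterStats {
    pub name: String,
    pub rows_in: u64,
    pub rows_out: u64,
}

impl FilterStats {
    pub fn new(name: &str) -> Self {
        Self { name: name.to_string(), rows_in: 0, rows_out: 0 }
    }

    /// Record the result of evaluating the filter on one batch.
    pub fn record(&mut self, rows_in: u64, rows_out: u64) {
        self.rows_in += rows_in;
        self.rows_out += rows_out;
    }

    /// Fraction of rows that pass; 1.0 (keeps everything) until data is seen.
    pub fn selectivity(&self) -> f64 {
        if self.rows_in == 0 {
            1.0
        } else {
            self.rows_out as f64 / self.rows_in as f64
        }
    }
}

/// Orders filters so the ones that discard the most rows run first, and
/// splits off those that pass more than `threshold` of their input.
/// Returns `(kept, demoted)`.
pub fn reorder(
    mut filters: Vec<FilterStats>,
    threshold: f64,
) -> (Vec<FilterStats>, Vec<FilterStats>) {
    filters.sort_by(|a, b| a.selectivity().total_cmp(&b.selectivity()));
    filters.into_iter().partition(|f| f.selectivity() <= threshold)
}
```

Calling `reorder` after each batch with updated counts would give the adaptive behaviour described; how often to re-evaluate is a separate design choice.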
I believe this is what htt
tustvold commented on issue #3463:
URL: https://github.com/apache/datafusion/issues/3463#issuecomment-3708285486
It depends what you mean by IO. If you mean fetching data from disk /
network, you are correct: the predicate pushdown being discussed here (late
materialization) does not influence
feniljain commented on code in PR #19622:
URL: https://github.com/apache/datafusion/pull/19622#discussion_r2659823598
##
datafusion/core/tests/physical_optimizer/filter_pushdown/mod.rs:
##
@@ -564,6 +566,7 @@ fn test_pushdown_through_aggregates_on_grouping_columns() {
// 2.
adriangb commented on issue #3463:
URL: https://github.com/apache/datafusion/issues/3463#issuecomment-3708296137
> DF enabling filter pushdown will not influence the IO pattern to disk, and
therefore this cannot be responsible for the regression in performance
Ah maybe this is where m
tustvold commented on issue #3463:
URL: https://github.com/apache/datafusion/issues/3463#issuecomment-3708304747
Oh right, yes it will do that sorry, been years since I wrote that code (and
it looks like there's some new PushDecoder anyway that might change all of
this). So yes it will beha
adriangb commented on issue #3463:
URL: https://github.com/apache/datafusion/issues/3463#issuecomment-3708307317
Makes sense there's probably more than one issue to tackle. I just imagine
that for the S3/GCS/etc. use case the extra I/O fetches would dominate, and
might even be the same for
HrithikSampson commented on issue #19637:
URL: https://github.com/apache/datafusion/issues/19637#issuecomment-3708307932
take
HrithikSampson commented on issue #19637:
URL: https://github.com/apache/datafusion/issues/19637#issuecomment-3708308350
take
renato2099 commented on PR #1320:
URL:
https://github.com/apache/datafusion-python/pull/1320#issuecomment-3708310578
Hi @kosiew ,
I have added some documentation notes. Let me know if this is sufficient,
otherwise I can add more explanations.
viirya commented on code in PR #19635:
URL: https://github.com/apache/datafusion/pull/19635#discussion_r2659844657
##
datafusion/sqllogictest/test_files/joins.slt:
##
@@ -3516,7 +3516,6 @@ AS VALUES
query IT
SELECT t1_id, t1_name FROM join_test_left WHERE t1_id NOT IN (SELECT
Omega359 commented on issue #18115:
URL: https://github.com/apache/datafusion/issues/18115#issuecomment-3708188116
> I currently run this on a k8s cluster at my home, but this could run in
the cloud if someone wanted to pay for that. I plan on restricting access to
committers only. I am con
Jefffrey commented on PR #19581:
URL: https://github.com/apache/datafusion/pull/19581#issuecomment-3708153870
> @Jefffrey How can I run benchmarks locally??
See some examples of microbenchmarks here:
https://github.com/apache/datafusion/pull/19551
They should be able to be run
Dandandan commented on code in PR #19602:
URL: https://github.com/apache/datafusion/pull/19602#discussion_r2659721105
##
datafusion/physical-plan/src/joins/hash_join/partitioned_hash_eval.rs:
##
@@ -327,12 +329,24 @@ impl PhysicalExpr for HashTableLookupExpr {
Ok(false)
milenkovicm commented on PR #1360:
URL:
https://github.com/apache/datafusion-ballista/pull/1360#issuecomment-3708132381
Maybe as a follow up we should put a bit more documentation around this and
example(s)
codecov-commenter commented on PR #3035:
URL:
https://github.com/apache/datafusion-comet/pull/3035#issuecomment-3708134680
##
[Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/3035?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca
manuzhang commented on PR #2916:
URL:
https://github.com/apache/datafusion-comet/pull/2916#issuecomment-3708157211
@wForget Thanks for the good suggestion. I was struggling with test case so
I left the changes to `cast_decimal_to_int32_up` out. I've added tests for
`cast DecimalType(38,18)
adriangb commented on PR #19602:
URL: https://github.com/apache/datafusion/pull/19602#issuecomment-3708179452
It might be interesting to re-run
https://datafusion.apache.org/blog/2025/09/10/dynamic-filters/#hash-join-dynamic-filters
and see if the numbers are even better now!
andygrove commented on code in PR #19627:
URL: https://github.com/apache/datafusion/pull/19627#discussion_r2659849359
##
datafusion/spark/src/function/hash/murmur3_hash.rs:
##
@@ -0,0 +1,474 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributo
renato2099 commented on PR #1320:
URL:
https://github.com/apache/datafusion-python/pull/1320#issuecomment-3708330200
I am thinking that we could have a follow up on this path to be more
ergonomic though + a more future-proof API (non-breaking path). Basically, we
could introduce an enum-li
tustvold commented on issue #3463:
URL: https://github.com/apache/datafusion/issues/3463#issuecomment-3708332419
Yeah, it's a good point that whilst caching reduces the additional decode
costs for pushing down predicates, it doesn't eliminate the IO costs. That
being said in general you onl
adriangb commented on issue #3463:
URL: https://github.com/apache/datafusion/issues/3463#issuecomment-3708389645
Okay, yes, I agree; maybe I was being pessimistic. In any case using what we
can from stats / metadata to set up the initial state / plan and then refining
it once we have runtime
adriangb commented on issue #3463:
URL: https://github.com/apache/datafusion/issues/3463#issuecomment-3708390932
> > arrow-rs at least exposed the selectivity of filters after each file is
read
>
> It is possible to provide an implementation of ArrowPredicate that tracks
this. IIRC t