andygrove commented on PR #3890:
URL:
https://github.com/apache/datafusion-comet/pull/3890#issuecomment-4180969473
@kazuyukitanimura @martin-g This PR was created using the audit skill that
you reviewed
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
comphead commented on issue #1630:
URL:
https://github.com/apache/datafusion-comet/issues/1630#issuecomment-4181025858
Comet sets the value from proto for FIRST/LAST
```
AggregateExprBuilder::new(Arc::new(func), vec![child])
.schema(schema)
mbutrovich commented on PR #3629:
URL:
https://github.com/apache/datafusion-comet/pull/3629#issuecomment-4181039136
> Most of tests fail on, checking it:
>
> ```
> Comet native panic: panicked at
/usr/local/cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-physical-ex
comphead closed issue #281: CI: Add spark expression coverage to build process
URL: https://github.com/apache/datafusion-comet/issues/281
comphead opened a new pull request, #3891:
URL: https://github.com/apache/datafusion-comet/pull/3891
## Which issue does this PR close?
Closes https://github.com/apache/datafusion-comet/issues/1630.
## Rationale for this change
## Summary
comphead commented on code in PR #3849:
URL: https://github.com/apache/datafusion-comet/pull/3849#discussion_r3030960460
##
spark/src/main/scala/org/apache/comet/serde/math.scala:
##
@@ -19,13 +19,19 @@
package org.apache.comet.serde
-import org.apache.spark.sql.catalyst.ex
jja725 opened a new issue, #1539:
URL: https://github.com/apache/datafusion-ballista/issues/1539
## Is your feature request related to a problem or challenge?
Ballista currently stores shuffle data on local executor disks and serves it
via Arrow Flight between executors. This creates
andygrove commented on code in PR #3890:
URL: https://github.com/apache/datafusion-comet/pull/3890#discussion_r3031003372
##
docs/source/contributor-guide/expression-audit-log.md:
##
@@ -0,0 +1,32 @@
+
+
+# Expression Audit Log
Review Comment:
I think it is important to keep
karuppayya opened a new issue, #3894:
URL: https://github.com/apache/datafusion-comet/issues/3894
### Describe the bug
Comet's Iceberg reflection path calls `table.operations().current()`.
The current implementation uses `getDeclaredMethod("current")` on the
concrete `operations` r
parthchandra commented on code in PR #3865:
URL: https://github.com/apache/datafusion-comet/pull/3865#discussion_r3031017576
##
native/spark-expr/src/utils.rs:
##
@@ -174,6 +174,19 @@ fn datetime_cast_err(value: i64) -> ArrowError {
))
}
+fn resolve_local_datetime(tz: &T
karuppayya opened a new pull request, #3895:
URL: https://github.com/apache/datafusion-comet/pull/3895
## Which issue does this PR close?
Closes #3894.
## Rationale for this change
Fix `NoSuchMethodException` from Iceberg Reflection
## What changes are included i
kazuyukitanimura commented on code in PR #3849:
URL: https://github.com/apache/datafusion-comet/pull/3849#discussion_r3031032898
##
spark/src/main/scala/org/apache/comet/serde/math.scala:
##
@@ -19,13 +19,19 @@
package org.apache.comet.serde
-import org.apache.spark.sql.cat
mbutrovich commented on issue #3882:
URL:
https://github.com/apache/datafusion-comet/issues/3882#issuecomment-4181293267
I mentioned this to @andygrove the other day, but applying compression (lz4,
snappy, etc.) at the batch granularity is likely too small to get all their
benefits. I'd be
neilconway commented on code in PR #21238:
URL: https://github.com/apache/datafusion/pull/21238#discussion_r3031035588
##
datafusion/functions/src/string/split_part.rs:
##
@@ -220,6 +231,190 @@ fn rsplit_nth<'a>(string: &'a str, delimiter: &str, n:
usize) -> Option<&'a str>
kazuyukitanimura commented on code in PR #3849:
URL: https://github.com/apache/datafusion-comet/pull/3849#discussion_r3031050891
##
spark/src/main/scala/org/apache/comet/serde/math.scala:
##
@@ -19,13 +19,19 @@
package org.apache.comet.serde
-import org.apache.spark.sql.cat
karuppayya commented on issue #3882:
URL:
https://github.com/apache/datafusion-comet/issues/3882#issuecomment-4181214277
@andygrove thanks for creating this issue
Adding some more details
| Records | Comet Shuffle Write | Standard Shuffle Write | Bytes/Record
(Comet) | Byt
xiedeyantu opened a new pull request, #18:
URL: https://github.com/apache/datafusion-testing/pull/18
(no comment)
zhuqi-lucas commented on PR #21182:
URL: https://github.com/apache/datafusion/pull/21182#issuecomment-4181485812
Strange - I tested locally (release build, --partitions 12 and --partitions
16) and found:
1. **Plans are identical** between main and PR for all 4 queries (SPM →
DataSour
adriangb commented on PR #21182:
URL: https://github.com/apache/datafusion/pull/21182#issuecomment-4181490730
So I guess we need to update the benchmarks?
We can also open a PR with no real changes to run the benchmarks.
Dandandan commented on PR #21327:
URL: https://github.com/apache/datafusion/pull/21327#issuecomment-4181880820
I like how small the PR is!
Shekharrajak commented on PR #3546:
URL:
https://github.com/apache/datafusion-comet/pull/3546#issuecomment-4181847420
Please trigger the CI checks
adriangbot commented on PR #21328:
URL: https://github.com/apache/datafusion/pull/21328#issuecomment-4181822009
🤖 Benchmark completed (GKE) |
[trigger](https://github.com/apache/datafusion/pull/21328#issuecomment-4181752523)
**Instance:** `c4a-highmem-16` (12 vCPU / 65 GiB)
CPU
martin-g commented on PR #3890:
URL:
https://github.com/apache/datafusion-comet/pull/3890#issuecomment-4181988026
> @kazuyukitanimura @martin-g This PR was created using the audit skill that
you reviewed
Really cool!
adriangbot commented on PR #21182:
URL: https://github.com/apache/datafusion/pull/21182#issuecomment-4182166368
🤖 Benchmark completed (GKE) |
[trigger](https://github.com/apache/datafusion/pull/21182#issuecomment-4182138276)
**Instance:** `c4a-highmem-16` (12 vCPU / 65 GiB)
CPU
viirya commented on PR #3876:
URL:
https://github.com/apache/datafusion-comet/pull/3876#issuecomment-4182168951
> The PR looks good to me, thanks @viirya may I ask you to add sql tests
like in #3891
Thanks for review. I will try to add sql tests.
zhuqi-lucas commented on PR #21182:
URL: https://github.com/apache/datafusion/pull/21182#issuecomment-4182101292
Update: found the root cause of Q1/Q3 regression and a fix.
**Root cause**: `SortPreservingMergeExec` uses `spawn_buffered(stream, 1)` -
only 1 batch prefetched per partiti
Dandandan commented on PR #21330:
URL: https://github.com/apache/datafusion/pull/21330#issuecomment-4182100364
run benchmarks
zhuqi-lucas commented on PR #21182:
URL: https://github.com/apache/datafusion/pull/21182#issuecomment-4182138276
run benchmarks
adriangbot commented on PR #21182:
URL: https://github.com/apache/datafusion/pull/21182#issuecomment-4182141591
🤖 Benchmark running (GKE) |
[trigger](https://github.com/apache/datafusion/pull/21182#issuecomment-4182138276)
**Instance:** `c4a-highmem-16` (12 vCPU / 65 GiB) | `Linux
bench-
adriangbot commented on PR #21330:
URL: https://github.com/apache/datafusion/pull/21330#issuecomment-4182140849
🤖 Benchmark completed (GKE) |
[trigger](https://github.com/apache/datafusion/pull/21330#issuecomment-4182100364)
**Instance:** `c4a-highmem-16` (12 vCPU / 65 GiB)
CPU
JeelRajodiya opened a new pull request, #21331:
URL: https://github.com/apache/datafusion/pull/21331
**Rationale**
The `datafusion-spark` crate is missing the `encode` function. Spark's
[`encode(expr,
charset)`](https://spark.apache.org/docs/latest/api/sql/index.html#encode)
convert
adriangbot commented on PR #21182:
URL: https://github.com/apache/datafusion/pull/21182#issuecomment-4182189680
🤖 Benchmark completed (GKE) |
[trigger](https://github.com/apache/datafusion/pull/21182#issuecomment-4182138276)
**Instance:** `c4a-highmem-16` (12 vCPU / 65 GiB)
CPU
adriangbot commented on PR #21182:
URL: https://github.com/apache/datafusion/pull/21182#issuecomment-4182191418
🤖 Benchmark completed (GKE) |
[trigger](https://github.com/apache/datafusion/pull/21182#issuecomment-4182138276)
**Instance:** `c4a-highmem-16` (12 vCPU / 65 GiB)
CPU
adriangbot commented on PR #21330:
URL: https://github.com/apache/datafusion/pull/21330#issuecomment-4182108390
🤖 Benchmark running (GKE) |
[trigger](https://github.com/apache/datafusion/pull/21330#issuecomment-4182100364)
**Instance:** `c4a-highmem-16` (12 vCPU / 65 GiB) | `Linux
bench-
adriangbot commented on PR #21182:
URL: https://github.com/apache/datafusion/pull/21182#issuecomment-4182124608
🤖 Benchmark running (GKE) |
[trigger](https://github.com/apache/datafusion/pull/21182#issuecomment-4182117219)
**Instance:** `c4a-highmem-16` (12 vCPU / 65 GiB) | `Linux
bench-
adriangbot commented on PR #21330:
URL: https://github.com/apache/datafusion/pull/21330#issuecomment-4182108145
🤖 Benchmark running (GKE) |
[trigger](https://github.com/apache/datafusion/pull/21330#issuecomment-4182100364)
**Instance:** `c4a-highmem-16` (12 vCPU / 65 GiB) | `Linux
bench-
adriangbot commented on PR #21330:
URL: https://github.com/apache/datafusion/pull/21330#issuecomment-4182108274
🤖 Benchmark running (GKE) |
[trigger](https://github.com/apache/datafusion/pull/21330#issuecomment-4182100364)
**Instance:** `c4a-highmem-16` (12 vCPU / 65 GiB) | `Linux
bench-
adriangbot commented on PR #21330:
URL: https://github.com/apache/datafusion/pull/21330#issuecomment-4182158396
🤖 Benchmark completed (GKE) |
[trigger](https://github.com/apache/datafusion/pull/21330#issuecomment-4182100364)
**Instance:** `c4a-highmem-16` (12 vCPU / 65 GiB)
CPU
adriangbot commented on PR #21330:
URL: https://github.com/apache/datafusion/pull/21330#issuecomment-4182158896
🤖 Benchmark completed (GKE) |
[trigger](https://github.com/apache/datafusion/pull/21330#issuecomment-4182100364)
**Instance:** `c4a-highmem-16` (12 vCPU / 65 GiB)
CPU
adriangbot commented on PR #21182:
URL: https://github.com/apache/datafusion/pull/21182#issuecomment-4182157972
🤖 Benchmark completed (GKE) |
[trigger](https://github.com/apache/datafusion/pull/21182#issuecomment-4182117219)
**Instance:** `c4a-highmem-16` (12 vCPU / 65 GiB)
CPU
Dandandan opened a new issue, #21329:
URL: https://github.com/apache/datafusion/issues/21329
### Is your feature request related to a problem or challenge?
SortPreservingMergeExec::execute() eagerly calls execute() on all input
partitions and spawns buffered tasks immediately, befor
Dandandan opened a new pull request, #21330:
URL: https://github.com/apache/datafusion/pull/21330
## Which issue does this PR close?
## Rationale for this change
Currently, `RepartitionExec::execute()` eagerly calls
`ensure_input_streams_initialized()` which opens all i
Dandandan commented on PR #21328:
URL: https://github.com/apache/datafusion/pull/21328#issuecomment-4182041172
cc @neilconway
Dandandan commented on PR #21330:
URL: https://github.com/apache/datafusion/pull/21330#issuecomment-4182039448
run benchmarks
adriangbot commented on PR #21330:
URL: https://github.com/apache/datafusion/pull/21330#issuecomment-4182047049
Benchmark for [this
request](https://github.com/apache/datafusion/pull/21330#issuecomment-4182039448)
failed.
Last 20 lines of output:
```
* [
adriangbot commented on PR #21330:
URL: https://github.com/apache/datafusion/pull/21330#issuecomment-4182047804
Benchmark for [this
request](https://github.com/apache/datafusion/pull/21330#issuecomment-4182039448)
failed.
Last 20 lines of output:
```
* [
adriangbot commented on PR #21330:
URL: https://github.com/apache/datafusion/pull/21330#issuecomment-4182047170
Benchmark for [this
request](https://github.com/apache/datafusion/pull/21330#issuecomment-4182039448)
failed.
Last 20 lines of output:
```
* [
zhuqi-lucas commented on PR #21182:
URL: https://github.com/apache/datafusion/pull/21182#issuecomment-4182117219
run benchmark sort_pushdown_sorted
adriangbot commented on PR #21182:
URL: https://github.com/apache/datafusion/pull/21182#issuecomment-4182144813
🤖 Benchmark running (GKE) |
[trigger](https://github.com/apache/datafusion/pull/21182#issuecomment-4182138276)
**Instance:** `c4a-highmem-16` (12 vCPU / 65 GiB) | `Linux
bench-
adriangbot commented on PR #21182:
URL: https://github.com/apache/datafusion/pull/21182#issuecomment-4182144805
🤖 Benchmark running (GKE) |
[trigger](https://github.com/apache/datafusion/pull/21182#issuecomment-4182138276)
**Instance:** `c4a-highmem-16` (12 vCPU / 65 GiB) | `Linux
bench-
Dandandan commented on PR #21330:
URL: https://github.com/apache/datafusion/pull/21330#issuecomment-4182198021
Ok - this is currently slower for plans with limited concurrency (tpcds),
perhaps slightly better for `clickbench_partitioned`. I think we can wait until
morsel-splitting and see if
Dandandan closed pull request #21330: Defer task spawning in RepartitionExec to
first poll
URL: https://github.com/apache/datafusion/pull/21330
andygrove commented on code in PR #3849:
URL: https://github.com/apache/datafusion-comet/pull/3849#discussion_r3028427152
##
spark/src/main/scala/org/apache/comet/serde/math.scala:
##
@@ -213,24 +219,6 @@ object CometAbs extends CometExpressionSerde[Abs] with
MathExprBase {
andygrove commented on PR #3757:
URL:
https://github.com/apache/datafusion-comet/pull/3757#issuecomment-4178421165
@YutaLin could you run `cargo fmt --all` to fix lint failures
adriangb commented on PR #21068:
URL: https://github.com/apache/datafusion/pull/21068#issuecomment-4178283016
Thanks @kosiew ! Feel free to merge.
andygrove commented on code in PR #3875:
URL: https://github.com/apache/datafusion-comet/pull/3875#discussion_r3028384790
##
spark/src/main/scala/org/apache/comet/serde/structs.scala:
##
@@ -105,53 +105,37 @@ object CometGetArrayStructFields extends
CometExpressionSerde[GetArra
andygrove commented on code in PR #3875:
URL: https://github.com/apache/datafusion-comet/pull/3875#discussion_r3028390361
##
spark/src/main/scala/org/apache/comet/serde/structs.scala:
##
@@ -105,53 +105,37 @@ object CometGetArrayStructFields extends
CometExpressionSerde[GetArra
andygrove commented on code in PR #3849:
URL: https://github.com/apache/datafusion-comet/pull/3849#discussion_r3028420858
##
spark/src/test/scala/org/apache/comet/CometExpressionSuite.scala:
##
@@ -1346,7 +1345,7 @@ class CometExpressionSuite extends CometTestBase with
Adaptive
andygrove commented on code in PR #3848:
URL: https://github.com/apache/datafusion-comet/pull/3848#discussion_r3028473095
##
spark/src/test/resources/sql-tests/expressions/math/ceil.sql:
##
@@ -15,7 +15,6 @@
-- specific language governing permissions and limitations
-- under t
andygrove commented on PR #3757:
URL:
https://github.com/apache/datafusion-comet/pull/3757#issuecomment-4178458553
Some test suggestions for edge cases that could reveal incompatibilities.
The main risk is that Comet casts all inputs to f64 before accumulation, while
Spark stores original
hcrosse commented on code in PR #1537:
URL:
https://github.com/apache/datafusion-ballista/pull/1537#discussion_r3028594421
##
ballista/core/src/utils.rs:
##
@@ -159,42 +159,74 @@ pub fn default_config_producer() -> SessionConfig {
SessionConfig::new_with_ballista()
}
-/
andygrove closed issue #1894: How to properly measure off-heap memory usage for
Comet?
URL: https://github.com/apache/datafusion-comet/issues/1894
andygrove commented on issue #1894:
URL:
https://github.com/apache/datafusion-comet/issues/1894#issuecomment-4178475112
Here are a few approaches depending on what you're trying to measure:
### Measuring Comet's off-heap memory usage
**Option 1: Tracing with jemalloc (recommend
hcrosse commented on code in PR #1537:
URL:
https://github.com/apache/datafusion-ballista/pull/1537#discussion_r3028568797
##
ballista/core/src/execution_plans/shuffle_writer.rs:
##
@@ -255,96 +252,114 @@ impl ShuffleWriterExec {
}
Some(Parti
hcrosse commented on code in PR #1537:
URL:
https://github.com/apache/datafusion-ballista/pull/1537#discussion_r3028649892
##
benchmarks/src/bin/shuffle_bench.rs:
##
@@ -240,19 +275,53 @@ async fn benchmark_sort_shuffle(
output_partitions,
),
conf
hcrosse commented on code in PR #1537:
URL:
https://github.com/apache/datafusion-ballista/pull/1537#discussion_r3028654374
##
ballista/core/src/execution_plans/shuffle_writer.rs:
##
@@ -214,14 +214,12 @@ impl ShuffleWriterExec {
match output_partitioning {
comphead commented on PR #21075:
URL: https://github.com/apache/datafusion/pull/21075#issuecomment-4178577955
Thanks @xiedeyantu I'll take a look this week, would be super useful for
users and also for regression to have internal microbenchmarks, similar to
`datafusion/core/benches/push_dow
jonahgao commented on PR #21312:
URL: https://github.com/apache/datafusion/pull/21312#issuecomment-4178579399
Thank you @2010YOUY01 for the review.
jonahgao merged PR #21312:
URL: https://github.com/apache/datafusion/pull/21312
adriangb opened a new pull request, #21322:
URL: https://github.com/apache/datafusion/pull/21322
## Which issue does this PR close?
N/A - new feature
## Rationale for this change
DuckDB provides a [`cast_to_type(expression,
reference)`](https://duckdb.org/docs/current/sq
buraksenn commented on PR #21082:
URL: https://github.com/apache/datafusion/pull/21082#issuecomment-4178598612
> lgtm, thanks @buraksenn
>
> Might be worth double-checking a couple of things:
>
> 1. `current_time()`: Europe/London in December should be UTC+0, so the
+1h off
zhuqi-lucas commented on PR #21266:
URL: https://github.com/apache/datafusion/pull/21266#issuecomment-4175208553
@adriangb Updated! Much simpler now - just `tpchgen --parts=3` + `mv` to
rename files. No datafusion-cli needed.
The reversed naming produces clear benchmark differences (r
Shekharrajak commented on PR #2994:
URL:
https://github.com/apache/datafusion-comet/pull/2994#issuecomment-4175310971
Found an issue in CI checks:
https://github.com/apache/datafusion-comet/issues/3881
Shekharrajak opened a new issue, #3881:
URL: https://github.com/apache/datafusion-comet/issues/3881
### What is the problem the feature request solves?
Comet does not support ExistenceJoin, causing incorrect results for
correlated IN subqueries combined with OR on Spark 4.0. Adding na
Dandandan commented on PR #21240:
URL: https://github.com/apache/datafusion/pull/21240#issuecomment-4175309833
> > In principle self.right.execute only builds the stream - it shouldn't do
any "actual" work, only the setup
>
> Is that true in practice? e.g.,
>
> * CoalescePartit
tobixdev commented on code in PR #21291:
URL: https://github.com/apache/datafusion/pull/21291#discussion_r3026627995
##
datafusion/common/src/types/canonical_extensions/bool8.rs:
##
@@ -0,0 +1,133 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more cont
xiedeyantu opened a new issue, #21316:
URL: https://github.com/apache/datafusion/issues/21316
### Describe the bug
When `GROUPING SETS` contains duplicate grouping lists, DataFusion
incorrectly collapses them during execution. The internal `grouping_id` only
encodes the semantic null
manuzhang commented on code in PR #3753:
URL: https://github.com/apache/datafusion-comet/pull/3753#discussion_r3027533477
##
native/core/src/execution/jni_api.rs:
##
@@ -778,33 +778,31 @@ pub unsafe extern "system" fn
Java_org_apache_comet_Native_writeSortedFileNative
comp
xiedeyantu commented on issue #21316:
URL: https://github.com/apache/datafusion/issues/21316#issuecomment-4176949882
take
xiedeyantu commented on issue #21316:
URL: https://github.com/apache/datafusion/issues/21316#issuecomment-4177000259
Hi @alamb @neilconway, I have logged an issue here to describe this bug. If
you have time, please review it. Thanks!
xudong963 commented on code in PR #162:
URL: https://github.com/apache/datafusion-site/pull/162#discussion_r3026588135
##
content/blog/2026-03-25-datafusion-53.0.0.md:
##
@@ -0,0 +1,403 @@
+---
+layout: post
+title: Apache DataFusion 53.0.0 Released
+date: 2026-03-25
Review Com
blaginin merged PR #21134:
URL: https://github.com/apache/datafusion/pull/21134
SubhamSinghal commented on code in PR #21303:
URL: https://github.com/apache/datafusion/pull/21303#discussion_r3027721541
##
datafusion/optimizer/src/eliminate_outer_join.rs:
##
@@ -436,6 +454,221 @@ mod tests {
")
}
+#[test]
+fn eliminate_left_with_in_li
andygrove opened a new issue, #3882:
URL: https://github.com/apache/datafusion-comet/issues/3882
### Describe the bug
The current shuffle format writes each batch using the Arrow IPC Stream
format, writing a single batch per stream instance, which means that the schema
is encoded for
zhuqi-lucas opened a new issue, #21317:
URL: https://github.com/apache/datafusion/issues/21317
**Is your feature request related to a problem or challenge?**
Currently sort pushdown reorders **files** by min/max statistics to achieve
sort elimination. But within each file, row groups
zhuqi-lucas commented on code in PR #21182:
URL: https://github.com/apache/datafusion/pull/21182#discussion_r3027914806
##
datafusion/datasource-parquet/src/source.rs:
##
@@ -811,11 +819,6 @@ impl FileSource for ParquetSource {
Ok(SortOrderPushdownResult::Inexact {
neilconway commented on PR #21238:
URL: https://github.com/apache/datafusion/pull/21238#issuecomment-4177731241
@martin-g Any interest in reviewing this PR? It's a follow-on to the initial
`split_work` work that was done in #21119
alamb commented on issue #21231:
URL: https://github.com/apache/datafusion/issues/21231#issuecomment-4177839220
From what I can tell, the core issue is that CaseWhen currently assumes
input column references can be discovered by finding built-in `Column` physical
exprs. That is not true for
andygrove commented on code in PR #3781:
URL: https://github.com/apache/datafusion-comet/pull/3781#discussion_r3028168645
##
spark/src/test/scala/org/apache/comet/parquet/ParquetReadSuite.scala:
##
@@ -981,9 +981,12 @@ abstract class ParquetReadSuite extends CometTestBase {
adriangb merged PR #21059:
URL: https://github.com/apache/datafusion/pull/21059
adriangb closed issue #21065: Dynamic filters sometimes do not get pushed down
through aggregations
URL: https://github.com/apache/datafusion/issues/21065
SubhamSinghal opened a new issue, #21320:
URL: https://github.com/apache/datafusion/issues/21320
### Is your feature request related to a problem or challenge?
The `PropagateEmptyRelation` optimizer rule correctly handles inner joins,
semi joins, and anti joins when one or both side
adriangb commented on PR #21182:
URL: https://github.com/apache/datafusion/pull/21182#issuecomment-4178037387
run benchmark sort_pushdown_sorted
adriangbot commented on PR #21182:
URL: https://github.com/apache/datafusion/pull/21182#issuecomment-4178061675
🤖 Benchmark running (GKE) |
[trigger](https://github.com/apache/datafusion/pull/21182#issuecomment-4178037387)
**Instance:** `c4a-highmem-16` (12 vCPU / 65 GiB) | `Linux
bench-
adriangbot commented on PR #21182:
URL: https://github.com/apache/datafusion/pull/21182#issuecomment-4178115627
🤖 Benchmark completed (GKE) |
[trigger](https://github.com/apache/datafusion/pull/21182#issuecomment-4178037387)
**Instance:** `c4a-highmem-16` (12 vCPU / 65 GiB)
CPU
andygrove merged PR #3880:
URL: https://github.com/apache/datafusion-comet/pull/3880
andygrove commented on PR #3781:
URL:
https://github.com/apache/datafusion-comet/pull/3781#issuecomment-4178630280
@parthchandra @comphead @mbutrovich Thanks for the feedback so far. I
simplified this PR so that `auto` mode now chooses `native_datafusion`
**instead of** `native_iceberg_com
xiedeyantu commented on PR #21058:
URL: https://github.com/apache/datafusion/pull/21058#issuecomment-4178647113
> Sorry for the delayed response, @xiedeyantu !
>
> Thanks for revising this. I'm a bit concerned by the overhead here; we are
adding a `UInt32` column to _every_ query with