LiaCastaneda commented on code in PR #17529:
URL: https://github.com/apache/datafusion/pull/17529#discussion_r2401473105
##
datafusion/physical-plan/src/joins/hash_join/information_passing.rs:
##
@@ -0,0 +1,612 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+/
pepijnve commented on code in PR #17884:
URL: https://github.com/apache/datafusion/pull/17884#discussion_r2401508429
##
datafusion/physical-expr/src/expressions/in_list.rs:
##
@@ -1453,31 +1464,31 @@ mod tests {
let sql_string = fmt_sql(expr.as_ref()).to_string();
comphead commented on issue #2452:
URL:
https://github.com/apache/datafusion-comet/issues/2452#issuecomment-3358647566
```
ExternalSorterMerge[9]#1169(can spill: false) consumed 5.0 MB, peak 10.0 MB,
ExternalSorterMerge[9]#1171(can spill: false) consumed 866.9 KB, peak 10.0 MB,
kosiew commented on code in PR #17875:
URL: https://github.com/apache/datafusion/pull/17875#discussion_r2401562160
##
datafusion/core/tests/sql/explain_analyze.rs:
##
@@ -727,6 +727,98 @@ async fn parquet_explain_analyze() {
assert_contains!(&formatted, "row_groups_pruned_s
alamb commented on issue #16620:
URL: https://github.com/apache/datafusion/issues/16620#issuecomment-3365481120
@debajyoti-truefoundry -- could you possibly provide a self-contained
reproducer (e.g. the data you used for the queries above, or some synthetic
version that has the same proper
chenkovsky commented on PR #16161:
URL: https://github.com/apache/datafusion/pull/16161#issuecomment-3367555224
> Hi, I ran my code using this branch and unfortunately it did not solve my
issue (https://github.com/apache/datafusion/issues/16590).
Hi, could you please push your code, th
parthchandra commented on issue #2520:
URL:
https://github.com/apache/datafusion-comet/issues/2520#issuecomment-3366404371
Also, FWIW, that specific method has been historically problematic because
it is one of the rare places where Comet uses a private Spark method. Because
it is private,
alamb commented on PR #17906:
URL: https://github.com/apache/datafusion/pull/17906#issuecomment-3367082486
🤖 `./gh_compare_branch.sh` [Benchmark
Script](https://github.com/alamb/datafusion-benchmarking/blob/main/gh_compare_branch.sh)
Running
Linux aal-dev 6.14.0-1016-gcp #17~24.04.1-Ubun
andygrove commented on PR #2521:
URL:
https://github.com/apache/datafusion-comet/pull/2521#issuecomment-3367237253
Still experimenting...
https://github.com/user-attachments/assets/951feeba-e2d9-4c5d-b8db-5fac93f1cbc2
--
This is an automated message from the Apache Git Se
mach-kernel opened a new pull request, #17911:
URL: https://github.com/apache/datafusion/pull/17911
## Which issue does this PR close?
#15718, though I can file a new bug report. Should be easy to reproduce on a
round-tripped plan against a Hive-partitioned table.
## Rationale for this c
LucaCappelletti94 commented on PR #2037:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/2037#issuecomment-3364727585
Is there anything else needed to be done for this PR? @iffyio
Dandandan commented on issue #17267:
URL: https://github.com/apache/datafusion/issues/17267#issuecomment-3364471092
> From my tests SMJ requires more memory than HJ with a large number of
partitions (like 32), so switching from HJ to SMJ makes memory situation worse
not better in that case.
comphead opened a new pull request, #2500:
URL: https://github.com/apache/datafusion-comet/pull/2500
## Which issue does this PR close?
Closes #.
## Rationale for this change
## What changes are included in this PR?
## How are these changes
Smith-Cruise opened a new issue, #2055:
URL: https://github.com/apache/datafusion-sqlparser-rs/issues/2055
I'm implementing my own database, and its SQL grammar is similar to that of
`starrocks`, `doris`, or `oceanbase`.
I want to add `CATALOGS` and `SWITCH` keywords. Can I contribute code
directly?
albertoRamon commented on issue #17895:
URL: https://github.com/apache/datafusion/issues/17895#issuecomment-3365402091
I think this issue comes from the Rust installation (I had the same
issue).
As part of the Rust install, it automatically tries to install "Visual Build
Tools" and if y
LucaCappelletti94 opened a new issue, #2053:
URL: https://github.com/apache/datafusion-sqlparser-rs/issues/2053
Hi,
I need to implement traits describing for instance `CheckConstraint` and
`UniqueConstraint` and so on for various crates, including sqlparser.
These objects are c
ctsk opened a new issue, #17897:
URL: https://github.com/apache/datafusion/issues/17897
### Describe the bug
MinMaxBytesAccumulator's update_batch function has runtime that is quadratic in
the number of groups accumulated: On each update_batch call, the implementation
allocates a new ve
LucaCappelletti94 opened a new pull request, #2054:
URL: https://github.com/apache/datafusion-sqlparser-rs/pull/2054
As per title, this PR moved all of the struct variants out of the
`TableConstraint` enum. This is done to allow for implementing traits in
dependent crates which only apply t
alamb commented on PR #17888:
URL: https://github.com/apache/datafusion/pull/17888#issuecomment-3367208762
Ok, the tests are now looking good enough to test with the new thrift decoder
alamb commented on code in PR #17907:
URL: https://github.com/apache/datafusion/pull/17907#discussion_r2403374748
##
datafusion/common/src/scalar/mod.rs:
##
@@ -231,6 +233,10 @@ pub enum ScalarValue {
Float32(Option<f32>),
/// 64bit float
Float64(Option<f64>),
+/// 32bi
andygrove commented on PR #17902:
URL: https://github.com/apache/datafusion/pull/17902#issuecomment-3366089143
Actually, I have a better idea ... closing this for now
adriangb merged PR #17905:
URL: https://github.com/apache/datafusion/pull/17905
vbarua opened a new pull request, #17909:
URL: https://github.com/apache/datafusion/pull/17909
## Which issue does this PR close?
- Closes https://github.com/apache/datafusion/issues/16590.
## Rationale for this change
## What changes are included in this
Jefffrey opened a new issue, #17912:
URL: https://github.com/apache/datafusion/issues/17912
### Is your feature request related to a problem or challenge?
https://github.com/apache/datafusion/blob/daeb6597a0c7344735460bb2dce13879fd89d7bd/datafusion/expr/src/logical_plan/plan.rs#L2546-
parthchandra commented on code in PR #2447:
URL: https://github.com/apache/datafusion-comet/pull/2447#discussion_r2403518726
##
common/src/main/java/org/apache/comet/parquet/CometFileKeyUnwrapper.java:
##
@@ -0,0 +1,145 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF)
Jefffrey commented on code in PR #17871:
URL: https://github.com/apache/datafusion/pull/17871#discussion_r2403660919
##
datafusion/spark/src/function/aggregate/avg.rs:
##
@@ -0,0 +1,351 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor lic
kosiew commented on PR #1256:
URL:
https://github.com/apache/datafusion-python/pull/1256#issuecomment-3364588624
@timsaucer,
here's an example that shows the NotImplementedError
examples/table_capsule_failure.py
```python
"""Demonstrate how missing __datafusion_table_
pepijnve commented on PR #17419:
URL: https://github.com/apache/datafusion/pull/17419#issuecomment-3364630280
> `samply record -- target/profiling/deps/sql_planner-1adcb045f71bd635
--bench physical_plan_clickbench_q43`
I'll try samply next time. Was that on your macOS machine or on
Dandandan opened a new issue, #17894:
URL: https://github.com/apache/datafusion/issues/17894
### Describe the bug
Currently, the limit can be pushed down:
```
--GlobalLimitExec
BoundedWindowAggExec: wdw=
--SortPreservingMergeExec: [c1@2 ASC NULLS LAST, c2@3 ASC
milenkovicm commented on PR #1324:
URL:
https://github.com/apache/datafusion-ballista/pull/1324#issuecomment-3364708273
thanks @kevinjqliu
Jefffrey commented on issue #14633:
URL: https://github.com/apache/datafusion/issues/14633#issuecomment-3364438632
Filter clause for aggregations is now supported in generic dialect, see
#17807 (and its references #16516 and #15719)
e.g. on main:
```sh
datafusion-cli$ cargo
dependabot[bot] opened a new pull request, #17896:
URL: https://github.com/apache/datafusion/pull/17896
Bumps [taiki-e/install-action](https://github.com/taiki-e/install-action)
from 2.62.16 to 2.62.17.
Release notes
Sourced from https://github.com/taiki-e/install-action/releases
HeWhoHeWho opened a new issue, #17895:
URL: https://github.com/apache/datafusion/issues/17895
Hi,
I have been trying to get `datafusion-cli` installed via `cargo install
datafusion-cli` but kept getting the error:
`CMake Error at CMakeLists.txt:13 (project): Failed to run MSBuild c
kosiew commented on PR #17518:
URL: https://github.com/apache/datafusion/pull/17518#issuecomment-3364631364
@crystalxyz,
> I left some comments about the design here, but if you have other
priorities, let me know and I'm happy to help on this as well!
By all means. I would love
alamb opened a new issue, #17899:
URL: https://github.com/apache/datafusion/issues/17899
### Is your feature request related to a problem or challenge?
As reported in https://github.com/apache/datafusion/issues/16620 by
@debajyoti-truefoundry, evaluating `DISTINCT ON` results in a quer
alamb commented on issue #16620:
URL: https://github.com/apache/datafusion/issues/16620#issuecomment-3365500499
Filed
- https://github.com/apache/datafusion/issues/17899
HeWhoHeWho commented on issue #17895:
URL: https://github.com/apache/datafusion/issues/17895#issuecomment-3365518176
Thanks for the tip! Let me try that out in a bit.
Just curious, did you also add `path/to/bin/CMake.exe` to Path? Because I
did that, otherwise it will trigger error `m
dependabot[bot] opened a new pull request, #24:
URL: https://github.com/apache/datafusion-sandbox/pull/24
Bumps [taiki-e/install-action](https://github.com/taiki-e/install-action)
from 2.61.8 to 2.62.17.
Release notes
Sourced from https://github.com/taiki-e/install-action/releases
dependabot[bot] closed pull request #23: chore(deps): bump
taiki-e/install-action from 2.61.8 to 2.62.16
URL: https://github.com/apache/datafusion-sandbox/pull/23
dependabot[bot] commented on PR #23:
URL:
https://github.com/apache/datafusion-sandbox/pull/23#issuecomment-3365346334
Superseded by #24.
lfdversluis opened a new issue, #2520:
URL: https://github.com/apache/datafusion-comet/issues/2520
### Describe the bug
Following up from #2504, I compiled comet in CentOS 7 to build it against
glibc 2.17. Running a simple query, tasks are getting lost due to some metrics
function no
cfmcgrady commented on issue #2478:
URL:
https://github.com/apache/datafusion-comet/issues/2478#issuecomment-3345240223
> FYI:
https://datafusion.apache.org/user-guide/sql/scalar_functions.html#array-reverse
@wForget Thanks for pointing this out. I’ll submit a new PR mapping the
Spa
dependabot[bot] commented on PR #24:
URL:
https://github.com/apache/datafusion-sandbox/pull/24#issuecomment-3365346309
### Labels
The following labels could not be found: `auto-dependencies`. Please create
it before Dependabot can add it to a pull request.
Please fix the a
duongcongtoai commented on issue #17446:
URL: https://github.com/apache/datafusion/issues/17446#issuecomment-3366427064
```
Benchmark 1: uv run polar.py sample-1m.parquet
Time (mean ± σ): 258.6 ms ± 32.2 ms [User: 514.8 ms, System: 218.1 ms]
Range (min … max): 238.5 ms
alamb commented on issue #17897:
URL: https://github.com/apache/datafusion/issues/17897#issuecomment-3366527369
That makes sense
One thing we could do is to reuse the allocation (put a `mut buffer:
Vec` field on)
That does still result in non trivial memory and overhead howeve
adriangb commented on PR #17905:
URL: https://github.com/apache/datafusion/pull/17905#issuecomment-3367030105
Thank you @alamb!
andygrove commented on issue #2452:
URL:
https://github.com/apache/datafusion-comet/issues/2452#issuecomment-3367039203
The last chart was incorrect.
https://github.com/user-attachments/assets/fc942115-8d17-4b5f-8533-507501450d46
Here is where we run into issues:
`
alamb commented on PR #17907:
URL: https://github.com/apache/datafusion/pull/17907#issuecomment-3367045874
Why does this PR have substantially more lines than the original PR?
This PR
https://github.com/user-attachments/assets/f4adf867-1ad4-4b64-a563-2b7a765b5d18
Origina
HeWhoHeWho commented on issue #17895:
URL: https://github.com/apache/datafusion/issues/17895#issuecomment-3365688893
Double checked my MS Build Tools 2022, all relevant tools have been
installed with the path manually added just now - still, build failed.
This is the full _long_ log o
comphead commented on issue #17267:
URL: https://github.com/apache/datafusion/issues/17267#issuecomment-3366176816
SMJ may require more memory and incurs additional sorting cost, so HJ
would be faster in most cases. But SMJ is more robust under limited memory if
implemented correctly.
alamb merged PR #17837:
URL: https://github.com/apache/datafusion/pull/17837
duongcongtoai commented on issue #17446:
URL: https://github.com/apache/datafusion/issues/17446#issuecomment-3366781389
looks related: https://github.com/apache/datafusion/issues/17445
comphead commented on code in PR #2521:
URL: https://github.com/apache/datafusion-comet/pull/2521#discussion_r2402894515
##
spark/src/main/scala/org/apache/comet/CometExecIterator.scala:
##
@@ -87,9 +87,9 @@ class CometExecIterator(
CometSparkSessionExtensions.getCometMem
alamb commented on PR #17890:
URL: https://github.com/apache/datafusion/pull/17890#issuecomment-3366826040
🚀 -- thank you @AdamGS
buraksenn commented on issue #17899:
URL: https://github.com/apache/datafusion/issues/17899#issuecomment-3366854137
take
timsaucer commented on code in PR #1259:
URL:
https://github.com/apache/datafusion-python/pull/1259#discussion_r2401989522
##
.github/workflows/build.yml:
##
@@ -127,7 +127,7 @@ jobs:
build-macos-x86_64:
needs: [generate-license]
name: Mac x86_64
-runs-on: maco
alamb commented on PR #17898:
URL: https://github.com/apache/datafusion/pull/17898#issuecomment-3365752397
🤖: Benchmark completed
Details
```
group case_improvements main
- -
Dandandan closed issue #17894: Limit is not pushed down SortPreservingMergeExec
URL: https://github.com/apache/datafusion/issues/17894
alamb merged PR #17896:
URL: https://github.com/apache/datafusion/pull/17896
alamb commented on PR #17890:
URL: https://github.com/apache/datafusion/pull/17890#issuecomment-3366258560
Updated to get changes from https://github.com/apache/datafusion/pull/17892
comphead commented on PR #2515:
URL:
https://github.com/apache/datafusion-comet/pull/2515#issuecomment-3366308072
> > Thanks @andygrove
> > Its been a while since `backtraces` introduced in DF and I was thinking
to replace Comet errors with DFs? So they would have backtrace capabilities
AdamGS commented on PR #17907:
URL: https://github.com/apache/datafusion/pull/17907#issuecomment-3367099381
Ok, now it seems better. It's not a perfect match because some of the changes
were based upon https://github.com/apache/datafusion/pull/17459, which wasn't
backported.
alamb commented on PR #17906:
URL: https://github.com/apache/datafusion/pull/17906#issuecomment-3367176309
🤖 `./gh_compare_branch.sh` [Benchmark
Script](https://github.com/alamb/datafusion-benchmarking/blob/main/gh_compare_branch.sh)
Running
Linux aal-dev 6.14.0-1016-gcp #17~24.04.1-Ubun
alamb commented on PR #17906:
URL: https://github.com/apache/datafusion/pull/17906#issuecomment-3367176235
🤖: Benchmark completed
Details
```
Comparing HEAD and feature_nl-join-projection-push-down
Benchmark tpch_mem_sf1.json
andygrove commented on PR #2521:
URL:
https://github.com/apache/datafusion-comet/pull/2521#issuecomment-3367189413
moving to draft while I work on the Python scripts
dqkqd commented on PR #17852:
URL: https://github.com/apache/datafusion/pull/17852#issuecomment-3367791885
Thanks @Jefffrey
I added the tests to `window.slt`, however, this test passed even without
the change.
I verified with `datafusion-cli` and it failed.
I'm not sure what's
HeWhoHeWho commented on issue #17895:
URL: https://github.com/apache/datafusion/issues/17895#issuecomment-3367901824
Unfortunately, NASM is prohibited from being installed on my Windows work
machine.
Is there any workaround that lets me get through the installation process
without t
duongcongtoai opened a new pull request, #17915:
URL: https://github.com/apache/datafusion/pull/17915
## Which issue does this PR close?
- Closes #.
## Rationale for this change
## What changes are included in this PR?
## Are these changes t
iffyio commented on PR #2037:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/2037#issuecomment-3367962986
@LucaCappelletti94 could you take a look at [this
comment](https://github.com/apache/datafusion-sqlparser-rs/pull/2037#discussion_r2362178757)?
LucaCappelletti94 commented on PR #2037:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/2037#issuecomment-3367965425
Hi @iffyio, I replied to it here:
https://github.com/apache/datafusion-sqlparser-rs/pull/2037#discussion_r2379300179
alamb commented on PR #17841:
URL: https://github.com/apache/datafusion/pull/17841#issuecomment-3366268009
Needs an update to the new prost that is coming in arrow 57. Let's close this
for now
See
- https://github.com/apache/datafusion/pull/17888
Jefffrey commented on PR #17852:
URL: https://github.com/apache/datafusion/pull/17852#issuecomment-3367794464
> Thanks @Jefffrey
>
> I added the tests to `window.slt`, however, this test passed even without
the change. I verified with `datafusion-cli` and it failed. I'm not sure what's
parthchandra commented on code in PR #2521:
URL: https://github.com/apache/datafusion-comet/pull/2521#discussion_r2402994029
##
native/core/src/execution/memory_pools/logging_pool.rs:
##
@@ -0,0 +1,85 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more
danielhumanmod commented on PR #1309:
URL:
https://github.com/apache/datafusion-ballista/pull/1309#issuecomment-3367946791
> Sorry, for late reply @danielhumanmod I'm not quite sure, i guess all the
metrics are collected at the scheduler side, so scheduler should have it all
once job finis
Jefffrey commented on code in PR #17871:
URL: https://github.com/apache/datafusion/pull/17871#discussion_r2403661517
##
datafusion/spark/src/function/aggregate/avg.rs:
##
@@ -0,0 +1,337 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor lic
Jefffrey commented on PR #17852:
URL: https://github.com/apache/datafusion/pull/17852#issuecomment-3367817219
Raised #17914 btw just for tracking (don't know if this is happening in
other SLT files, but I assume so)
iffyio commented on code in PR #2054:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/2054#discussion_r2403806157
##
src/ast/table_constraints/check_constraint.rs:
##
@@ -0,0 +1,67 @@
+// Licensed to the Apache Software Foundation (ASF) under one
Review Comment:
vegarsti commented on PR #17891:
URL: https://github.com/apache/datafusion/pull/17891#issuecomment-3367964880
> Thanks @vegarsti I would love to see if there any performance
degradations, you can find benches in the project.
>
> Maybe we can have a separate test for this issues?
iffyio commented on code in PR #2024:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/2024#discussion_r2403803942
##
src/dialect/mod.rs:
##
@@ -596,6 +596,20 @@ pub trait Dialect: Debug + Any {
false
}
+/// Returns true if the dialect supports Co
LucaCappelletti94 commented on code in PR #2054:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/2054#discussion_r2403807532
##
src/ast/table_constraints/check_constraint.rs:
##
@@ -0,0 +1,67 @@
+// Licensed to the Apache Software Foundation (ASF) under one
Review