github
Messages by Thread
[I] Comet JVM UDF implementations cannot be created in `spark` module [datafusion-comet]
via GitHub
[PR] Credential provioder support [datafusion-comet]
via GitHub
Re: [PR] Credential provioder support [datafusion-comet]
via GitHub
[I] panic: capacity overflow in generate_series / range with large i64 arguments [datafusion]
via GitHub
[I] panic on SET datafusion.runtime.* when value ends with non-ASCII byte (split_at char boundary) [datafusion]
via GitHub
[PR] Fix panic on deep compound identifiers [datafusion]
via GitHub
Re: [I] [Feature] Support Spark expression: make_dt_interval [datafusion-comet]
via GitHub
[I] Slow logical planning (>1s) for regexp_like with nested {N} quantifiers [datafusion]
via GitHub
[I] panic in SQL planner: compound identifier with ≥6 parts hits unreachable .unwrap() in identifier.rs [datafusion]
via GitHub
[PR] test: probe shaded Arrow imports for CometUDF impl in spark module [datafusion-comet]
via GitHub
Re: [PR] test: probe shaded Arrow imports for CometUDF impl in spark module [IGNORE] [datafusion-comet]
via GitHub
[PR] chore: Switch over stray old-style string builder in `substr` [datafusion]
via GitHub
[PR] test: add DateFormat smoke test for JVM UDF framework [datafusion-comet]
via GitHub
[PR] perf: Optimize `overlay` with new string builder [datafusion]
via GitHub
[I] Optimize overlay using `append_with` [datafusion]
via GitHub
Re: [I] Optimize overlay using `append_with` [datafusion]
via GitHub
Re: [I] [Feature] Support Spark expression: make_timestamp [datafusion-comet]
via GitHub
[I] Flaky CI with `datafusion-ffi` [datafusion]
via GitHub
[I] Credential Provider Support [datafusion-comet]
via GitHub
[PR] Support Spark expression: local_timestamp [datafusion-comet]
via GitHub
[PR] docs: move changelogs from dev/ to docs/source/changelog/ [datafusion-comet]
via GitHub
[I] Drop support for Spark 3.4 [datafusion-comet]
via GitHub
[PR] minor: change log level for few statements [datafusion-ballista]
via GitHub
Re: [PR] minor: change log level for few statements [datafusion-ballista]
via GitHub
Re: [PR] proto: add proto converter reference to PhysicalExtensionCodec trait [datafusion]
via GitHub
[PR] Add SQL as a category in breaking API change policy [datafusion]
via GitHub
Re: [PR] Add SQL as a category in breaking API change policy [datafusion]
via GitHub
[PR] feat: disable Comet by default when CometShuffleManager is not registered [datafusion-comet]
via GitHub
Re: [I] ci: use ubuntu-slim where applicable [datafusion-comet]
via GitHub
[I] Frequent CI failures for Spark 4.0.2 / JDK 21 [datafusion-comet]
via GitHub
Re: [I] Add support for `size` expression [datafusion-comet]
via GitHub
Re: [I] Add support for `size` expression [datafusion-comet]
via GitHub
Re: [I] Add support for scalar UDFs that operate on Arrow data [datafusion-comet]
via GitHub
Re: [I] Add support for scalar UDFs that operate on Arrow data [datafusion-comet]
via GitHub
[I] Upgrade workspace to Rust 1.95 [datafusion]
via GitHub
[PR] Update Rust toolchain to 1.95 [datafusion]
via GitHub
Re: [I] [DISCUSSION] Future of Dynamic Filters Sync [datafusion]
via GitHub
Re: [I] [DISCUSSION] Future of Dynamic Filters Sync [datafusion]
via GitHub
[PR] fix: propagate inner-field metadata through make_array and array_agg [datafusion]
via GitHub
[PR] ci: switch 8 workflows to ubuntu-slim [datafusion-comet]
via GitHub
Re: [PR] ci: use ubuntu-slim for lightweight jobs [datafusion-comet]
via GitHub
Re: [PR] ci: use ubuntu-slim for lightweight jobs [datafusion-comet]
via GitHub
Re: [PR] ci: use ubuntu-slim for lightweight jobs [datafusion-comet]
via GitHub
Re: [PR] ci: use ubuntu-slim for lightweight jobs [datafusion-comet]
via GitHub
Re: [PR] ci: use ubuntu-slim for lightweight jobs [datafusion-comet]
via GitHub
Re: [PR] ci: use ubuntu-slim for lightweight jobs [datafusion-comet]
via GitHub
[PR] Brent/case hash fix [datafusion]
via GitHub
Re: [PR] Brent/case hash fix [datafusion]
via GitHub
Re: [PR] fix: Nested self-referential CASE chains should not cause exponential hashing work during physical planning. [datafusion]
via GitHub
Re: [PR] fix: Nested self-referential CASE chains should not cause exponential hashing work during physical planning. [datafusion]
via GitHub
Re: [PR] fix: Nested self-referential CASE chains should not cause exponential hashing work during physical planning. [datafusion]
via GitHub
Re: [PR] fix: Nested self-referential CASE chains should not cause exponential hashing work during physical planning. [datafusion]
via GitHub
Re: [PR] fix: Nested self-referential CASE chains should not cause exponential hashing work during physical planning. [datafusion]
via GitHub
Re: [PR] fix: Nested self-referential CASE chains should not cause exponential hashing work during physical planning. [datafusion]
via GitHub
Re: [PR] fix: Nested self-referential CASE chains should not cause exponential hashing work during physical planning. [datafusion]
via GitHub
Re: [PR] fix: Nested self-referential CASE chains should not cause exponential hashing work during physical planning. [datafusion]
via GitHub
Re: [PR] fix: Nested self-referential CASE chains should not cause exponential hashing work during physical planning. [datafusion]
via GitHub
[PR] refactor: merge comet-common module into comet-spark [datafusion-comet]
via GitHub
[PR] feat: fix windows decimal casting frame [datafusion]
via GitHub
Re: [PR] feat: fix windows decimal casting frame [datafusion]
via GitHub
Re: [PR] feat: fix windows decimal casting frame [datafusion]
via GitHub
Re: [PR] feat: fix windows decimal casting frame [datafusion]
via GitHub
Re: [PR] feat: fix windows decimal casting frame [datafusion]
via GitHub
Re: [PR] feat: fix windows decimal casting frame [datafusion]
via GitHub
Re: [PR] feat: fix windows decimal casting frame [datafusion]
via GitHub
Re: [PR] feat: fix windows decimal casting frame [datafusion]
via GitHub
[PR] docs: add versioning policy [datafusion-comet]
via GitHub
[I] Physical planning CPU blowup hashing nested CASE expressions [datafusion]
via GitHub
Re: [I] Physical planning CPU blowup hashing nested CASE expressions [datafusion]
via GitHub
Re: [I] Physical planning CPU blowup hashing nested CASE expressions [datafusion]
via GitHub
Re: [I] Physical planning CPU blowup hashing nested CASE expressions [datafusion]
via GitHub
Re: [I] Physical planning CPU blowup hashing nested CASE expressions [datafusion]
via GitHub
[PR] perf: bypass values.value(i) for inline strings in ArrowBytesViewMap [datafusion]
via GitHub
Re: [PR] perf: bypass values.value(i) for inline strings in ArrowBytesViewMap [datafusion]
via GitHub
Re: [PR] perf: bypass values.value(i) for inline strings in ArrowBytesViewMap [datafusion]
via GitHub
Re: [PR] perf: bypass values.value(i) for inline strings in ArrowBytesViewMap [datafusion]
via GitHub
Re: [PR] perf: bypass values.value(i) for inline strings in ArrowBytesViewMap [datafusion]
via GitHub
Re: [PR] perf: bypass values.value(i) for inline strings in ArrowBytesViewMap [datafusion]
via GitHub
Re: [PR] perf: bypass values.value(i) for inline strings in ArrowBytesViewMap [datafusion]
via GitHub
Re: [PR] perf: bypass values.value(i) for inline strings in ArrowBytesViewMap [datafusion]
via GitHub
Re: [PR] perf: bypass values.value(i) for inline strings in ArrowBytesViewMap [datafusion]
via GitHub
Re: [PR] perf: bypass values.value(i) for inline strings in ArrowBytesViewMap [datafusion]
via GitHub
Re: [PR] perf: bypass values.value(i) for inline strings in ArrowBytesViewMap [datafusion]
via GitHub
Re: [PR] perf: bypass values.value(i) for inline strings in ArrowBytesViewMap [datafusion]
via GitHub
Re: [PR] perf: bypass values.value(i) for inline strings in ArrowBytesViewMap [datafusion]
via GitHub
Re: [PR] perf: bypass values.value(i) for inline strings in ArrowBytesViewMap [datafusion]
via GitHub
Re: [PR] perf: bypass values.value(i) for inline strings in ArrowBytesViewMap [datafusion]
via GitHub
[PR] perf: Optimize `translate` to use new bulk-NULL string builders [datafusion]
via GitHub
Re: [PR] perf: Optimize `translate` to use new bulk-NULL string builders [datafusion]
via GitHub
Re: [PR] perf: Optimize `translate` to use new bulk-NULL string builders [datafusion]
via GitHub
Re: [PR] perf: Optimize `translate` to use new bulk-NULL string builders [datafusion]
via GitHub
[PR] feat: detect Iceberg V2 writes and emit fall-back reasons [datafusion-comet]
via GitHub
[I] Writes to Apache Iceberg Tables [datafusion-comet]
via GitHub
[PR] feat(datetime): prototype JVM UDF path for Hour/Minute/Second (engine=java) [datafusion-comet]
via GitHub
[I] Optimize `translate` using `append_with` [datafusion]
via GitHub
Re: [I] Optimize `translate` using `append_with` [datafusion]
via GitHub
[PR] feat(udf): account JVM-UDF Arrow allocations to the Spark task [datafusion-comet]
via GitHub
[PR] Add support for logical and physical codecs [datafusion-python]
via GitHub
Re: [PR] Add support for logical and physical codecs [datafusion-python]
via GitHub
Re: [PR] Add support for logical and physical codecs [datafusion-python]
via GitHub
Re: [PR] Add support for logical and physical codecs [datafusion-python]
via GitHub
[PR] Make use of Swatinem/rust-cache to make the CI workflows faster [datafusion-ballista]
via GitHub
[PR] docs: show child links on Expression Compatibility page [datafusion-comet]
via GitHub
Re: [PR] docs: show child links on Expression Compatibility page [datafusion-comet]
via GitHub
[I] AbstractMethodError: CometBroadcastExchangeExec missing sparkContext() from BroadcastExchangeLike [datafusion-comet]
via GitHub
[I] Create Comet versioning policy [datafusion-comet]
via GitHub
Re: [I] Introduce `StringViewArrayBuilder::map` to avoid duplication [datafusion]
via GitHub
Re: [PR] test: add test that validate partial reduce with different number of state fields [datafusion]
via GitHub
[PR] Support DISTINCT ON with aggregation and windows [datafusion]
via GitHub
[PR] [TUI] Add a config setting for rendering job stage's plan as a tree [datafusion-ballista]
via GitHub
Re: [D] DISCUSSION: Apache DataFusion New York Meetup May 2026 [datafusion]
via GitHub
Re: [D] DISCUSSION: Apache DataFusion New York Meetup May 2026 [datafusion]
via GitHub
[D] DataFusion-Federation: Union Flattening Across Executors [datafusion]
via GitHub
[PR] feat(dataframe): add executeStream(allocator) for incremental batch iteration [datafusion-java]
via GitHub
Re: [PR] feat(dataframe): add executeStream(allocator) for incremental batch iteration [datafusion-java]
via GitHub
[PR] fix: REST API does not show running jobs [datafusion-ballista]
via GitHub
Re: [PR] fix: REST API does not show running jobs [datafusion-ballista]
via GitHub
Re: [PR] fix: REST API does not show running jobs [datafusion-ballista]
via GitHub
[I] feat(dataframe): add executeStream(allocator) for incremental batch iteration [datafusion-java]
via GitHub
[I] CREATE TABLE AS not checking column unicity [datafusion]
via GitHub
Re: [I] CREATE TABLE AS not checking column unicity [datafusion]
via GitHub
[PR] Refactor Spark `format_string` numeric `%c` conversion dispatch [datafusion]
via GitHub
[PR] fix: reduce memory allocation overhead during partial aggregation ear… [datafusion]
via GitHub
[I] Extra memory allocated during partial aggregation early emit during OOM handling [datafusion]
via GitHub
[I] Refactor: Centralize numeric `%c` formatting dispatch in format_string.rs [datafusion]
via GitHub
[PR] Add blog: Sort Pushdown in DataFusion: Skip Sorts, Skip I/O [datafusion-site]
via GitHub
[PR] feat(builder): expose ConfigOptions.set/get as setOption / setOptions / getOption [datafusion-java]
via GitHub
Re: [PR] feat(builder): expose ConfigOptions.set/get as setOption / setOptions / getOption [datafusion-java]
via GitHub
Re: [PR] feat(builder): expose ConfigOptions.set/get as setOption / setOptions / getOption [datafusion-java]
via GitHub
Re: [PR] feat(builder): expose ConfigOptions.set/get as setOption / setOptions / getOption [datafusion-java]
via GitHub
[I] feat: expose ConfigOptions.set as a generic SessionContextBuilder.setOption(key, value) [datafusion-java]
via GitHub
Re: [I] feat: expose ConfigOptions.set/get as generic SessionContextBuilder.setOption / SessionContext.getOption [datafusion-java]
via GitHub
Re: [PR] Split proto serialization to encapsulate private state (#21835) [datafusion]
via GitHub
[PR] chore(deps): bump pytest from 9.0.2 to 9.0.3 in /python [datafusion-ballista]
via GitHub
Re: [PR] chore(deps): bump pytest from 9.0.2 to 9.0.3 in /python [datafusion-ballista]
via GitHub
[PR] Fix extension type metadata propagation through casts [datafusion]
via GitHub
[PR] Optimize away unused `UNNEST` under duplicate-insensitive aggregates [datafusion]
via GitHub
[PR] build(deps): bump pyjwt from 2.10.1 to 2.12.0 [datafusion-python]
via GitHub
[PR] feat(parquet): two-stage access-plan hooks with shared async reader [datafusion]
via GitHub
[PR] feat(json): expose NdJsonReadOptions via registerJson and readJson [datafusion-java]
via GitHub
Re: [PR] feat(json): expose NdJsonReadOptions via registerJson and readJson [datafusion-java]
via GitHub
Re: [PR] feat(json): expose NdJsonReadOptions via registerJson and readJson [datafusion-java]
via GitHub
Re: [PR] feat(json): expose NdJsonReadOptions via registerJson and readJson [datafusion-java]
via GitHub
Re: [PR] feat(json): expose NdJsonReadOptions via registerJson and readJson [datafusion-java]
via GitHub
Re: [PR] feat(json): expose NdJsonReadOptions via registerJson and readJson [datafusion-java]
via GitHub
Re: [PR] feat: support optional threshold parameter for levenshtein function [datafusion]
via GitHub
[I] KEDA scaler `pending_jobs` metric appears insufficient for scaling due to rapid task assignment by scheduler [datafusion-ballista]
via GitHub
[PR] feat: add Java scalar UDF support [datafusion-java]
via GitHub
Re: [PR] test: add SQL test coverage for spark.sql.legacy.timeParserPolicy [datafusion-comet]
via GitHub
Re: [PR] test: add SQL test coverage for spark.sql.legacy.timeParserPolicy [datafusion-comet]
via GitHub
[PR] build(deps): bump pygments from 2.19.1 to 2.20.0 [datafusion-python]
via GitHub
[PR] chore(deps): bump pyjwt from 2.10.1 to 2.12.0 in /python [datafusion-ballista]
via GitHub
Re: [PR] chore(deps): bump pyjwt from 2.10.1 to 2.12.0 in /python [datafusion-ballista]
via GitHub
[PR] build(deps): bump requests from 2.32.3 to 2.33.0 [datafusion-python]
via GitHub
Re: [I] [Spark 4.0] Add string collation support [datafusion-comet]
via GitHub
[I] feat(dataframe): expose withColumn and unnestColumns [datafusion-java]
via GitHub
[I] feat(dataframe): expose introspection methods (schema, explain, cache, describe) [datafusion-java]
via GitHub
[I] design: DataFrame joins (join, joinOn) and the Java Expr question [datafusion-java]
via GitHub
[I] feat(dataframe): expose set operations (union, intersect, except) [datafusion-java]
via GitHub
[I] feat(dataframe): expose sort and repartition [datafusion-java]
via GitHub
[I] native_datafusion: ParquetSchemaConvert error does not include the file path [datafusion-comet]
via GitHub
Re: [I] native_datafusion: ParquetSchemaConvert error does not include the file path [datafusion-comet]
via GitHub
[I] feat: add DataFrame.writeCsv with CsvWriteOptions [datafusion-java]
via GitHub
[I] feat: expose Avro reader via registerAvro and readAvro [datafusion-java]
via GitHub
[I] bug: SessionContext.close() / DataFrame.close() race with concurrent JNI calls (use-after-free) [datafusion-java]
via GitHub
[I] feat: add DataFrame.writeJson with JsonWriteOptions [datafusion-java]
via GitHub
[I] feat: expose JSON reader via registerJson and readJson [datafusion-java]
via GitHub
[I] feat: expose Arrow IPC reader via registerArrow and readArrow [datafusion-java]
via GitHub
[PR] docs: remove project-status checklist [datafusion-java]
via GitHub
Re: [PR] docs: remove project-status checklist [datafusion-java]
via GitHub
[PR] build(deps): bump urllib3 from 2.3.0 to 2.7.0 [datafusion-python]
via GitHub
Re: [PR] feat: Native Delta Lake scan via delta-kernel-rs [datafusion-comet]
via GitHub
Re: [PR] feat: Native Delta Lake scan via delta-kernel-rs [datafusion-comet]
via GitHub
[I] Publish fat JAR with platform-specific native libraries to Maven Central [datafusion-java]
via GitHub
[PR] build(deps): bump pynacl from 1.5.0 to 1.6.2 [datafusion-python]
via GitHub
[PR] build: add examples module on a multi-module Maven build [datafusion-java]
via GitHub
Re: [PR] build: add examples module on a multi-module Maven build [datafusion-java]
via GitHub
[PR] docs: publish Javadoc as part of the User Guide [datafusion-java]
via GitHub
Re: [PR] docs: publish Javadoc as part of the User Guide [datafusion-java]
via GitHub
[PR] build(deps): bump cryptography from 44.0.0 to 46.0.7 [datafusion-python]
via GitHub
Re: [I] Automate breaking change detection [datafusion]
via GitHub
Re: [I] Automate breaking change detection [datafusion]
via GitHub
[PR] Call take arrays once per repartitioned input batch [datafusion]
via GitHub
Re: [PR] [WIP] Call take arrays once per repartitioned input batch [datafusion]
via GitHub
Re: [PR] [WIP] Call take arrays once per repartitioned input batch [datafusion]
via GitHub
Re: [PR] [WIP] Call take arrays once per repartitioned input batch [datafusion]
via GitHub
Re: [PR] [WIP] Call take arrays once per repartitioned input batch [datafusion]
via GitHub
Re: [PR] [WIP] Call take arrays once per repartitioned input batch [datafusion]
via GitHub
Re: [PR] [WIP] Call take arrays once per repartitioned input batch [datafusion]
via GitHub
Re: [PR] [WIP] Call take arrays once per repartitioned input batch [datafusion]
via GitHub
Re: [PR] [WIP] Call take arrays once per repartitioned input batch [datafusion]
via GitHub
Re: [PR] [WIP] Call take arrays once per repartitioned input batch [datafusion]
via GitHub
Re: [PR] Call take arrays once per repartitioned input batch [datafusion]
via GitHub
Re: [PR] Call take arrays once per repartitioned input batch [datafusion]
via GitHub
Re: [PR] Call take arrays once per repartitioned input batch [datafusion]
via GitHub
Re: [PR] Call take arrays once per repartitioned input batch [datafusion]
via GitHub
Re: [PR] Call take arrays once per repartitioned input batch [datafusion]
via GitHub
Re: [PR] Call take arrays once per repartitioned input batch [datafusion]
via GitHub
Re: [PR] Call take arrays once per repartitioned input batch [datafusion]
via GitHub
Re: [PR] Call take arrays once per repartitioned input batch [datafusion]
via GitHub