[I] [C++][Parquet] Page sizes for repeated columns can overflow int32 with page index enabled [arrow]

2025-07-07 Thread via GitHub
adamreeve opened a new issue, #47027: URL: https://github.com/apache/arrow/issues/47027 ### Describe the bug, including details regarding any error messages, version, and platform. I noticed a regression when upgrading from Arrow 19.0.1 to 20.0.0 and writing Parquet files with a repe

Re: [I] [C++][Docs] Increase minimum gcc version required for building Arrow C++ from 7.1 to 9 [arrow]

2025-07-07 Thread via GitHub
kou closed issue #47025: [C++][Docs] Increase minimum gcc version required for building Arrow C++ from 7.1 to 9 URL: https://github.com/apache/arrow/issues/47025 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL abo

[I] [C++] Increase minimum gcc version required for building Arrow C++ from 7.1 to 9 [arrow]

2025-07-07 Thread via GitHub
amoeba opened a new issue, #47025: URL: https://github.com/apache/arrow/issues/47025 ### Describe the enhancement requested https://github.com/apache/arrow/pull/46813 introduced a requirement on gcc>=9 due to triggering a bug in gcc<9 related to lambda capture expressions. Since ther

Re: [I] [Docs][Python][C++] Asof join documentation is wrong/incomplete [arrow]

2025-07-07 Thread via GitHub
zanmato1984 closed issue #46897: [Docs][Python][C++] Asof join documentation is wrong/incomplete URL: https://github.com/apache/arrow/issues/46897

Re: [I] [CI][C++] test-r-linux-sanitizers is failing [arrow]

2025-07-07 Thread via GitHub
kou closed issue #46995: [CI][C++] test-r-linux-sanitizers is failing URL: https://github.com/apache/arrow/issues/46995

Re: [I] python: adbc_driver_manager fails to build under conda on macOS [arrow-adbc]

2025-07-07 Thread via GitHub
lidavidm closed issue #3080: python: adbc_driver_manager fails to build under conda on macOS URL: https://github.com/apache/arrow-adbc/issues/3080

Re: [I] Avro Adapter - Dictionary Encoding [arrow-java]

2025-07-07 Thread via GitHub
lidavidm closed issue #731: Avro Adapter - Dictionary Encoding URL: https://github.com/apache/arrow-java/issues/731

[I] Arrow TypeInferrer fails on maps with non-string or binary keys [arrow]

2025-07-07 Thread via GitHub
alexeykudinkin opened a new issue, #47023: URL: https://github.com/apache/arrow/issues/47023 ### Describe the bug, including details regarding any error messages, version, and platform. Currently, `TypeInferrer` fails with the following exception when you're trying to use keys with n

Re: [I] Integer overflow in the ExportedAllocationOwner [arrow]

2025-07-07 Thread via GitHub
CurtHagenlocher closed issue #47009: Integer overflow in the ExportedAllocationOwner URL: https://github.com/apache/arrow/issues/47009

Re: [I] [C++][Parquet] Used-after-poison when PlainDecoder::decodeArrow [arrow]

2025-07-07 Thread via GitHub
amoeba closed issue #46988: [C++][Parquet] Used-after-poison when PlainDecoder::decodeArrow URL: https://github.com/apache/arrow/issues/46988

[I] Unable to import arrow table to pandas if it has categorical columns with index types of unsigned ints [arrow]

2025-07-07 Thread via GitHub
dweih opened a new issue, #47022: URL: https://github.com/apache/arrow/issues/47022 ### Describe the bug, including details regarding any error messages, version, and platform. Our code primarily uses polars but external tools use pandas, and when we use them to import parquet files

[I] Avro adapter - Read and write Avro container files [arrow-java]

2025-07-07 Thread via GitHub
martin-traverse opened a new issue, #794: URL: https://github.com/apache/arrow-java/issues/794 ### Describe the enhancement requested Part 4 in the Avro series, following on from #731. This will allow reading and writing whole files in the Avro container format as a series of batches.

Re: [I] Support cast float16 array to float32/float64 [arrow-go]

2025-07-07 Thread via GitHub
zeroshade closed issue #424: Support cast float16 array to float32/float64 URL: https://github.com/apache/arrow-go/issues/424

Re: [I] [MATLAB] Add tests verifying `arrow.array.Array.fromMATLAB()` throws an exception if given an array with the wrong type. [arrow]

2025-07-07 Thread via GitHub
sgilmore10 closed issue #35644: [MATLAB] Add tests verifying `arrow.array.Array.fromMATLAB()` throws an exception if given an array with the wrong type. URL: https://github.com/apache/arrow/issues/35644

[I] [C++][FlightSQL][ODBC][SQLBindCol] support returning values for indicator pointer with null data pointer [arrow]

2025-07-07 Thread via GitHub
alinaliBQ opened a new issue, #47021: URL: https://github.com/apache/arrow/issues/47021 ### Describe the enhancement requested Current behavior: Flight SQL ODBC doesn't return data length inside indicator pointer if data pointer is null when [`SQLBindCol`](https://learn.microsoft.com

Re: [I] [MATLAB] Add a common `arrow.tabular.Tabular` MATLAB interface [arrow]

2025-07-07 Thread via GitHub
sgilmore10 closed issue #38214: [MATLAB] Add a common `arrow.tabular.Tabular` MATLAB interface URL: https://github.com/apache/arrow/issues/38214

[I] [CI][Packaging] Almalinux 10 nightly job fails Parsing armored OpenPGP packet(s) [arrow]

2025-07-07 Thread via GitHub
raulcd opened a new issue, #47018: URL: https://github.com/apache/arrow/issues/47018 ### Describe the bug, including details regarding any error messages, version, and platform. The [Almalinux 10 nightly job](https://github.com/ursacomputing/crossbow/actions/runs/16112118570/job/4545

[I] [C++][FlightSQL] Fix negative timestamps to date types [arrow]

2025-07-07 Thread via GitHub
Antropovi opened a new issue, #47016: URL: https://github.com/apache/arrow/issues/47016 ### Describe the bug, including details regarding any error messages, version, and platform. Dates before 1970 are not shown correctly when using Flight ODBC in Excel on Windows. ### Compo

[I] [CI][C++] Crashes on AMD64 Conda C++ AVX2 [arrow]

2025-07-07 Thread via GitHub
pitrou opened a new issue, #47015: URL: https://github.com/apache/arrow/issues/47015 ### Describe the bug, including details regarding any error messages, version, and platform. On various PRs, crashes [have appeared](https://github.com/apache/arrow/actions/runs/16118470582/job/45477

[I] [C++][Parquet] RecordReader does not correctly reserve memory for BYTE_ARRAY and FLBA [arrow]

2025-07-07 Thread via GitHub
pitrou opened a new issue, #47012: URL: https://github.com/apache/arrow/issues/47012 ### Describe the enhancement requested When reading a Parquet leaf column as Arrow, we [presize the Arrow builder](https://github.com/apache/arrow/blob/a0cc2d8ed35dce7ee6c3e7cbcc4867216a9ef16f/cpp/src

[I] ci: add musl builds to the nightly-verify workflow [arrow-adbc]

2025-07-07 Thread via GitHub
eitsupi opened a new issue, #3107: URL: https://github.com/apache/arrow-adbc/issues/3107 ### What feature or improvement would you like to see? A follow-up to #3105. Currently, builds for the musl target are not tested in CI, and a bug was overlooked. It could be tested by adding

[I] Rust: Split adbc_core into several crates [arrow-adbc]

2025-07-07 Thread via GitHub
eitsupi opened a new issue, #3106: URL: https://github.com/apache/arrow-adbc/issues/3106 ### What feature or improvement would you like to see? As pointed out by @lidavidm[^1], splitting the adbc_core crate seems worthwhile for a better separation of concerns. [^1]: https://gi

[I] [Python] Make CSV support in PyArrow configurable [arrow]

2025-07-07 Thread via GitHub
AlenkaF opened a new issue, #47010: URL: https://github.com/apache/arrow/issues/47010 ### Describe the enhancement requested We could make CSV support in PyArrow configurable (e.g. with a `PYARROW_WITH_CSV` flag), and skip building the CSV module if it’s not available in the Arrow C+

Re: [I] [Python] PyArrow fails compiling without CSV enabled [arrow]

2025-07-07 Thread via GitHub
AlenkaF closed issue #46946: [Python] PyArrow fails compiling without CSV enabled URL: https://github.com/apache/arrow/issues/46946

[I] Integer overflow in the ExportedAllocationOwner [arrow]

2025-07-07 Thread via GitHub
marcin-krystianc opened a new issue, #47009: URL: https://github.com/apache/arrow/issues/47009 ### Describe the bug, including details regarding any error messages, version, and platform. Writing a large parquet file (many rows and columns, large row group size) can result in an exce

Re: [I] Create an announce GitHub Discussions on release [arrow-js]

2025-07-07 Thread via GitHub
raulcd closed issue #195: Create an announce GitHub Discussions on release URL: https://github.com/apache/arrow-js/issues/195

Re: [I] Create an announce GitHub Discussions on release [arrow-swift]

2025-07-07 Thread via GitHub
raulcd closed issue #65: Create an announce GitHub Discussions on release URL: https://github.com/apache/arrow-swift/issues/65

Re: [I] [CI][Dev] Fix shellcheck errors in the ci/scripts/install_sccache.sh [arrow]

2025-07-07 Thread via GitHub
kou closed issue #46909: [CI][Dev] Fix shellcheck errors in the ci/scripts/install_sccache.sh URL: https://github.com/apache/arrow/issues/46909

Re: [I] Generate release announce e-mail draft [arrow-swift]

2025-07-07 Thread via GitHub
raulcd closed issue #66: Generate release announce e-mail draft URL: https://github.com/apache/arrow-swift/issues/66

Re: [I] [Python] Severe performance regression in isin() filter After pyarrow v18 [arrow]

2025-07-07 Thread via GitHub
raulcd closed issue #46777: [Python] Severe performance regression in isin() filter After pyarrow v18 URL: https://github.com/apache/arrow/issues/46777