[I] [Python] Casting list elements to non-nullable doesn't check for nulls [arrow]

2024-07-03 Thread via GitHub
adamreeve opened a new issue, #43146: URL: https://github.com/apache/arrow/issues/43146 ### Describe the bug, including details regarding any error messages, version, and platform. PyArrow correctly raises an error if trying to cast an array containing nulls to a non-nullable field:

[I] [Python] PyArrow allows creating non-nullable columns containing nulls [arrow]

2024-07-03 Thread via GitHub
adamreeve opened a new issue, #43145: URL: https://github.com/apache/arrow/issues/43145 ### Describe the bug, including details regarding any error messages, version, and platform. Reproduced using PyArrow 16.1.0 on Fedora 39 Linux. The null values are still displayed as null when pr

Re: [I] [C++][Parquet] Reading corrupted encrypted Parquet files can cause a segfault [arrow]

2024-07-03 Thread via GitHub
mapleFU closed issue #43070: [C++][Parquet] Reading corrupted encrypted Parquet files can cause a segfault URL: https://github.com/apache/arrow/issues/43070 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[I] [C++][Parquet] Minor: Default initialize some variables in Parquet metadata [arrow]

2024-07-03 Thread via GitHub
mapleFU opened a new issue, #43143: URL: https://github.com/apache/arrow/issues/43143 ### Describe the enhancement requested Some variables in Parquet metadata are not initialized. This patch initializes them. ### Component(s) C++, Parquet

Re: [I] feat(c/driver/postgresql): Add support for bulk ingestion of list types [arrow-adbc]

2024-07-03 Thread via GitHub
lidavidm closed issue #1882: feat(c/driver/postgresql): Add support for bulk ingestion of list types URL: https://github.com/apache/arrow-adbc/issues/1882

[I] [C++][Parquet] Refactor Encryptor API to use arrow::util::span instead of raw pointers [arrow]

2024-07-03 Thread via GitHub
adamreeve opened a new issue, #43142: URL: https://github.com/apache/arrow/issues/43142 ### Describe the enhancement requested This change was made to the Decryptor API in #43071; we should do the same for the Encryptor API for consistency and to make the code more maintainable.

[I] Fix CI for Python wheels [arrow-adbc]

2024-07-03 Thread via GitHub
WillAyd opened a new issue, #1963: URL: https://github.com/apache/arrow-adbc/issues/1963 ### What happened? It looks like CentOS 7 (on which manylinux2014 wheels are built) reached EOL on June 30 https://github.com/pypa/manylinux I think at this point we need to start

[I] [C++][Parquet] BloomFilter writer: Estimate the bloom filter quality [arrow]

2024-07-03 Thread via GitHub
mapleFU opened a new issue, #43138: URL: https://github.com/apache/arrow/issues/43138 ### Describe the enhancement requested The patch is working on bloomfilter https://github.com/apache/arrow/pull/37400 Currently, the bloom filter user should explicitly know the filter fpp an

Re: [I] [CI][Packaging] Could not resolve host: mirrorlist.centos.org; Unknown error on wheel almalinux 2014 jobs [arrow]

2024-07-03 Thread via GitHub
raulcd closed issue #43119: [CI][Packaging] Could not resolve host: mirrorlist.centos.org; Unknown error on wheel almalinux 2014 jobs URL: https://github.com/apache/arrow/issues/43119

[I] [R] Change the binary type mapping to `blob::blob` [arrow]

2024-07-03 Thread via GitHub
eitsupi opened a new issue, #43135: URL: https://github.com/apache/arrow/issues/43135 ### Describe the enhancement requested Currently the mapping is done by defining our own class, `arrow_binary`. Why not use the `blob` class, as nanoarrow does? ### Component(s) R

Re: [I] [CI][Packaging][RPM][CentOS] Could not resolve host: mirrorlist.centos.org [arrow]

2024-07-03 Thread via GitHub
raulcd closed issue #43122: [CI][Packaging][RPM][CentOS] Could not resolve host: mirrorlist.centos.org URL: https://github.com/apache/arrow/issues/43122

[I] [CI][C++] MinGW jobs failed with GTest::gmock_main related error [arrow]

2024-07-03 Thread via GitHub
kou opened a new issue, #43134: URL: https://github.com/apache/arrow/issues/43134 ### Describe the bug, including details regarding any error messages, version, and platform. https://github.com/apache/arrow/actions/runs/9775229516/job/26985190635?pr=43133#step:7:1001 ```text

[I] [CI] Rat check with pre-commit doesn't work [arrow]

2024-07-03 Thread via GitHub
kou opened a new issue, #43132: URL: https://github.com/apache/arrow/issues/43132 ### Describe the bug, including details regarding any error messages, version, and platform. Our pre-commit configuration uses `.tar` as RAT input but it seems that ### Component(s) Contin

[I] [CI] Attach lint failures to PR diff view [arrow]

2024-07-03 Thread via GitHub
kou opened a new issue, #43131: URL: https://github.com/apache/arrow/issues/43131 ### Describe the enhancement requested `pre-commit run --show-diff-on-failure` shows lint failure diffs. We can attach the failure information to PR diff view by converting it to "error" workflow comman

[I] [C++][ArrowFlight] Crash due to UCS thread mode [arrow]

2024-07-03 Thread via GitHub
amirgon opened a new issue, #43130: URL: https://github.com/apache/arrow/issues/43130 ### Describe the bug, including details regarding any error messages, version, and platform. ArrowFlight UCX transport sets `UCS_THREAD_MODE_SERIALIZED` mode: https://github.com/apache/arrow/

[I] [C++][Compute] Row table is consuming more memory than expected [arrow]

2024-07-03 Thread via GitHub
zanmato1984 opened a new issue, #43129: URL: https://github.com/apache/arrow/issues/43129 ### Describe the enhancement requested While investigating #43116, I found that the row table consumed more memory than expected. For example, the test case in question in #41336 encodes data a

[I] Not able to import pyarrow.parquet [arrow]

2024-07-03 Thread via GitHub
Akhil-77 opened a new issue, #43126: URL: https://github.com/apache/arrow/issues/43126 ### Describe the usage question you have. Please include as many useful details as possible. I am trying to load the datasets library from Hugging Face, but it gives me this error. > Imp