[GitHub] [arrow] assignUser closed issue #29741: [C++][Docs][Parquet] Trouble installing on Cent OS 7

2023-02-10 Thread via GitHub
assignUser closed issue #29741: [C++][Docs][Parquet] Trouble installing on Cent OS 7 URL: https://github.com/apache/arrow/issues/29741 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific co

[GitHub] [arrow] assignUser closed issue #18865: [C++][Build] Cannot build with Parquet/Thrift support on CentOS 7

2023-02-10 Thread via GitHub
assignUser closed issue #18865: [C++][Build] Cannot build with Parquet/Thrift support on CentOS 7 URL: https://github.com/apache/arrow/issues/18865

[GitHub] [arrow] wgtmac opened a new issue, #34139: [C++][Parquet] Ignore corrupted or invalid statistics

2023-02-10 Thread via GitHub
wgtmac opened a new issue, #34139: URL: https://github.com/apache/arrow/issues/34139 ### Describe the bug, including details regarding any error messages, version, and platform. https://github.com/apache/arrow/pull/34112 fixes reading from stats.min_value and stats.max_value where ap

[GitHub] [arrow] wgtmac opened a new issue, #34138: [C++][Parquet] Fix parsing stats from min_value/max_value

2023-02-10 Thread via GitHub
wgtmac opened a new issue, #34138: URL: https://github.com/apache/arrow/issues/34138 ### Describe the bug, including details regarding any error messages, version, and platform. The code below does not check and read from stats.min_value/max_value. If reading from a parquet file wher

[GitHub] [arrow] westonpace closed issue #34059: [C++] Create a fetch node based on a batch index property

2023-02-10 Thread via GitHub
westonpace closed issue #34059: [C++] Create a fetch node based on a batch index property URL: https://github.com/apache/arrow/issues/34059

[GitHub] [arrow] westonpace opened a new issue, #34136: [C++] Add the concept of "ordering" to an exec node, reject non-sensible plans

2023-02-10 Thread via GitHub
westonpace opened a new issue, #34136: URL: https://github.com/apache/arrow/issues/34136 ### Describe the enhancement requested Every node has an "ordering" which describes what the batch index of the batches produced by that node corresponds to. Source nodes will generally hav

[GitHub] [arrow] westonpace opened a new issue, #34135: [C++] Parallel asof join node

2023-02-10 Thread via GitHub
westonpace opened a new issue, #34135: URL: https://github.com/apache/arrow/issues/34135 ### Describe the enhancement requested Now that we are starting to introduce formal ordering we can create an AsofJoinNode variant that works even if use_threads is true. A rough overview of the

[GitHub] [arrow] assignUser opened a new issue, #34132: [Dev] Add script to keep artifactory mirror of bundled dependencies in sync

2023-02-10 Thread via GitHub
assignUser opened a new issue, #34132: URL: https://github.com/apache/arrow/issues/34132 ### Describe the enhancement requested At this point we have to manually get the dependencies and upload them to jfrog. There should be a script to automate this. ### Component(s) De

[GitHub] [arrow] assignUser opened a new issue, #34131: [CI] Use artifactory mirror for bundled dependencies in CI job

2023-02-10 Thread via GitHub
assignUser opened a new issue, #34131: URL: https://github.com/apache/arrow/issues/34131 ### Describe the enhancement requested We should use the bundled dependencies from the artifactory mirror in at least one nightly build so we see when there are issues, e.g. missing new versions.

[GitHub] [arrow] assignUser opened a new issue, #34130: [Dev][C++] Don't use GitHub archive files with checksums

2023-02-10 Thread via GitHub
assignUser opened a new issue, #34130: URL: https://github.com/apache/arrow/issues/34130 ### Describe the bug, including details regarding any error messages, version, and platform. Recently it became apparent that the often-used GitHub archive links are not hash-stable https://gith

[GitHub] [arrow] assignUser opened a new issue, #34129: [Dev][Release] Add all bundled dependencies to artifactory mirror

2023-02-10 Thread via GitHub
assignUser opened a new issue, #34129: URL: https://github.com/apache/arrow/issues/34129 ### Describe the enhancement requested While updating the artifactory mirror of bundled dependencies I noticed that a bunch of new dependencies do not have the cmake logic that allows them to fal

[GitHub] [arrow] westonpace closed issue #33899: [C++] Add `NamedTapRel` relation as a Substrait extension

2023-02-10 Thread via GitHub
westonpace closed issue #33899: [C++] Add `NamedTapRel` relation as a Substrait extension URL: https://github.com/apache/arrow/issues/33899

[GitHub] [arrow] assignUser closed issue #29773: link error on ubuntu

2023-02-10 Thread via GitHub
assignUser closed issue #29773: link error on ubuntu URL: https://github.com/apache/arrow/issues/29773

[GitHub] [arrow] paleolimbot closed issue #33904: [R] Using s3_bucket with non-AWS S3-compatible storage is confusing

2023-02-10 Thread via GitHub
paleolimbot closed issue #33904: [R] Using s3_bucket with non-AWS S3-compatible storage is confusing URL: https://github.com/apache/arrow/issues/33904

[GitHub] [arrow] wjones127 closed issue #34086: [C++][Parquet] Parquet V2 page headers have incorrect number of rows

2023-02-10 Thread via GitHub
wjones127 closed issue #34086: [C++][Parquet] Parquet V2 page headers have incorrect number of rows URL: https://github.com/apache/arrow/issues/34086

[GitHub] [arrow] wjones127 closed issue #33115: [C++] Parquet support read page with crc32 checking

2023-02-10 Thread via GitHub
wjones127 closed issue #33115: [C++] Parquet support read page with crc32 checking URL: https://github.com/apache/arrow/issues/33115

[GitHub] [arrow] westonpace opened a new issue, #34123: [Python] Expose nested function registries

2023-02-10 Thread via GitHub
westonpace opened a new issue, #34123: URL: https://github.com/apache/arrow/issues/34123 ### Describe the enhancement requested We have the ability to create nested function registries in C++. This is useful to allow UDFs to be scoped to a query. We should expose this feature in p

[GitHub] [arrow] westonpace opened a new issue, #34122: [C++] Use special URI to allow calling UDFs via Substrait

2023-02-10 Thread via GitHub
westonpace opened a new issue, #34122: URL: https://github.com/apache/arrow/issues/34122 ### Describe the enhancement requested We should key on a special URI (e.g. https://apache.org/arrow/udf) to recognize that a Substrait call is actually looking for an Acero registered UDF. Then

[GitHub] [arrow] Kodiologist opened a new issue, #34121: Allow converting strings to dates without using datetimes as an intermediate step

2023-02-10 Thread via GitHub
Kodiologist opened a new issue, #34121: URL: https://github.com/apache/arrow/issues/34121 ### Describe the enhancement requested ``` import pyarrow as pa, pyarrow.compute as C x = pa.array(['2008-01-01', '2008-01-02', '2008-01-03']) ``` This works fine: ```

[GitHub] [arrow] nbro10 opened a new issue, #34120: Cannot install pyarrow in MacOS Monterey (12.5.1) with M1 in a Python 3.7.13

2023-02-10 Thread via GitHub
nbro10 opened a new issue, #34120: URL: https://github.com/apache/arrow/issues/34120 ### Describe the usage question you have. Please include as many useful details as possible. I created and activated a Python 3.7.13 virtual environment. I am managing my Python versions with `p

[GitHub] [arrow] DanTm99 opened a new issue, #34119: Add [] operator to Schema

2023-02-10 Thread via GitHub
DanTm99 opened a new issue, #34119: URL: https://github.com/apache/arrow/issues/34119 ### Describe the enhancement requested Add the `[]` operator to `Schema` which calls `GetFieldByIndex` or `GetFieldByName`. ### Component(s) C#

[GitHub] [arrow] wence- opened a new issue, #34118: Allow configuration of size of AWS event loop thread pool

2023-02-10 Thread via GitHub
wence- opened a new issue, #34118: URL: https://github.com/apache/arrow/issues/34118 ### Describe the enhancement requested When calling `DoInitializeS3`, arrow initialises the AWS API, which by default creates a thread pool for the background AWS event loop that uses one thr

[GitHub] [arrow] pulkomandy closed issue #34093: [C++] Cross compiling in Yocto is not working

2023-02-10 Thread via GitHub
pulkomandy closed issue #34093: [C++] Cross compiling in Yocto is not working URL: https://github.com/apache/arrow/issues/34093

[GitHub] [arrow] ktf opened a new issue, #34117: Extend cast operators for int8

2023-02-10 Thread via GitHub
ktf opened a new issue, #34117: URL: https://github.com/apache/arrow/issues/34117 ### Describe the enhancement requested In our analysis, we have the need to produce int8 data which can then optionally be processed as int32 or float32. ### Component(s) C++ - Gandiva