[GitHub] [arrow] mapleFU opened a new issue, #34335: [C++][Parquet] Improve the performance for Decoding DELTA_LENGTH_BYTE_ARRAY

2023-02-24 Thread via GitHub
mapleFU opened a new issue, #34335: URL: https://github.com/apache/arrow/issues/34335 ### Describe the enhancement requested To be honest, the previous logic is too slow. It's 20 times slower than PLAIN for ByteArray. I'll do some optimizations on it. ### Component(s) C++

[GitHub] [arrow] adamkennedy opened a new issue, #34338: [Java] BaseAllocator.DEBUG should be opt-in as HistoricalLog is immensely expensive and breaks profiling

2023-02-24 Thread via GitHub
adamkennedy opened a new issue, #34338: URL: https://github.com/apache/arrow/issues/34338 ### Describe the bug, including details regarding any error messages, version, and platform. BaseAllocator.DEBUG is currently enabled automatically any time assertions are enabled via -ea which

[GitHub] [arrow] eitsupi opened a new issue, #34339: [R] Add `skip_rows_after_names` option to `read_csv`'s options

2023-02-24 Thread via GitHub
eitsupi opened a new issue, #34339: URL: https://github.com/apache/arrow/issues/34339 ### Describe the bug, including details regarding any error messages, version, and platform. Add an option to skip rows after the column names. This was implemented in C++ and Python (#28410, #102

[GitHub] [arrow] abcbarryn opened a new issue, #34341: Arrow version 5 or later fails to compile/link with gcc version 7 (or earlier)

2023-02-24 Thread via GitHub
abcbarryn opened a new issue, #34341: URL: https://github.com/apache/arrow/issues/34341 ### Describe the bug, including details regarding any error messages, version, and platform. When linking the shared library with gcc 7 I get several errors like this: ``_ZN5arrow6ResultISt

[GitHub] [arrow] kou closed issue #34329: [C++] Memory access out of bounds in arrow-1.0.0

2023-02-24 Thread via GitHub
kou closed issue #34329: [C++] Memory access out of bounds in arrow-1.0.0 URL: https://github.com/apache/arrow/issues/34329 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To u

[GitHub] [arrow] kou closed issue #15209: [C++][Gandiva] Add abs function

2023-02-24 Thread via GitHub
kou closed issue #15209: [C++][Gandiva] Add abs function URL: https://github.com/apache/arrow/issues/15209

[GitHub] [arrow-adbc] adamkennedy opened a new issue, #477: [Java] driver/jdbc: No way to provide an external BufferAllocator to JdbcDriver or JdbcDatabase instances

2023-02-24 Thread via GitHub
adamkennedy opened a new issue, #477: URL: https://github.com/apache/arrow-adbc/issues/477 JdbcDriver is an enum singleton that assigns its own RootAllocator that is never subsequently destroyed. This is a problem both because there is no ability to control or share memory pools bet

[GitHub] [arrow] westonpace opened a new issue, #34344: [C++] Pass function registry to dataset operations

2023-02-24 Thread via GitHub
westonpace opened a new issue, #34344: URL: https://github.com/apache/arrow/issues/34344 ### Describe the enhancement requested The datasets API does various compute operations on expressions. All of these operations today use the default function registry. This could prevent us fr

[GitHub] [arrow] westonpace opened a new issue, #34346: [C++] Allow the CSV reader to read zero columns

2023-02-24 Thread via GitHub
westonpace opened a new issue, #34346: URL: https://github.com/apache/arrow/issues/34346 ### Describe the enhancement requested There are times we may want to scan a CSV file and just count the lines. One example is when we want to count the # of rows in a dataset. Another ex

[GitHub] [arrow] westonpace closed issue #15059: [C++] The new scan node should use values from fragment guarantees instead of loading them from disk

2023-02-24 Thread via GitHub
westonpace closed issue #15059: [C++] The new scan node should use values from fragment guarantees instead of loading them from disk URL: https://github.com/apache/arrow/issues/15059

[GitHub] [arrow] westonpace opened a new issue, #34347: [C++] Add an end-to-end fuzz test for the new scan node

2023-02-24 Thread via GitHub
westonpace opened a new issue, #34347: URL: https://github.com/apache/arrow/issues/34347 ### Describe the enhancement requested The scanner has quite a few unit tests. However, I think it would benefit from some more robust end-to-end testing as well. As further justification,

[GitHub] [arrow] mapleFU opened a new issue, #34351: [C++][Parquet] Statistics Merge ignore setting `has` flag

2023-02-25 Thread via GitHub
mapleFU opened a new issue, #34351: URL: https://github.com/apache/arrow/issues/34351 ### Describe the bug, including details regarding any error messages, version, and platform. In `src/parquet/statistics.cc`: ```c++ void Merge(const TypedStatistics& other) override {

[GitHub] [arrow] NoahFournier opened a new issue, #34352: Add support for SingularOrList expressions from Substrait

2023-02-25 Thread via GitHub
NoahFournier opened a new issue, #34352: URL: https://github.com/apache/arrow/issues/34352 ### Describe the enhancement requested There's presently no way to express `IN`-type queries from a Substrait plan, as there's no support for the Substrait OrList equality expression. It'd be g

[GitHub] [arrow] rachtsingh opened a new issue, #34353: Add support for string + string add_checked

2023-02-25 Thread via GitHub
rachtsingh opened a new issue, #34353: URL: https://github.com/apache/arrow/issues/34353 ### Describe the enhancement requested Please consider adding (vectorized) support for adding strings to strings: ```python In [1]: df = pd.DataFrame({'a': ['b'], 'c': ['d']}).astype(dtyp

[GitHub] [arrow-flight-sql-postgresql] kou closed issue #11: Add support for returning integer

2023-02-25 Thread via GitHub
kou closed issue #11: Add support for returning integer URL: https://github.com/apache/arrow-flight-sql-postgresql/issues/11

[GitHub] [arrow-flight-sql-postgresql] kou opened a new issue, #18: Add support for authentication

2023-02-25 Thread via GitHub
kou opened a new issue, #18: URL: https://github.com/apache/arrow-flight-sql-postgresql/issues/18 We need to implement authentication feature by ourselves because PostgreSQL's authentication related API is based on the PostgreSQL protocol. See also: https://git.postgresql.org/gitweb/

[GitHub] [arrow] Ben-Epstein opened a new issue, #34354: `to_numpy().tolist()` is significantly faster than `.tolist()`

2023-02-25 Thread via GitHub
Ben-Epstein opened a new issue, #34354: URL: https://github.com/apache/arrow/issues/34354 ### Describe the bug, including details regarding any error messages, version, and platform. pyarrow version `'9.0.0'` numpy version `1.23.5` Code to reproduce ``` import pyarrow

[GitHub] [arrow-flight-sql-postgresql] kou closed issue #12: Add support for `SELECT FROM`

2023-02-25 Thread via GitHub
kou closed issue #12: Add support for `SELECT FROM` URL: https://github.com/apache/arrow-flight-sql-postgresql/issues/12

[GitHub] [arrow] nvartolomei closed issue #13927: Why signed integer type is used for array sizes rather than unsigned?

2023-02-26 Thread via GitHub
nvartolomei closed issue #13927: Why signed integer type is used for array sizes rather than unsigned? URL: https://github.com/apache/arrow/issues/13927

[GitHub] [arrow-flight-sql-postgresql] kou opened a new issue, #23: Measure performance with large `integer` only table

2023-02-26 Thread via GitHub
kou opened a new issue, #23: URL: https://github.com/apache/arrow-flight-sql-postgresql/issues/23 We need numbers to evaluate whether the current approach is reasonable.

[GitHub] [arrow] AlenkaF opened a new issue, #34359: [Python] Add select method to pyarrow.RecordBatch

2023-02-27 Thread via GitHub
AlenkaF opened a new issue, #34359: URL: https://github.com/apache/arrow/issues/34359 ### Describe the enhancement requested There is a `select` method defined for `pa.Table` https://github.com/apache/arrow/blob/db60be2bc2e1fc9aec43bb632be894bb8da6c77b/python/pyarrow/table.pxi#

[GitHub] [arrow] thisisnic closed issue #32512: [R][Doc] minor error in Linux installation documentation ('conda' option) for R on CRAN

2023-02-27 Thread via GitHub
thisisnic closed issue #32512: [R][Doc] minor error in Linux installation documentation ('conda' option) for R on CRAN URL: https://github.com/apache/arrow/issues/32512

[GitHub] [arrow] jorisvandenbossche closed issue #34359: [Python] Add select method to pyarrow.RecordBatch

2023-02-27 Thread via GitHub
jorisvandenbossche closed issue #34359: [Python] Add select method to pyarrow.RecordBatch URL: https://github.com/apache/arrow/issues/34359

[GitHub] [arrow] jorisvandenbossche opened a new issue, #34361: [C++] Handling "logical" nulls: add GetLogicalNullCount / update IsNull() to check for logical null

2023-02-27 Thread via GitHub
jorisvandenbossche opened a new issue, #34361: URL: https://github.com/apache/arrow/issues/34361 There are some data types where the nulls are not stored "physically" using a validity bitmap on the parent ArrayData, but through nulls in child data: - UnionArrays don't have a top-level

[GitHub] [arrow] thisisnic closed issue #34305: [C++] Error when running Substrait plan using `extract` function: "argument was not an enum"

2023-02-27 Thread via GitHub
thisisnic closed issue #34305: [C++] Error when running Substrait plan using `extract` function: "argument was not an enum" URL: https://github.com/apache/arrow/issues/34305

[GitHub] [arrow] thisisnic closed issue #34310: [C++] Executing Substrait plan containing round function causes segfault or error

2023-02-27 Thread via GitHub
thisisnic closed issue #34310: [C++] Executing Substrait plan containing round function causes segfault or error URL: https://github.com/apache/arrow/issues/34310

[GitHub] [arrow] westonpace closed issue #34280: [C++][Python] Clarify meaning of "row_group_size" and change default to something more reasonable

2023-02-27 Thread via GitHub
westonpace closed issue #34280: [C++][Python] Clarify meaning of "row_group_size" and change default to something more reasonable URL: https://github.com/apache/arrow/issues/34280

[GitHub] [arrow] legout opened a new issue, #34363: Writing to cloudflare r2 fails for multipart upload

2023-02-27 Thread via GitHub
legout opened a new issue, #34363: URL: https://github.com/apache/arrow/issues/34363 ### Describe the bug, including details regarding any error messages, version, and platform. When I try to write a pyarrow.table to the Cloudflare R2 object store, I get an error when files are larger th

[GitHub] [arrow-adbc] lidavidm closed issue #471: [Java] JDBC driver fails on a composite primary key

2023-02-27 Thread via GitHub
lidavidm closed issue #471: [Java] JDBC driver fails on a composite primary key URL: https://github.com/apache/arrow-adbc/issues/471

[GitHub] [arrow] lidavidm closed issue #32954: [C++][Java][FlightRPC] Get rid of FlightTestUtil.getStartedServer etc.

2023-02-27 Thread via GitHub
lidavidm closed issue #32954: [C++][Java][FlightRPC] Get rid of FlightTestUtil.getStartedServer etc. URL: https://github.com/apache/arrow/issues/32954

[GitHub] [arrow] lidavidm opened a new issue, #34364: [Java] Replace checkstyle with google-java-format or another formatter?

2023-02-27 Thread via GitHub
lidavidm opened a new issue, #34364: URL: https://github.com/apache/arrow/issues/34364 ### Describe the enhancement requested checkstyle is unfortunately only a linter and cannot auto-format. google-java-format (or possibly some other plugin like clang-format) can be run to check and

[GitHub] [arrow] lidavidm closed issue #33953: [Java] JDBC driver does not send custom headers on DoGet

2023-02-27 Thread via GitHub
lidavidm closed issue #33953: [Java] JDBC driver does not send custom headers on DoGet URL: https://github.com/apache/arrow/issues/33953

[GitHub] [arrow] lidavidm closed issue #33839: [Java][FlightRPC] FlightCallHeaders added to FlightClient::getStream aren't attached to call

2023-02-27 Thread via GitHub
lidavidm closed issue #33839: [Java][FlightRPC] FlightCallHeaders added to FlightClient::getStream aren't attached to call URL: https://github.com/apache/arrow/issues/33839

[GitHub] [arrow] thisisnic opened a new issue, #34365: [C++] Substrait cast expression fails on input types other than field reference

2023-02-27 Thread via GitHub
thisisnic opened a new issue, #34365: URL: https://github.com/apache/arrow/issues/34365 ### Describe the bug, including details regarding any error messages, version, and platform. The Substrait cast function has an `input` argument which takes an `Expression`; however, the current i

[GitHub] [arrow-testing] lidavidm commented on pull request #85: GH-15203: [C++][Java] Add files with uncompressible buffers

2023-02-27 Thread via GitHub
lidavidm commented on PR #85: URL: https://github.com/apache/arrow-testing/pull/85#issuecomment-1446539126 @wjones127 mind taking a glance here?

[GitHub] [arrow] wjones127 closed issue #33652: [C++][Parquet] Interface total_bytes_written is Confusing

2023-02-27 Thread via GitHub
wjones127 closed issue #33652: [C++][Parquet] Interface total_bytes_written is Confusing URL: https://github.com/apache/arrow/issues/33652

[GitHub] [arrow] DavisVaughan opened a new issue, #34366: Don't `getFromNamespace()` the `dplyr:::check_name()` helper

2023-02-27 Thread via GitHub
DavisVaughan opened a new issue, #34366: URL: https://github.com/apache/arrow/issues/34366 ### Describe the bug, including details regarding any error messages, version, and platform. Right here arrow grabs the `dplyr:::check_name()` helper: https://github.com/apache/arrow/blob/96

[GitHub] [arrow] tustvold opened a new issue, #34367: Java Integration Test Failures

2023-02-27 Thread via GitHub
tustvold opened a new issue, #34367: URL: https://github.com/apache/arrow/issues/34367 ### Describe the bug, including details regarding any error messages, version, and platform. We are seeing failures building Java within the archery integration test framework https://github

[GitHub] [arrow-testing] lidavidm merged pull request #85: GH-15203: [C++][Java] Add files with uncompressible buffers

2023-02-27 Thread via GitHub
lidavidm merged PR #85: URL: https://github.com/apache/arrow-testing/pull/85

[GitHub] [arrow-testing] lidavidm commented on pull request #85: GH-15203: [C++][Java] Add files with uncompressible buffers

2023-02-27 Thread via GitHub
lidavidm commented on PR #85: URL: https://github.com/apache/arrow-testing/pull/85#issuecomment-1446705143 Thanks!

[GitHub] [arrow] sl2902 opened a new issue, #34370: [Python]pyarrow.lib.ArrowNotImplementedError

2023-02-27 Thread via GitHub
sl2902 opened a new issue, #34370: URL: https://github.com/apache/arrow/issues/34370 ### Describe the bug, including details regarding any error messages, version, and platform. I am building a custom dataset using HuggingFace Datasets. However, I run into an issue with the schema.

[GitHub] [arrow] lidavidm closed issue #34367: [Java] Java Integration Test Failures

2023-02-27 Thread via GitHub
lidavidm closed issue #34367: [Java] Java Integration Test Failures URL: https://github.com/apache/arrow/issues/34367

[GitHub] [arrow-adbc] tokoko opened a new issue, #479: [Java] driver/flight-sql: Add Dremio validation suite

2023-02-27 Thread via GitHub
tokoko opened a new issue, #479: URL: https://github.com/apache/arrow-adbc/issues/479 I plan to run validation suite against Dremio Flight SQL interface, but I'd like to hear some feedback on my initial thoughts. I think we need a `flight-sql-validation-dremio` maven submodule to acc

[GitHub] [arrow] thisisnic opened a new issue, #34372: [R] CRAN packaging checklist for version 11.0.0.3

2023-02-27 Thread via GitHub
thisisnic opened a new issue, #34372: URL: https://github.com/apache/arrow/issues/34372 # Packaging checklist for CRAN release For a high-level overview of the release process see the [Apache Arrow Release Management Guide](https://arrow.apache.org/docs/developers/release.html#post

[GitHub] [arrow] sl2902 closed issue #34370: [Python]pyarrow.lib.ArrowNotImplementedError

2023-02-27 Thread via GitHub
sl2902 closed issue #34370: [Python]pyarrow.lib.ArrowNotImplementedError URL: https://github.com/apache/arrow/issues/34370

[GitHub] [arrow] wjones127 closed issue #25986: [C++][Parquet] Enable external material and rotation for encryption keys

2023-02-27 Thread via GitHub
wjones127 closed issue #25986: [C++][Parquet] Enable external material and rotation for encryption keys URL: https://github.com/apache/arrow/issues/25986

[GitHub] [arrow] westonpace opened a new issue, #34374: [C++] Investigate regressions caused by changing row group size from 64Mi to 1Mi.

2023-02-27 Thread via GitHub
westonpace opened a new issue, #34374: URL: https://github.com/apache/arrow/issues/34374 ### Describe the bug, including details regarding any error messages, version, and platform. It appears that changing the default row group size from 64Mi rows to 1Mi rows had a [significant cha

[GitHub] [arrow] rachtsingh closed issue #34353: [Python] Add support for string + string add_checked

2023-02-27 Thread via GitHub
rachtsingh closed issue #34353: [Python] Add support for string + string add_checked URL: https://github.com/apache/arrow/issues/34353

[GitHub] [arrow] wgtmac opened a new issue, #34375: [C++][Parquet] Page header does not save statistics once page index is enabled

2023-02-27 Thread via GitHub
wgtmac opened a new issue, #34375: URL: https://github.com/apache/arrow/issues/34375 ### Describe the enhancement requested Once writing page index is supported, we should not save page statistics in the data page header as it is duplicated. Although page stats is disabled by

[GitHub] [arrow] code1704 opened a new issue, #34376: arrow table: how to drop duplicates?

2023-02-27 Thread via GitHub
code1704 opened a new issue, #34376: URL: https://github.com/apache/arrow/issues/34376 ### Describe the usage question you have. Please include as many useful details as possible. In pandas DataFrame there's `pandas.DataFrame.drop_duplicates`. With arrow table, how can I d

[GitHub] [arrow] chrisirhc opened a new issue, #34377: [Go] enhancement request to expose AnyValue() on Scalar

2023-02-27 Thread via GitHub
chrisirhc opened a new issue, #34377: URL: https://github.com/apache/arrow/issues/34377 ### Describe the enhancement requested I wanted to gauge interest in a method on the Scalar interface to expose the value via any/interface{} like: ```go type Scalar interface { …

[GitHub] [arrow] jorisvandenbossche closed issue #34098: [Python][Docs] Fix Dataset docstrings

2023-02-28 Thread via GitHub
jorisvandenbossche closed issue #34098: [Python][Docs] Fix Dataset docstrings URL: https://github.com/apache/arrow/issues/34098

[GitHub] [arrow] jorisvandenbossche closed issue #33926: [Python] DataFrame Interchange Protocol for pyarrow.RecordBatch

2023-02-28 Thread via GitHub
jorisvandenbossche closed issue #33926: [Python] DataFrame Interchange Protocol for pyarrow.RecordBatch URL: https://github.com/apache/arrow/issues/33926

[GitHub] [arrow] jorisvandenbossche closed issue #31937: [Python] Multiindex levels order is not preserved after a from_pandas/to_pandas

2023-02-28 Thread via GitHub
jorisvandenbossche closed issue #31937: [Python] Multiindex levels order is not preserved after a from_pandas/to_pandas URL: https://github.com/apache/arrow/issues/31937

[GitHub] [arrow] raulcd closed issue #33977: [Dev] Implement automation bot for PRs

2023-02-28 Thread via GitHub
raulcd closed issue #33977: [Dev] Implement automation bot for PRs URL: https://github.com/apache/arrow/issues/33977

[GitHub] [arrow] chenyuanxing opened a new issue, #34378: VarBinaryVector has size limit

2023-02-28 Thread via GitHub
chenyuanxing opened a new issue, #34378: URL: https://github.com/apache/arrow/issues/34378 ### Describe the bug, including details regarding any error messages, version, and platform. When I put data into a VarBinaryVector, I get: ``` java.lang.IndexOutOfBoundsException: index:

[GitHub] [arrow] chenyuanxing closed issue #34378: VarBinaryVector has size limit

2023-02-28 Thread via GitHub
chenyuanxing closed issue #34378: VarBinaryVector has size limit URL: https://github.com/apache/arrow/issues/34378

[GitHub] [arrow] kevingurney opened a new issue, #34379: [Website] Website deployment workflow (`deploy.yml`) is failing due to Node.js 18 version bump in `ubuntu-latest` GitHub Actions runner image an

2023-02-28 Thread via GitHub
kevingurney opened a new issue, #34379: URL: https://github.com/apache/arrow/issues/34379 ### Describe the bug, including details regarding any error messages, version, and platform. [A few weeks ago](https://github.com/apache/arrow-site/actions/runs/4263179843/jobs/7419585498), the

[GitHub] [arrow] lidavidm closed issue #34284: [Java] JDBC Flight SQL driver calls prepare() twice

2023-02-28 Thread via GitHub
lidavidm closed issue #34284: [Java] JDBC Flight SQL driver calls prepare() twice URL: https://github.com/apache/arrow/issues/34284

[GitHub] [arrow] AlenkaF closed issue #34376: [Python] arrow table: how to drop duplicates?

2023-02-28 Thread via GitHub
AlenkaF closed issue #34376: [Python] arrow table: how to drop duplicates? URL: https://github.com/apache/arrow/issues/34376

[GitHub] [arrow-adbc] lidavidm opened a new issue, #481: [CI] arrow-c-glib/red-arrow version needs to be pinned

2023-02-28 Thread via GitHub
lidavidm opened a new issue, #481: URL: https://github.com/apache/arrow-adbc/issues/481 ``` checking for arrow version (>= 11.0.0)... no installing 'libarrow-dev' native package... failed Failed to run '/usr/bin/sudo -p \[sudo\]\ password\ for\ \%u\ to\ install\ \:\ apt-get insta

[GitHub] [arrow] raulcd opened a new issue, #34381: [Dev] PR Workflow incorrectly tagging for committer reviews

2023-02-28 Thread via GitHub
raulcd opened a new issue, #34381: URL: https://github.com/apache/arrow/issues/34381 ### Describe the bug, including details regarding any error messages, version, and platform. Committer GitHub role is CONTRIBUTOR instead of MEMBER. Example of event payload for event triggered fro

[GitHub] [arrow] felipecrv opened a new issue, #34382: [C++] Run-end encoding and decoding: Add support for more types

2023-02-28 Thread via GitHub
felipecrv opened a new issue, #34382: URL: https://github.com/apache/arrow/issues/34382 ### Describe the enhancement requested #34195 introduces two functions `run_end_encode` and `run_end_decode`. The types supported are: - All `NumericTypes()` - `bool()` - `nul

[GitHub] [arrow] eitsupi opened a new issue, #34383: [R][Docs] Improve docs for read_csv_arrow's usage

2023-02-28 Thread via GitHub
eitsupi opened a new issue, #34383: URL: https://github.com/apache/arrow/issues/34383 ### Describe the enhancement requested Many options exist for `read_csv_arrow`, but their use is not well documented. (e.g. #33708) It would be great if we could detail how to use these by exa

[GitHub] [arrow] YoungRX closed issue #34313: [C++] CountRows() in ParquetFileFormat class is unreasonable

2023-02-28 Thread via GitHub
YoungRX closed issue #34313: [C++] CountRows() in ParquetFileFormat class is unreasonable URL: https://github.com/apache/arrow/issues/34313

[GitHub] [arrow] westonpace closed issue #34136: [C++] Add the concept of "ordering" to an exec node, reject non-sensible plans

2023-02-28 Thread via GitHub
westonpace closed issue #34136: [C++] Add the concept of "ordering" to an exec node, reject non-sensible plans URL: https://github.com/apache/arrow/issues/34136

[GitHub] [arrow] thisisnic closed issue #34339: [R] Add `skip_rows_after_names` option to `read_csv_arrow`'s options

2023-02-28 Thread via GitHub
thisisnic closed issue #34339: [R] Add `skip_rows_after_names` option to `read_csv_arrow`'s options URL: https://github.com/apache/arrow/issues/34339

[GitHub] [arrow] thisisnic closed issue #34366: Don't `getFromNamespace()` the `dplyr:::check_name()` helper

2023-02-28 Thread via GitHub
thisisnic closed issue #34366: Don't `getFromNamespace()` the `dplyr:::check_name()` helper URL: https://github.com/apache/arrow/issues/34366

[GitHub] [arrow] lidavidm opened a new issue, #34385: [Go] Read IPC files with compression enabled but uncompressed buffers

2023-02-28 Thread via GitHub
lidavidm opened a new issue, #34385: URL: https://github.com/apache/arrow/issues/34385 ### Describe the bug, including details regarding any error messages, version, and platform. Observed in #15194; an IPC file with compression may still have uncompressed buffers but Go appears not

[GitHub] [arrow] lidavidm opened a new issue, #34386: [C++] Handle URI paths for filesystems?

2023-02-28 Thread via GitHub
lidavidm opened a new issue, #34386: URL: https://github.com/apache/arrow/issues/34386 ### Describe the enhancement requested Given a URI like `s3://bucket/foo/bar.parquet`, right now, you can turn that URI into a FileSystem instance + a non-URI path. That is useful, but then the Fil

[GitHub] [arrow] westonpace opened a new issue, #34387: [C++] Options for handling non-decomposable aggregate functions

2023-02-28 Thread via GitHub
westonpace opened a new issue, #34387: URL: https://github.com/apache/arrow/issues/34387 ### Describe the enhancement requested Currently we have `SCALAR_AGGREGATE` and `HASH_AGGREGATE` function kinds. These are used for ["decomposable aggregate functions"](https://en.wikipedia.org/

[GitHub] [arrow] benibus opened a new issue, #34388: [C++] Build core compute kernels unconditionally

2023-02-28 Thread via GitHub
benibus opened a new issue, #34388: URL: https://github.com/apache/arrow/issues/34388 ### Describe the enhancement requested From https://github.com/apache/arrow/issues/25025 > Since we are going to implement a lot more precompiled kernels, I am not sure it makes sense to require a

[GitHub] [arrow-flight-sql-postgresql] atcol opened a new issue, #25: Website link is broken

2023-02-28 Thread via GitHub
atcol opened a new issue, #25: URL: https://github.com/apache/arrow-flight-sql-postgresql/issues/25 The website link on the right-hand side of this project's GitHub page results in an HTTP 404 error.

[GitHub] [arrow-flight-sql-postgresql] kou closed issue #21: Add support for large response

2023-02-28 Thread via GitHub
kou closed issue #21: Add support for large response URL: https://github.com/apache/arrow-flight-sql-postgresql/issues/21

[GitHub] [arrow] kou closed issue #31412: [Website] Remove support for triggering GitHub Actions apache/arrow-site website deployment workflow when pushing to master

2023-02-28 Thread via GitHub
kou closed issue #31412: [Website] Remove support for triggering GitHub Actions apache/arrow-site website deployment workflow when pushing to master URL: https://github.com/apache/arrow/issues/31412

[GitHub] [arrow] kou closed issue #20161: [Website] Remove mentions of master branch from Apache Arrow website content

2023-02-28 Thread via GitHub
kou closed issue #20161: [Website] Remove mentions of master branch from Apache Arrow website content URL: https://github.com/apache/arrow/issues/20161

[GitHub] [arrow] assignUser opened a new issue, #34389: [Dev] Some assignments from the PR bot fail due to anti-spam protection

2023-02-28 Thread via GitHub
assignUser opened a new issue, #34389: URL: https://github.com/apache/arrow/issues/34389 ### Describe the bug, including details regarding any error messages, version, and platform. If a user has not commented on an issue and does not have write access to the repository it is not pos

[GitHub] [arrow] kou closed issue #34309: [CI][Packaging][deb] Failed to build with bundled AWS SDK C++ on Ubuntu 22.04 and 22.10

2023-02-28 Thread via GitHub
kou closed issue #34309: [CI][Packaging][deb] Failed to build with bundled AWS SDK C++ on Ubuntu 22.04 and 22.10 URL: https://github.com/apache/arrow/issues/34309

[GitHub] [arrow] rtpsw opened a new issue, #34391: [C++] Future as-of-join-node hangs on distant times

2023-03-01 Thread via GitHub
rtpsw opened a new issue, #34391: URL: https://github.com/apache/arrow/issues/34391 ### Describe the bug, including details regarding any error messages, version, and platform. Future as-of-join-node goes into an infinite loop when right-table times are distant. A specific test case

[GitHub] [arrow] chenyuanxing opened a new issue, #34393: Arrow deserialization performance is so poor in Java

2023-03-01 Thread via GitHub
chenyuanxing opened a new issue, #34393: URL: https://github.com/apache/arrow/issues/34393 ### Describe the usage question you have. Please include as many useful details as possible. We experimented with the same data and found that the performance of Arrow deserialization is o

[GitHub] [arrow] nbro10 opened a new issue, #34395: file INSTALL cannot duplicate symlink because: A directory already exists at that location

2023-03-01 Thread via GitHub
nbro10 opened a new issue, #34395: URL: https://github.com/apache/arrow/issues/34395 ### Describe the bug, including details regarding any error messages, version, and platform. I was having this issue https://github.com/apache/arrow/issues/34120, which I wasn't able to solve. So, fo

[GitHub] [arrow] nbro10 opened a new issue, #34396: Error: Refusing to uninstall `/opt/homebrew/Cellar/apache-arrow/11.0.0`

2023-03-01 Thread via GitHub
nbro10 opened a new issue, #34396: URL: https://github.com/apache/arrow/issues/34396 ### Describe the bug, including details regarding any error messages, version, and platform. I just tried to do another `brew uninstall apache-arrow` after what I did here https://github.com/apache/a

[GitHub] [arrow] pkrefta opened a new issue, #34397: How to handle None values when writing a parquet using pyarrow

2023-03-01 Thread via GitHub
pkrefta opened a new issue, #34397: URL: https://github.com/apache/arrow/issues/34397 ### Describe the usage question you have. Please include as many useful details as possible. Hi, I'm trying to create a very simple Parquet file that contains None values ``` impo

[GitHub] [arrow] thisisnic opened a new issue, #34398: [R] Update NEWS.md for 11.0.0.3

2023-03-01 Thread via GitHub
thisisnic opened a new issue, #34398: URL: https://github.com/apache/arrow/issues/34398 ### Describe the enhancement requested Update NEWS.md for 11.0.0.3 ### Component(s) R

[GitHub] [arrow] pkrefta closed issue #34397: [Python] How to handle None values when writing a parquet using pyarrow

2023-03-01 Thread via GitHub
pkrefta closed issue #34397: [Python] How to handle None values when writing a parquet using pyarrow URL: https://github.com/apache/arrow/issues/34397

[GitHub] [arrow] raulcd closed issue #33697: [CI][Python] Nightly test for PySpark 3.2.0 fail with AttributeError on numpy.bool

2023-03-01 Thread via GitHub
raulcd closed issue #33697: [CI][Python] Nightly test for PySpark 3.2.0 fail with AttributeError on numpy.bool URL: https://github.com/apache/arrow/issues/33697

[GitHub] [arrow] langtsai opened a new issue, #34402: [C++][Flight] Unable to receive error status from DoPut

2023-03-01 Thread via GitHub
langtsai opened a new issue, #34402: URL: https://github.com/apache/arrow/issues/34402 ### Describe the bug, including details regarding any error messages, version, and platform. Hi, I am running into an issue where my Flight server DoPut is returning an error, but I am unable to

[GitHub] [arrow] swyatt7 opened a new issue, #34403: Is there a way to construct the metadata_collector for an existing partitioned dataset?

2023-03-01 Thread via GitHub
swyatt7 opened a new issue, #34403: URL: https://github.com/apache/arrow/issues/34403 ### Describe the usage question you have. Please include as many useful details as possible. Hello, I am working with an already partitioned dataset that exists in the hive format as suc

[GitHub] [arrow] jorisvandenbossche opened a new issue, #34404: [Python] Failing tests because pandas.Index can now store all numeric dtypes (not only 64bit versions)

2023-03-01 Thread via GitHub
jorisvandenbossche opened a new issue, #34404: URL: https://github.com/apache/arrow/issues/34404 We have several failing tests in the nightly build (https://github.com/ursacomputing/crossbow/actions/runs/4277727973/jobs/7446784501) because of a change in pandas 2.0: the Index can now store

[GitHub] [arrow] icexelloss closed issue #34333: [Python] Test run_query with a registered UDF

2023-03-01 Thread via GitHub
icexelloss closed issue #34333: [Python] Test run_query with a registered UDF URL: https://github.com/apache/arrow/issues/34333

[GitHub] [arrow] westonpace opened a new issue, #34405: [C++] Make it possible to specify custom field names in QueryOptions

2023-03-01 Thread via GitHub
westonpace opened a new issue, #34405: URL: https://github.com/apache/arrow/issues/34405 ### Describe the enhancement requested This is needed, for example, to run Substrait queries because those queries specify the output column names explicitly. In general, it seems that it would

[GitHub] [arrow] rtpsw opened a new issue, #34407: [C++] Accumulation for segmented aggregation

2023-03-01 Thread via GitHub
rtpsw opened a new issue, #34407: URL: https://github.com/apache/arrow/issues/34407 ### Describe the enhancement requested In segmented aggregation, accumulate a large enough output batch before outputting it. This task is a follow-up on [this post](https://github.com/apache/arrow/pu

[GitHub] [arrow] kou closed issue #34396: [Python] Error: Refusing to uninstall `/opt/homebrew/Cellar/apache-arrow/11.0.0`

2023-03-01 Thread via GitHub
kou closed issue #34396: [Python] Error: Refusing to uninstall `/opt/homebrew/Cellar/apache-arrow/11.0.0` URL: https://github.com/apache/arrow/issues/34396

[GitHub] [arrow] chenyuanxing closed issue #34393: [Java] Arrow deserialization performance is so poor in Java

2023-03-01 Thread via GitHub
chenyuanxing closed issue #34393: [Java] Arrow deserialization performance is so poor in Java URL: https://github.com/apache/arrow/issues/34393

[GitHub] [arrow] multimeric opened a new issue, #34409: Named lists cannot be serialized to a map column

2023-03-01 Thread via GitHub
multimeric opened a new issue, #34409: URL: https://github.com/apache/arrow/issues/34409 ### Describe the bug, including details regarding any error messages, version, and platform. Let's start with a simple table that contains a named list in each row: ```R > x = tibble::tibble

[GitHub] [arrow] jorisvandenbossche opened a new issue, #34410: [Python][C++] No longer possible to specify higher chunksize than the default for Parquet writing

2023-03-02 Thread via GitHub
jorisvandenbossche opened a new issue, #34410: URL: https://github.com/apache/arrow/issues/34410 ### Describe the bug, including details regarding any error messages, version, and platform. See https://github.com/apache/arrow/issues/34374#issuecomment-1449926603 for context ht

[GitHub] [arrow] jorisvandenbossche opened a new issue, #34411: [Python] accept pyarrow Array object in the array() constructor

2023-03-02 Thread via GitHub
jorisvandenbossche opened a new issue, #34411: URL: https://github.com/apache/arrow/issues/34411 Subset of https://github.com/apache/arrow/issues/21761 specifically for accepting Array objects in the `pa.array(..)` function.

[GitHub] [arrow] jorisvandenbossche closed issue #34411: [Python] accept pyarrow Array object in the array() constructor

2023-03-02 Thread via GitHub
jorisvandenbossche closed issue #34411: [Python] accept pyarrow Array object in the array() constructor URL: https://github.com/apache/arrow/issues/34411

[GitHub] [arrow] AlenkaF opened a new issue, #34412: [Python] Converting python array to TimestampArray with naive datetime and datetime with various timezones

2023-03-02 Thread via GitHub
AlenkaF opened a new issue, #34412: URL: https://github.com/apache/arrow/issues/34412 ### Describe the bug, including details regarding any error messages, version, and platform. When converting a python array with datetime elements and mixed timezones into a pyarrow array there are

[GitHub] [arrow-adbc] lidavidm opened a new issue, #486: [Docs] Update README, landing page

2023-03-02 Thread via GitHub
lidavidm opened a new issue, #486: URL: https://github.com/apache/arrow-adbc/issues/486 - Summarize what ADBC is - Summarize the benefits - Summarize current status and how to get involved
