[GitHub] [arrow] chrisirhc opened a new issue, #34377: [Go] enhancement request to expose AnyValue() on Scalar

2023-02-27 Thread via GitHub
chrisirhc opened a new issue, #34377: URL: https://github.com/apache/arrow/issues/34377 ### Describe the enhancement requested I wanted to gauge interest in a method on the Scalar interface to expose the value via any/interface{} like: ```go type Scalar interface { …

[GitHub] [arrow] code1704 opened a new issue, #34376: arrow table: how to drop duplicates?

2023-02-27 Thread via GitHub
code1704 opened a new issue, #34376: URL: https://github.com/apache/arrow/issues/34376 ### Describe the usage question you have. Please include as many useful details as possible. In pandas DataFrame there's `pandas.DataFrame.drop_duplicates`. With arrow table, how can I d

[GitHub] [arrow] wgtmac opened a new issue, #34375: [C++][Parquet] Page header does not save statistics once page index is enabled

2023-02-27 Thread via GitHub
wgtmac opened a new issue, #34375: URL: https://github.com/apache/arrow/issues/34375 ### Describe the enhancement requested Once writing page index is supported, we should not save page statistics in the data page header as it is duplicated. Although page stats is disabled by

[GitHub] [arrow] rachtsingh closed issue #34353: [Python] Add support for string + string add_checked

2023-02-27 Thread via GitHub
rachtsingh closed issue #34353: [Python] Add support for string + string add_checked URL: https://github.com/apache/arrow/issues/34353 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific co

[GitHub] [arrow] westonpace opened a new issue, #34374: [C++] Investigate regressions caused by changing row group size from 64Mi to 1Mi.

2023-02-27 Thread via GitHub
westonpace opened a new issue, #34374: URL: https://github.com/apache/arrow/issues/34374 ### Describe the bug, including details regarding any error messages, version, and platform. It appears that changing the default row group size from 64Mi rows to 1Mi rows had a [significant cha

[GitHub] [arrow] wjones127 closed issue #25986: [C++][Parquet] Enable external material and rotation for encryption keys

2023-02-27 Thread via GitHub
wjones127 closed issue #25986: [C++][Parquet] Enable external material and rotation for encryption keys URL: https://github.com/apache/arrow/issues/25986

[GitHub] [arrow] sl2902 closed issue #34370: [Python]pyarrow.lib.ArrowNotImplementedError

2023-02-27 Thread via GitHub
sl2902 closed issue #34370: [Python]pyarrow.lib.ArrowNotImplementedError URL: https://github.com/apache/arrow/issues/34370

[GitHub] [arrow] thisisnic opened a new issue, #34372: [R] CRAN packaging checklist for version 11.0.0.3

2023-02-27 Thread via GitHub
thisisnic opened a new issue, #34372: URL: https://github.com/apache/arrow/issues/34372 # Packaging checklist for CRAN release For a high-level overview of the release process see the [Apache Arrow Release Management Guide](https://arrow.apache.org/docs/developers/release.html#post

[GitHub] [arrow-adbc] tokoko opened a new issue, #479: [Java] driver/flight-sql: Add Dremio validation suite

2023-02-27 Thread via GitHub
tokoko opened a new issue, #479: URL: https://github.com/apache/arrow-adbc/issues/479 I plan to run validation suite against Dremio Flight SQL interface, but I'd like to hear some feedback on my initial thoughts. I think we need a `flight-sql-validation-dremio` maven submodule to acc

[GitHub] [arrow] lidavidm closed issue #34367: [Java] Java Integration Test Failures

2023-02-27 Thread via GitHub
lidavidm closed issue #34367: [Java] Java Integration Test Failures URL: https://github.com/apache/arrow/issues/34367

[GitHub] [arrow] sl2902 opened a new issue, #34370: [Python]pyarrow.lib.ArrowNotImplementedError

2023-02-27 Thread via GitHub
sl2902 opened a new issue, #34370: URL: https://github.com/apache/arrow/issues/34370 ### Describe the bug, including details regarding any error messages, version, and platform. I am building a custom dataset using HuggingFace Datasets. However, I run into an issue with the schema.

[GitHub] [arrow-testing] lidavidm commented on pull request #85: GH-15203: [C++][Java] Add files with uncompressible buffers

2023-02-27 Thread via GitHub
lidavidm commented on PR #85: URL: https://github.com/apache/arrow-testing/pull/85#issuecomment-1446705143 Thanks!

[GitHub] [arrow-testing] lidavidm merged pull request #85: GH-15203: [C++][Java] Add files with uncompressible buffers

2023-02-27 Thread via GitHub
lidavidm merged PR #85: URL: https://github.com/apache/arrow-testing/pull/85

[GitHub] [arrow] tustvold opened a new issue, #34367: Java Integration Test Failures

2023-02-27 Thread via GitHub
tustvold opened a new issue, #34367: URL: https://github.com/apache/arrow/issues/34367 ### Describe the bug, including details regarding any error messages, version, and platform. We are seeing failures building Java within the archery integration test framework https://github

[GitHub] [arrow] DavisVaughan opened a new issue, #34366: Don't `getFromNamespace()` the `dplyr:::check_name()` helper

2023-02-27 Thread via GitHub
DavisVaughan opened a new issue, #34366: URL: https://github.com/apache/arrow/issues/34366 ### Describe the bug, including details regarding any error messages, version, and platform. Right here arrow grabs the `dplyr:::check_name()` helper: https://github.com/apache/arrow/blob/96

[GitHub] [arrow] wjones127 closed issue #33652: [C++][Parquet] Interface total_bytes_written is Confusing

2023-02-27 Thread via GitHub
wjones127 closed issue #33652: [C++][Parquet] Interface total_bytes_written is Confusing URL: https://github.com/apache/arrow/issues/33652

[GitHub] [arrow-testing] lidavidm commented on pull request #85: GH-15203: [C++][Java] Add files with uncompressible buffers

2023-02-27 Thread via GitHub
lidavidm commented on PR #85: URL: https://github.com/apache/arrow-testing/pull/85#issuecomment-1446539126 @wjones127 mind taking a glance here?

[GitHub] [arrow] thisisnic opened a new issue, #34365: [C++] Substrait cast expression fails on input types other than field reference

2023-02-27 Thread via GitHub
thisisnic opened a new issue, #34365: URL: https://github.com/apache/arrow/issues/34365 ### Describe the bug, including details regarding any error messages, version, and platform. The Substrait cast function has an `input` argument which takes an `Expression`; however, the current i

[GitHub] [arrow] lidavidm closed issue #33839: [Java][FlightRPC] FlightCallHeaders added to FlightClient::getStream aren't attached to call

2023-02-27 Thread via GitHub
lidavidm closed issue #33839: [Java][FlightRPC] FlightCallHeaders added to FlightClient::getStream aren't attached to call URL: https://github.com/apache/arrow/issues/33839

[GitHub] [arrow] lidavidm closed issue #33953: [Java] JDBC driver does not send custom headers on DoGet

2023-02-27 Thread via GitHub
lidavidm closed issue #33953: [Java] JDBC driver does not send custom headers on DoGet URL: https://github.com/apache/arrow/issues/33953

[GitHub] [arrow] lidavidm opened a new issue, #34364: [Java] Replace checkstyle with google-java-format or another formatter?

2023-02-27 Thread via GitHub
lidavidm opened a new issue, #34364: URL: https://github.com/apache/arrow/issues/34364 ### Describe the enhancement requested checkstyle is unfortunately only a linter and cannot auto-format. google-java-format (or possibly some other plugin like clang-format) can be run to check and

[GitHub] [arrow] lidavidm closed issue #32954: [C++][Java][FlightRPC] Get rid of FlightTestUtil.getStartedServer etc.

2023-02-27 Thread via GitHub
lidavidm closed issue #32954: [C++][Java][FlightRPC] Get rid of FlightTestUtil.getStartedServer etc. URL: https://github.com/apache/arrow/issues/32954

[GitHub] [arrow-adbc] lidavidm closed issue #471: [Java] JDBC driver fails on a composite primary key

2023-02-27 Thread via GitHub
lidavidm closed issue #471: [Java] JDBC driver fails on a composite primary key URL: https://github.com/apache/arrow-adbc/issues/471

[GitHub] [arrow] legout opened a new issue, #34363: Writing to cloudflare r2 fails for multipart upload

2023-02-27 Thread via GitHub
legout opened a new issue, #34363: URL: https://github.com/apache/arrow/issues/34363 ### Describe the bug, including details regarding any error messages, version, and platform. When I try to write a pyarrow.table to cloudflare r2 object store, I got an error when files are larger th

[GitHub] [arrow] westonpace closed issue #34280: [C++][Python] Clarify meaning of "row_group_size" and change default to something more reasonable

2023-02-27 Thread via GitHub
westonpace closed issue #34280: [C++][Python] Clarify meaning of "row_group_size" and change default to something more reasonable URL: https://github.com/apache/arrow/issues/34280

[GitHub] [arrow] thisisnic closed issue #34310: [C++] Executing Substrait plan containing round function causes segfault or error

2023-02-27 Thread via GitHub
thisisnic closed issue #34310: [C++] Executing Substrait plan containing round function causes segfault or error URL: https://github.com/apache/arrow/issues/34310

[GitHub] [arrow] thisisnic closed issue #34305: [C++] Error when running Substrait plan using `extract` function: "argument was not an enum"

2023-02-27 Thread via GitHub
thisisnic closed issue #34305: [C++] Error when running Substrait plan using `extract` function: "argument was not an enum" URL: https://github.com/apache/arrow/issues/34305

[GitHub] [arrow] jorisvandenbossche opened a new issue, #34361: [C++] Handling "logical" nulls: add GetLogicalNullCount / update IsNull() to check for logical null

2023-02-27 Thread via GitHub
jorisvandenbossche opened a new issue, #34361: URL: https://github.com/apache/arrow/issues/34361 There are some data types where the nulls are not stored "physically" using a validity bitmap on the parent ArrayData, but through nulls in child data: - UnionArrays don't have a top-level

[GitHub] [arrow] jorisvandenbossche closed issue #34359: [Python] Add select method to pyarrow.RecordBatch

2023-02-27 Thread via GitHub
jorisvandenbossche closed issue #34359: [Python] Add select method to pyarrow.RecordBatch URL: https://github.com/apache/arrow/issues/34359

[GitHub] [arrow] thisisnic closed issue #32512: [R][Doc] minor error in Linux installation documentation ('conda' option) for R on CRAN

2023-02-27 Thread via GitHub
thisisnic closed issue #32512: [R][Doc] minor error in Linux installation documentation ('conda' option) for R on CRAN URL: https://github.com/apache/arrow/issues/32512

[GitHub] [arrow] AlenkaF opened a new issue, #34359: [Python] Add select method to pyarrow.RecordBatch

2023-02-27 Thread via GitHub
AlenkaF opened a new issue, #34359: URL: https://github.com/apache/arrow/issues/34359 ### Describe the enhancement requested There is a `select` method defined for `pa.Table` https://github.com/apache/arrow/blob/db60be2bc2e1fc9aec43bb632be894bb8da6c77b/python/pyarrow/table.pxi#