Re: [I] Add test CI: Docker based [arrow-java]

2024-11-27 Thread via GitHub
kou closed issue #6: Add test CI: Docker based URL: https://github.com/apache/arrow-java/issues/6 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues

Re: [I] pre-commit: ShellCheck [arrow-java]

2024-11-27 Thread via GitHub
kou closed issue #10: pre-commit: ShellCheck URL: https://github.com/apache/arrow-java/issues/10 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-

[I] [Java][FlightRPC] Java arrow flight server stuck in reader.getDescriptor() [arrow-java]

2024-11-27 Thread via GitHub
billyean opened a new issue, #431: URL: https://github.com/apache/arrow-java/issues/431 ### Describe the usage question you have. Please include as many useful details as possible. Arrow version: 15. Client: C++ Server: Java I have a Java arrow flight server that uses do

[I] [Archery] Suppress pull/push progress logs only in CI [arrow]

2024-11-27 Thread via GitHub
kou opened a new issue, #44878: URL: https://github.com/apache/arrow/issues/44878 ### Describe the enhancement requested https://github.com/apache/arrow/pull/44669 went overboard by always suppressing Docker progress output. This may be desirable on CI, to avoid flooding the logs, bu

Re: [I] [Java] Replace checkstyle with google-java-format or another formatter? [arrow-java]

2024-11-27 Thread via GitHub
lidavidm closed issue #213: [Java] Replace checkstyle with google-java-format or another formatter? URL: https://github.com/apache/arrow-java/issues/213 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[I] Issues compiling with gcc-14 [arrow]

2024-11-27 Thread via GitHub
olafx opened a new issue, #44877: URL: https://github.com/apache/arrow/issues/44877 ### Describe the bug, including details regarding any error messages, version, and platform. I have 3 compilers: GCC-14, Clang-19, Apple Clang 16. I'm compiling `arrow/cpp/examples/minimal_build` with

Re: [I] [C++] Push down projection and selection to S3 Select [arrow]

2024-11-27 Thread via GitHub
ianmcook closed issue #18506: [C++] Push down projection and selection to S3 Select URL: https://github.com/apache/arrow/issues/18506 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific com

Re: [I] R package fails to build against arrow 18.1.0: `error: 'ChunkLocation' in namespace 'arrow::internal' does not name a type` [arrow]

2024-11-27 Thread via GitHub
assignUser closed issue #44863: R package fails to build against arrow 18.1.0: `error: 'ChunkLocation' in namespace 'arrow::internal' does not name a type` URL: https://github.com/apache/arrow/issues/44863 -- This is an automated message from the Apache Git Service. To respond to the message,

[I] [Python][Java][Docs] Update the "Integrating PyArrow with Java" documentation section to not use `_import_from_c`/`_export_to_c` [arrow]

2024-11-27 Thread via GitHub
jorisvandenbossche opened a new issue, #44872: URL: https://github.com/apache/arrow/issues/44872 Currently, the "Integrating PyArrow with Java" page (https://arrow.apache.org/docs/13.0/python/integration/python_java.html) has an example that uses the low-level `_import_from_c` / `_export_to

[I] [Java/Python] Add unit test for pyarrow.timeX types in Array.from_jvm [arrow]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #44875: URL: https://github.com/apache/arrow/issues/44875 Follow-up after https://github.com/apache/arrow-java/issues/420 as we are missing the necessary methods to construct these arrays conveniently on the Python side. Once there is a path to construct

[I] [Java/Python] Add unit test for pyarrow.decimal128 in Array.from_jvm [arrow]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #44874: URL: https://github.com/apache/arrow/issues/44874 Follow-up after https://github.com/apache/arrow-java/issues/420. We need to find the correct code to construct Java decimals and fill them into a `DecimalVector`. Afterwards, we should activate the decim

[I] [Java/Python] Support VarCharVector / StringArray in pyarrow.Array.from_jvm [arrow]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #44873: URL: https://github.com/apache/arrow/issues/44873 Follow-up after https://github.com/apache/arrow-java/issues/420: Currently only primitive arrays are supported in `pyarrow.Array.from_jvm` as it uses `pyarrow.Array.from_buffers` underneath. We should ex

Re: [I] [Python][Parquet] Parquet files created from Pandas dataframes with Arrow-backed list columns cannot be read by pd.read_parquet [arrow]

2024-11-27 Thread via GitHub
jorisvandenbossche closed issue #39914: [Python][Parquet] Parquet files created from Pandas dataframes with Arrow-backed list columns cannot be read by pd.read_parquet URL: https://github.com/apache/arrow/issues/39914 -- This is an automated message from the Apache Git Service. To respond to

Re: [I] [Python] Create Python examples of HTTP GET Arrow client and server using HTTP compression [arrow]

2024-11-27 Thread via GitHub
ianmcook closed issue #40601: [Python] Create Python examples of HTTP GET Arrow client and server using HTTP compression URL: https://github.com/apache/arrow/issues/40601 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use th

[I] [Java] JDBC test failures for Windows build with timezone != UTC [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #387: URL: https://github.com/apache/arrow-java/issues/387 Run "mvn install" in the java directory on a system with timezone set to PST (Pacific Standard, 8 hours behind UTC). For the arrow-jdbc module, the following errors occur: `[INFO]` `[INF

[I] [Python] Segfault in `to_pandas()` on batch from IPC stream in specific edge cases [arrow]

2024-11-27 Thread via GitHub
Tom-Newton opened a new issue, #44869: URL: https://github.com/apache/arrow/issues/44869 ### Describe the bug, including details regarding any error messages, version, and platform. So far I've only been able to reproduce this case with `pyspark` but I think the bug is probably on th

[I] Error Building Arrow [arrow]

2024-11-27 Thread via GitHub
stemillington-flock opened a new issue, #44868: URL: https://github.com/apache/arrow/issues/44868 ### Describe the bug, including details regarding any error messages, version, and platform. I am trying to build arrow following the instructions [here](https://arrow.apache.org/docs/de

[I] [Java] gRPC not available on M1 [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #359: URL: https://github.com/apache/arrow-java/issues/359 When building on M1 gRPC is not found. It can be [manually downloaded](https://repo1.maven.org/maven2/io/grpc/protoc-gen-grpc-java/1.41.0/protoc-gen-grpc-java-1.41.0-osx-x86_64.exe) and installed:

[I] [Java/Python] Add unit test for pyarrow.decimal128 in Array.from_jvm [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #374: URL: https://github.com/apache/arrow-java/issues/374 Follow-up after https://github.com/apache/arrow/issues/18209. We need to find the correct code to construct Java decimals and fill them into a `DecimalVector`. Afterwards, we should activate the decimal

[I] [Java] Add accessors to get type parameters from vector classes [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #427: URL: https://github.com/apache/arrow-java/issues/427 Vector classes contain private copies of each param in the `ArrowType`, but does not have any public api to access them. So if given a vector you would have to get the `Field` from the and cast to the

[I] [Java] Adapt to Java 9/9+ automatic module system [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #379: URL: https://github.com/apache/arrow-java/issues/379 In Java 9/9+, within following module-info.java definition the build will fail by error: ```Java module my.app { exports my.app; requires arrow.memory.core; requires arro

[I] [Java] Investigate potential performance improvement of compression codec [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #388: URL: https://github.com/apache/arrow-java/issues/388 In response to the discussion in https://github.com/apache/arrow/pull/8949/files#r588046787 There are some performance penalties in the implementation of the compression codecs (e.g. data copying

Re: [I] Transfer the Java related open issues in apache/arrow to apache/arrow-java [arrow-java]

2024-11-27 Thread via GitHub
assignUser closed issue #3: Transfer the Java related open issues in apache/arrow to apache/arrow-java URL: https://github.com/apache/arrow-java/issues/3 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

Re: [I] [C++] ThreadPool code minor optimization [arrow]

2024-11-27 Thread via GitHub
mapleFU closed issue #44811: [C++] ThreadPool code minor optimization URL: https://github.com/apache/arrow/issues/44811 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsub

[I] [Java] Add getInitReservation() to BufferAllocator interface similar to getLimit(), getHeadRoom() APIs [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #421: URL: https://github.com/apache/arrow-java/issues/421 For capturing additional information for debugging/profiling purposes, it will be useful to expose the init reservation for buffer allocator.  I would encourage someone new to the community to do

[I] [Java] allocate new buffer code doesn't release extra allocated buffer properly [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #417: URL: https://github.com/apache/arrow-java/issues/417 [Class BaseValueVector](https://github.com/apache/arrow/blob/master/java/vector/src/main/java/org/apache/arrow/vector/BaseValueVector.java)  's method allocFixedDataAndValidityBufs on line#162 allocates

[I] [Java][FlightRPC] Flaky test TestDoExchange.tearDown [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #384: URL: https://github.com/apache/arrow-java/issues/384   ``` [INFO] --- 7029[INFO] T E S T S 7030[INFO]

[I] [Java][Release] Checkstyle plugin fails on ORC log4j.properties files [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #381: URL: https://github.com/apache/arrow-java/issues/381 ```Java [INFO] --- maven-checkstyle-plugin:3.1.0:check (validate) @ arrow-orc --- [INFO] Starting audit... [WARNING] ../../../cpp/java-build/orc_ep-prefix/src/orc_ep/java/bench/core/src/reso

[I] [Integration][Java] Fix map type to allow non-standard field names [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #397: URL: https://github.com/apache/arrow-java/issues/397 Java should support the integration test added in ARROW-7173. **Reporter**: [Antoine Pitrou](https://issues.apache.org/jira/browse/ARROW-8715) / @pitrou **Note**: *This issue was originally

[I] [Java] Minor types don't account for nullable FieldType flag [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #429: URL: https://github.com/apache/arrow-java/issues/429 Calling e.g. `FLOAT4.getNewVector("foo", new FieldType(false, ...), ...)` returns a NullableFloat4Vector instead of a Float4Vector. edit: Float4Vector doesn't implement FieldVector, so can't currently

[I] [Java] publishing nightly snapshot java artifacts to maven repo [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #428: URL: https://github.com/apache/arrow-java/issues/428 The [Snapshot repository](https://repository.apache.org/content/groups/snapshots/org/apache/arrow/) doesn't seem to be getting any recent snapshot builds. Could this be established for the sake of easi

[I] [Java] Transfer validity buffer data word at a time (currently we do byte at a time) [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #423: URL: https://github.com/apache/arrow-java/issues/423 We should split and transfer validity buffer contents word at a time. **Reporter**: [Siddharth Teotia](https://issues.apache.org/jira/browse/ARROW-1876) / @siddharthteotia **Note**: *This

[I] [Java] Add Apache Mnemonic (incubating) as alternative allocator for storage-class memory support [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #424: URL: https://github.com/apache/arrow-java/issues/424 **Reporter**: [Wes McKinney](https://issues.apache.org/jira/browse/ARROW-1760) / @wesm PRs and other links: - [GitHub Pull Request apache/arrow#36](https://github.com/apache/arrow/pul

[I] [Java/Python] in-process vector sharing from Java to Python [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #420: URL: https://github.com/apache/arrow-java/issues/420 Currently we seem to use in all applications of Arrow the IPC capabilities to move data between a Java process and a Python process. While this is 0-serialization, it is not zero-copy. By taking the add

[I] [Java] NullableDateMilliVector.getObject() should return a LocalDate, not a LocalDateTime [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #422: URL: https://github.com/apache/arrow-java/issues/422 NullableDateMilliVector.getObject() today returns a LocalDateTime. However, this vector is used to store date information, and thus, getObject() should return a LocalDate. Please note: there alr

[I] [Java] Remove Jackson from compile-time dependencies for arrow-vector [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #418: URL: https://github.com/apache/arrow-java/issues/418 I would like to upgrade Jackson to the latest version (2.9.5). If there are no objections I will create a PR (it is literally just changing the version number in the pom - no code changes required).

[I] [Java/Python] Support VarCharVector / StringArray in pyarrow.Array.from_jvm [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #373: URL: https://github.com/apache/arrow-java/issues/373 Follow-up after https://github.com/apache/arrow/issues/18209: Currently only primitive arrays are supported in `pyarrow.Array.from_jvm` as it uses `pyarrow.Array.from_buffers` underneath. We should exte

[I] [Java] remove FieldReader from ValueVector [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #411: URL: https://github.com/apache/arrow-java/issues/411 Every implementation of ValueVector has an instance of .FieldReader, which has an overhead of 28 bytes on the heap. This can be avoided by instantiating the object only when required. **Reporter*

[I] [Java] Add hasNull flag to Vectors [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #415: URL: https://github.com/apache/arrow-java/issues/415 Add has null flag to Arrow Vector so that for vectors without any null, the null check process should be skipped. **Reporter**: [Yurui Zhou](https://issues.apache.org/jira/browse/ARROW-5198) / @y

[I] [Java] Improving Arrow Vector Reading performance [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #416: URL: https://github.com/apache/arrow-java/issues/416 Currently the read interface of Java Arrow Vector is quite slow because the access operation has to go through validity bit check and boundary check before it can actually load the data. Such a safety c

[I] [JAVA] Leak in JdbcToArrowUtils [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #408: URL: https://github.com/apache/arrow-java/issues/408 `JdbcToArrowUtils::updateVector(VarCharVector, String, boolean, int)` does not release the memory that it allocates for the `NullableVarCharHolder`. This can be verified by changing the first lines of

[I] [Java] Arrow Java can't read union vector from ArrowStreamReader written by its own bugs [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #414: URL: https://github.com/apache/arrow-java/issues/414 When writing union data using  ArrowStreamWriter in java, I can't read it back using ArrowStreamReader in java. The exception is: > Exception in thread "main" java.lang.IllegalArgumentException: not a

[I] [Java] Refactor null slot verification onto a single method in the parent class [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #413: URL: https://github.com/apache/arrow-java/issues/413 After https://github.com/apache/arrow/pull/4288 is checked in there is an opportunity to refactor the code to one place instead of having the same logic across all vector classes. **Reporter**: [

[I] [Java][Flight] grpc-netty, version mismatch, incompatible ctor for "PooledByteBufAllocator" in io.grpc.netty.Utils#createByteBufAllocator [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #347: URL: https://github.com/apache/arrow-java/issues/347 Using Arrow nightly jars from 03/03/2022 ```java val LOCALHOST = "localhost" val allocator = RootAllocator(Long.MAX_VALUE) val serverLocation = Location.forGrpcInsecure(LOCALHOST, 0)

[I] [Java] Ensure JVM has sufficient capacity for large number of local reference [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #410: URL: https://github.com/apache/arrow-java/issues/410 **Reporter**: [Yurui Zhou](https://issues.apache.org/jira/browse/ARROW-5515) / @yuruiz **Note**: *This issue was originally created as [ARROW-5515](https://issues.apache.org/jira/browse/A

[I] [Java] Failed to build document with OpenJDK 11 [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #406: URL: https://github.com/apache/arrow-java/issues/406 It reports the following error: ``` [ERROR] Exit code: 1 - javadoc: error - The code being documented uses modules but the packages defined in http://docs.oracle.com/javase/8/docs/api/ are

[I] [Java] Provide a sample json file for the flight example [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #405: URL: https://github.com/apache/arrow-java/issues/405 The flight package provides IntegrationTestClient and IntegrationTestServer as sample implementations for client/server side. In these implementations, the client sends the content of some json f

[I] [Java] Update README with instructions for IntelliJ users [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #403: URL: https://github.com/apache/arrow-java/issues/403 IntelliJ needs to be configured to use the errorprone compiler and this is not currently documented, making it hard for new contributors to build/test the project. We can pretty much just link to the in

[I] [Java] Bootstrap initial developer documentation in docs/source/developers/java.rst [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #407: URL: https://github.com/apache/arrow-java/issues/407 The project lacks prose documentation about Java development. I propose to begin a section about it in the Sphinx project **Reporter**: [Wes McKinney](https://issues.apache.org/jira/browse/ARROW-

[I] [Java] Implement converter between Arrow record batches and Avro records [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #404: URL: https://github.com/apache/arrow-java/issues/404 It would be useful for applications which need convert Avro data to Arrow data. This is an adapter which convert data with existing API (like JDBC adapter) rather than a native reader (like orc).

[I] [Java] Enable integration tests for dictionaries-within-dictionaries [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #402: URL: https://github.com/apache/arrow-java/issues/402 The integration test is implemented but currently disabled for all implementations **Reporter**: [Wes McKinney](https://issues.apache.org/jira/browse/ARROW-7779) / @wesm Related issues:

[I] [Java][Flight][Tests] Add roundtrip tests for Java Flight Test Client [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #401: URL: https://github.com/apache/arrow-java/issues/401 There should be some built-in roundtrip tests for Java Flight IntegrationTestClient **Reporter**: [Bryan Cutler](https://issues.apache.org/jira/browse/ARROW-7933) / @BryanCutler Related i

[I] [Java] Implement vector diff functionality [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #400: URL: https://github.com/apache/arrow-java/issues/400 In C++ side, we already have array diff functionality for vector equals and testing to make it easy to see differences between Arrays and reduce debugging time.  And it’s better to do something similar

[I] [Java] DenseUnionWriter#setPosition fails with NullPointerException [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #399: URL: https://github.com/apache/arrow-java/issues/399 The writer always iterates through all BaseWriters, and an array of 128 BaseWriters is allocated. So if you do not have 128 typeIds and do not touch all of them, setPosition will give you an exception.

[I] JsonFileWriter fails with NPE after VectorSchemaRoot syncSchema [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #390: URL: https://github.com/apache/arrow-java/issues/390 Hey there :) Repro: - Open VSR - Do something that'll change the `Field` of one of the vectors in the VSR (promotable writer, etc) - `vsr.syncSchema()` - Open JsonFileWriter, `start`,

[I] [Java] Investigate adding a getUnsafe method to vectors [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #396: URL: https://github.com/apache/arrow-java/issues/396 As per: https://github.com/apache/arrow/pull/7095#issuecomment-625579459 **Reporter**: [Ryan Murray](https://issues.apache.org/jira/browse/ARROW-8738) / @rymurr **Note**: *This issue was or

[I] [Java] ArrowMessage failed to parse compressed grpc stream [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #391: URL: https://github.com/apache/arrow-java/issues/391 InvalidProtocolBufferException will be thrown in ArrowMessage.frame if we use gzip compress in Grpc. The reason is stream.available will still be 1 after we read compressed data. There should be

[I] [FlightRPC] Bearer Token refresh design with retry mechanism [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #393: URL: https://github.com/apache/arrow-java/issues/393 - The generated bearer token by implementations of CallHeaderAuthenticator such as BearerTokenAuthenticator are bound to expire. - The expired access token needs to be refreshed. - The FlightClient

[I] Java API get negative messageLength [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #394: URL: https://github.com/apache/arrow-java/issues/394 when I call  ArrowStreamReader.vectorSchemaRoot(),   {{: **(-520103681 < 0)** 2020-11-09 07:09:07,033 ERROR MyModule - Error stack trace java.base/java.nio. Buffer.createCapacityExcept

[I] [Java] Configuration does not provide a mapping for array column [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #392: URL: https://github.com/apache/arrow-java/issues/392 I tried to leverage `org.apache.arrow.adapter.jdbc.JdbcToArrow.sqlToArrow` to query a Hive table but got the following error message on array columns.  `Configuration does not provide a mapping fo

[I] [Java] Make compression levels configurable [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #385: URL: https://github.com/apache/arrow-java/issues/385 Today we use default compression levels in compressors, these should be configurable via constructor. **Reporter**: [Micah Kornfield](https://issues.apache.org/jira/browse/ARROW-12163) / @emkornf

[I] [Java] Add API for getBufferSizeFor() with density to BaseVariableWidthVector [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #389: URL: https://github.com/apache/arrow-java/issues/389 Following the discussion on https://github.com/apache/arrow/pull/9187. Proposed API in BaseVariableWidthVector.java: ```java /** * Get the potential buffer size for a particular

[I] [Java] Rename compression classes [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #386: URL: https://github.com/apache/arrow-java/issues/386 Zstd isn't using the commons codec, so we should rename  CommonsCompressionFactory to something more generic, and the existing LZ4 implementation to something potentially more generic. **Reporter*

[I] [Release][Java] Verify staged maven artifacts [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #377: URL: https://github.com/apache/arrow-java/issues/377 We have two tests right now: 1. Execute `mvn test` from the source tarball's java directory testing the source https://github.com/apache/arrow/blob/master/dev/release/verify-release-candidate.sh#L278

[I] [Java] [Benchmarking] Large stdout when running benchmarks [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #378: URL: https://github.com/apache/arrow-java/issues/378 Since ARROW-15058 running the Java benchmarks results in stdout logs that are extremely large (we're seeing them at ~14gb). This is disrupting running benchmarks on conbench. We've temporarily wo

[I] [Java][Dataset] FileSystemDataset: Support reading dictionary arrays [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #382: URL: https://github.com/apache/arrow-java/issues/382 **Reporter**: [Hongze Zhang](https://issues.apache.org/jira/browse/ARROW-12481) / @zhztheplayer **Note**: *This issue was originally created as [ARROW-12481](https://issues.apache.org/jir

[I] [Java] Standardise Logger naming [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #395: URL: https://github.com/apache/arrow-java/issues/395 As per: https://github.com/apache/arrow/pull/7100#discussion_r421884919 We use LOGGER and logger interchangeably and should choose one **Reporter**: [Ryan Murray](https://issues.apache.org/

[I] [Java/Python] Add unit test for pyarrow.timeX types in Array.from_jvm [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #375: URL: https://github.com/apache/arrow-java/issues/375 Follow-up after https://github.com/apache/arrow/issues/18209 as we are missing the necessary methods to construct these arrays conveniently on the Python side. Once there is a path to construct `

[I] [Java][Gandiva] Expose Dremio build and tests as new optional container/test [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #365: URL: https://github.com/apache/arrow-java/issues/365 Dremio uses Arrow Java and Gandiva extensively and could provide additional test coverage for the project. We should find a way to expose the downstream build of Dremio as an optional build so major cha

[I] [Java] Port Row Set abstraction from Drill to Arrow [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #372: URL: https://github.com/apache/arrow-java/issues/372 Arrow is a great way to exchange data between systems. Somewhere in the process, however, data must be load into, and read out of the Arrow vectors. Arrow's vector code started with similar code i

[I] [Java] VectorSchemaRoot.create(schema, allocator) doesn't create dictionary encoded vector correctly [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #369: URL: https://github.com/apache/arrow-java/issues/369 **Reporter**: [Li Jin](https://issues.apache.org/jira/browse/ARROW-3396) / @icexelloss PRs and other links: - [GitHub Pull Request apache/arrow#2681](https://github.com/apache/arrow/p

[I] [Java] Arrow-to-JDBC [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #363: URL: https://github.com/apache/arrow-java/issues/363 ARROW-1780 reads a query from a JDBC data source and converts the ResultSet to an Arrow VectorSchemaRoot.  However, there is no built-in adapter for writing an Arrow VectorSchemaRoot back to the databas

[I] [Java] Optimize bit operations performance [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #368: URL: https://github.com/apache/arrow-java/issues/368 From @animeshtrivedi's benchmark finding: 2) Materialize values from Validity and Value direct buffers instead of calling getInt() function on the IntVector. This is implemented as a new Unsa

[I] [Gandiva][Java] Safeguard jvm before loading the gandiva library [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #364: URL: https://github.com/apache/arrow-java/issues/364 Today we load the gandiva library always when trying to use the jni bridge, but we have run into issues causing the jvm to crash in untested paths. Proposal is to do load the library in a separate

[I] [Java] Make JniWrapper native method be public [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #360: URL: https://github.com/apache/arrow-java/issues/360 The goal is to integrate Gandiva into apache Drill project. Now drill and arrow has some differences at the column in memory representation. Drill has a 2.0 plan to integrate arrow. Now I want to do som

[I] [Java] FuzzIpcStream: Uncaught exception in java.base/java.nio.Buffer.createCapacityException [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #355: URL: https://github.com/apache/arrow-java/issues/355 Detailed Report: https://oss-fuzz.com/testcase?key=5095153130405888 Project: arrow-java Fuzzing Engine: libFuzzer Fuzz Target: FuzzIpcStream Job Type: libfuzzer_asan_arrow-java Platform

[I] [Flight][Java][C++] Data read through Flight is having endianness issue on s390x [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #352: URL: https://github.com/apache/arrow-java/issues/352 Am facing an endianness issue on s390x(big endian) when converting the data read through flight to pandas data frame. (1) table.validate() fails with error ```Java Traceback (most recent

[I] [Java] FuzzIpcFile: Uncaught exception in java.base/java.nio.HeapByteBuffer. [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #358: URL: https://github.com/apache/arrow-java/issues/358 Detailed Report: https://oss-fuzz.com/testcase?key=5015797066498048 Project: arrow-java Fuzzing Engine: libFuzzer Fuzz Target: FuzzIpcFile Job Type: libfuzzer_asan_arrow-java Platform I

[I] [Java] FuzzIpcFile: Uncaught exception in java.base/java.nio.Buffer.checkIndex [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #357: URL: https://github.com/apache/arrow-java/issues/357 Detailed Report: https://oss-fuzz.com/testcase?key=5518211972464640 Project: arrow-java Fuzzing Engine: libFuzzer Fuzz Target: FuzzIpcFile Job Type: libfuzzer_asan_arrow-java Platform I

[I] [Java] FuzzIpcFile: Uncaught exception in org.apache.arrow.vector.types.pojo.Schema.convertSchema [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #356: URL: https://github.com/apache/arrow-java/issues/356 Detailed Report: https://oss-fuzz.com/testcase?key=5965184743636992 Project: arrow-java Fuzzing Engine: libFuzzer Fuzz Target: FuzzIpcFile Job Type: libfuzzer_asan_arrow-java Platform I

[I] [Docs][Java] Dataset Javadocs are not being published [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #351: URL: https://github.com/apache/arrow-java/issues/351 The Javadocs for 7.0.0 don't list org.apache.arrow.dataset: https://arrow.apache.org/docs/java/reference/index.html **Reporter**: [David Li](https://issues.apache.org/jira/browse/ARROW-15702) / @

[I] [C++/Java] Error when reading inner lists within a struct in empty outer lists from C++/Python in Java [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #343: URL: https://github.com/apache/arrow-java/issues/343 When using C++ (or Python) to construct a null or empty outer array of type **array_1: list>>**, either: ``` - array_1: null - array_1: [] ``` an out of bounds exceptions

[I] [Java] FuzzIpcStream: Uncaught exception in java.base/java.nio.Bits.reserveMemory [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #353: URL: https://github.com/apache/arrow-java/issues/353 Detailed Report: Project: arrow-java Fuzzing Engine: libFuzzer Fuzz Target: FuzzIpcStream Job Type: libfuzzer_asan_arrow-java Platfo

[I] [Java] FuzzIpcStream: Uncaught exception in org.apache.arrow.memory.BaseAllocator.buffer [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #354: URL: https://github.com/apache/arrow-java/issues/354 Detailed Report: https://oss-fuzz.com/testcase?key=6427573486223360 Project: arrow-java Fuzzing Engine: libFuzzer Fuzz Target: FuzzIpcStream Job Type: libfuzzer_asan_arrow-java Platform

[I] ListVector#getBuffers returns buffers in the wrong order for usage of VectorLoader [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #341: URL: https://github.com/apache/arrow-java/issues/341 Calling [getBuffers](https://github.com/apache/arrow/blob/06ca00c2daeeb0d6461e7b6bec51679c19b5b92b/java/vector/src/main/java/org/apache/arrow/vector/complex/ListVector.java#L652-L653) on a ListVector ad

[I] [Java] Spark job fails due to arrow buf limitation [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #342: URL: https://github.com/apache/arrow-java/issues/342 Hello, Groupby + applyInPandas results in the following error. We need some parameter to tune buffer size. ```java Caused by: java.lang.IndexOutOfBoundsException: index: 0,

[I] [Java] Problem to get current reservation of JVM direct memory on macOS Big Sur [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #349: URL: https://github.com/apache/arrow-java/issues/349 Hi Team, The [java dataset](https://github.com/apache/arrow/tree/master/java/dataset) module compiles without problems, but the test related to getting the current reservation of JVM direct

[I] [FlightRPC][FlightSQL][Java] Make FlightSqlClientDemo a general purpose tool [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #348: URL: https://github.com/apache/arrow-java/issues/348 The FlightSqlClientDemo test program has a limited set of options that prevents it from being used against production databases. It does work against the Derby and SQLite examples though. The maj

[I] [FlightRPC][Java] CallbackBackpressureStrategy should not rely on listener.isReady() [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #346: URL: https://github.com/apache/arrow-java/issues/346 According to the spec for , we can get into a state where the ready flag on the list

[I] [IPC][Java] JDK 8 incompatibility with ByteBuffer.clear() [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #344: URL: https://github.com/apache/arrow-java/issues/344 There is an incompatibility with JDK 8 when Arrow is compiled with JDK 9 or higher as described here: This pattern is used in at least MessageSer

[I] [Java] Add VarCharWriter#write(String) [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #345: URL: https://github.com/apache/arrow-java/issues/345 For convenience, since the default method requires allocating an ArrowBuf and such. See [PR#11982 (comment)](https://github.com/apache/arrow/pull/11982#discussion_r814226648) It's generated [in

[I] [Java] java.lang.reflect.InaccessibleObjectException on Java 18 [arrow-java]

2024-11-27 Thread via GitHub
asfimport opened a new issue, #340: URL: https://github.com/apache/arrow-java/issues/340 Getting the following stack trace when running on Java 18. `BaseAllocator` throws this when it calls the `DefaultAllocationManagerFactory`. ```Java private ArrowBuf createEmpty()

Re: [I] [R][CI] Nightly job failures with `Failed to install qpdf` [arrow]

2024-11-27 Thread via GitHub
assignUser closed issue #44841: [R][CI] Nightly job failures with `Failed to install qpdf` URL: https://github.com/apache/arrow/issues/44841 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the speci

[I] [Python] Support lists of sources and destinations on `pyarrow.fs.copy_files()` [arrow]

2024-11-27 Thread via GitHub
Tom-Newton opened a new issue, #44864: URL: https://github.com/apache/arrow/issues/44864 ### Describe the enhancement requested I have a use case where we want to copy a list of files from source to destination, and they are not all in a directory. For example: ``` pyarro