Re: [I] BUG: segmentation faults in the presence of `sparse` optional dependency (within conda builds) [arrow]

2024-07-23 Thread via GitHub


h-vetinari closed issue #15018: BUG: segmentation faults in the presence of 
`sparse` optional dependency (within conda builds)
URL: https://github.com/apache/arrow/issues/15018


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[I] [C++] Test linkage error when googletest 1.15.0 is installed system wide despite bundling [arrow]

2024-07-23 Thread via GitHub


amoeba opened a new issue, #43400:
URL: https://github.com/apache/arrow/issues/43400

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   On macOS 14.5, I happened to upgrade my Homebrew version of googletest from 1.14.0 to 1.15.0 and started seeing test linkage errors.
   
   My cmake command is:
   
   ```
   cmake .. -GNinja -DARROW_ACERO=ON -DARROW_COMPUTE=ON -DARROW_CSV=ON \
     -DARROW_DATASET=ON -DARROW_FILESYSTEM=ON -DARROW_FLIGHT=ON -DARROW_JSON=ON \
     -DARROW_PARQUET=ON -DARROW_AZURE=ON -DARROW_S3=ON -DARROW_GCS=ON \
     -DARROW_SUBSTRAIT=ON -DARROW_BUILD_TESTS=ON -DARROW_MIMALLOC=OFF \
     -DARROW_WITH_BROTLI=ON -DARROW_WITH_BZ2=ON -DARROW_WITH_LZ4=ON \
     -DARROW_WITH_SNAPPY=ON -DARROW_WITH_ZLIB=ON -DARROW_WITH_ZSTD=ON \
     -DARROW_INSTALL_NAME_RPATH=OFF -DARROW_EXTRA_ERROR_CONTEXT=ON \
     -DCMAKE_INSTALL_PREFIX=/Users/bryce/builds/arrow-arm64 -DCMAKE_BUILD_TYPE=Debug \
     -DGTest_SOURCE=BUNDLED -DCMAKE_EXPORT_COMPILE_COMMANDS=ON
   ```
   
   When compiling, I get two linker errors (both similar to this one):
   
   ```
   FAILED: debug/arrow-flight-test
   : && 
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++
 -fno-aligned-new  -Qunused-arguments -fcolor-diagnostics  -Wall -Wextra 
-Wdocumentation -DARROW_WARN_DOCUMENTATION -Wshorten-64-to-32 
-Wno-missing-braces -Wno-unused-parameter -Wno-constant-logical-operand 
-Wno-return-stack-address -Wdate-time -Wno-unknown-warning-option 
-Wno-pass-failed -march=armv8-a  -g -Werror -O0 -ggdb  -arch arm64 -isysroot 
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.5.sdk
 -Wl,-search_paths_first -Wl,-headerpad_max_install_names  
src/arrow/flight/CMakeFiles/arrow-flight-test.dir/flight_test.cc.o -o 
debug/arrow-flight-test  
-Wl,-rpath,/Users/bryce/src/apache/arrow/cpp/build/debug 
-Wl,-rpath,/opt/homebrew/lib  debug/libarrow_flight_testing.1800.0.0.dylib  
debug/libarrow_testing.1800.0.0.dylib  debug/libarrow_gmockd.1.11.0.dylib  
debug/libarrow_gtest_maind.1.11.0.dylib  debug/libarrow_flight.1800.0.0.dylib  
/opt/homebrew/lib/libgrpc++.1.62.2.dylib  /opt/homebrew/lib/libgrpc.39.0.0.dylib  
/opt/homebrew/lib/libupb_json_lib.39.0.0.dylib  
/opt/homebrew/lib/libupb_textformat_lib.39.0.0.dylib  
/opt/homebrew/lib/libupb_message_lib.39.0.0.dylib  
/opt/homebrew/lib/libupb_base_lib.39.0.0.dylib  
/opt/homebrew/lib/libupb_mem_lib.39.0.0.dylib  
/opt/homebrew/lib/libutf8_range_lib.39.0.0.dylib  
/opt/homebrew/lib/libre2.11.0.0.dylib  
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.5.sdk/usr/lib/libz.tbd
  /opt/homebrew/lib/libcares.2.17.2.dylib  -lresolv  
/opt/homebrew/lib/libgpr.39.0.0.dylib  
/opt/homebrew/opt/openssl@3/lib/libssl.dylib  
/opt/homebrew/opt/openssl@3/lib/libcrypto.dylib  
/opt/homebrew/lib/libaddress_sorting.39.0.0.dylib  -lm  -framework 
CoreFoundation  /opt/homebrew/lib/libprotobuf.27.1.0.dylib  
/opt/homebrew/lib/libabsl_log_internal_check_op.2401.0.0.dylib  
/opt/homebrew/lib/libabsl_leak_check.2401.0.0.dylib  
/opt/homebrew/lib/libabsl_die_if_null.2401.0.0.dylib  /opt/homebrew/lib/libabsl_log_internal_conditions.2401.0.0.dylib  
/opt/homebrew/lib/libabsl_log_internal_message.2401.0.0.dylib  
/opt/homebrew/lib/libabsl_log_internal_nullguard.2401.0.0.dylib  
/opt/homebrew/lib/libabsl_examine_stack.2401.0.0.dylib  
/opt/homebrew/lib/libabsl_log_internal_format.2401.0.0.dylib  
/opt/homebrew/lib/libabsl_log_internal_proto.2401.0.0.dylib  
/opt/homebrew/lib/libabsl_log_internal_log_sink_set.2401.0.0.dylib  
/opt/homebrew/lib/libabsl_log_sink.2401.0.0.dylib  
/opt/homebrew/lib/libabsl_log_entry.2401.0.0.dylib  
/opt/homebrew/lib/libabsl_flags_internal.2401.0.0.dylib  
/opt/homebrew/lib/libabsl_flags_marshalling.2401.0.0.dylib  
/opt/homebrew/lib/libabsl_flags_reflection.2401.0.0.dylib  
/opt/homebrew/lib/libabsl_flags_config.2401.0.0.dylib  
/opt/homebrew/lib/libabsl_flags_program_name.2401.0.0.dylib  
/opt/homebrew/lib/libabsl_flags_private_handle_accessor.2401.0.0.dylib  
/opt/homebrew/lib/libabsl_flags_commandlineflag.2401.0.0.dylib  /opt/homebrew/lib/libabsl_flags_commandlineflag_internal.2401.0.0.dylib  
/opt/homebrew/lib/libabsl_log_initialize.2401.0.0.dylib  
/opt/homebrew/lib/libabsl_log_internal_globals.2401.0.0.dylib  
/opt/homebrew/lib/libabsl_log_globals.2401.0.0.dylib  
/opt/homebrew/lib/libabsl_vlog_config_internal.2401.0.0.dylib  
/opt/homebrew/lib/libabsl_log_internal_fnmatch.2401.0.0.dylib  
/opt/homebrew/lib/libabsl_raw_hash_set.2401.0.0.dylib  
/opt/homebrew/lib/libabsl_hash.2401.0.0.dylib  
/opt/homebrew/lib/libabsl_city.2401.0.0.dylib  
/opt/homebrew/lib/libabsl_low_level_hash.2401.0.0.dylib  
/opt/homebrew/lib/libabsl_hashtablez_sampler.2401.0.0.dylib  
/opt/homebrew/lib/libabsl_random_distributions.2401.0.0.dylib  
/opt/homebrew/lib/libabsl_random_seed_sequences.2401.0.0.dylib  
/opt/homebrew/lib/libabsl_random_in
   ```

[I] Add CI jobs for windows aarch64 [arrow]

2024-07-23 Thread via GitHub


jonkeane opened a new issue, #43401:
URL: https://github.com/apache/arrow/issues/43401

   ### Describe the enhancement requested
   
   [R 4.4 has experimental support for Windows on aarch64](https://blog.r-project.org/2024/04/23/r-on-64-bit-arm-windows/index.html). We should set up a CI job to confirm that the arrow package builds there.
   
   ### Component(s)
   
   R





Re: [I] [Java] Remove use of jsr305 [arrow]

2024-07-23 Thread via GitHub


lidavidm closed issue #43396: [Java] Remove use of jsr305
URL: https://github.com/apache/arrow/issues/43396





Re: [I] c/driver/postgresql: Connection.adbc_get_table_schema() does not respect column order [arrow-adbc]

2024-07-23 Thread via GitHub


lidavidm closed issue #2006: c/driver/postgresql: 
Connection.adbc_get_table_schema() does not respect column order
URL: https://github.com/apache/arrow-adbc/issues/2006


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org






[I] The Go flightsql driver doesn't handle scanning LargeString or LargeBinary types [arrow]

2024-07-23 Thread via GitHub


phillipleblanc opened a new issue, #43403:
URL: https://github.com/apache/arrow/issues/43403

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   The Go flightsql database driver currently does not handle scanning string 
or byte values where the Arrow type is LargeString or LargeBinary.
   
   This code currently fails with `type *array.LargeString: not supported`:
   
   ```go
   db, err := sql.Open("flightsql", "flightsql://some_endpoint")
   if err != nil {
       panic(err)
   }
   defer db.Close()

   log.Println("Reading row")
   row := db.QueryRow("SELECT string_value FROM table LIMIT 1")
   var string_value string
   if err := row.Scan(&string_value); err != nil {
       // If `string_value` is a LargeString type, this will error
   }
   ```
   
   ### Component(s)
   
   Go





[I] Progress bar for `read_feather` for `R` and a verbose version [arrow]

2024-07-23 Thread via GitHub


ajinkya-k opened a new issue, #43404:
URL: https://github.com/apache/arrow/issues/43404

   ### Describe the enhancement requested
   
   I would like to request that a progress bar be shown when using the `read_feather` function in `R`, especially for large files, so that the user can see whether the file is actually being read and progress is being made. This is similar to `data.table::fread`, which shows a simple progress bar enabled via the `showProgress` argument. I have a use case in which I am using `read_feather` to read a large file into `R` from a network drive, and during some runs there is no indication that `R` is even making progress on loading the file; in others it loads in ~300 seconds. `fread` also has a verbose option that dumps much more output and would also be well worth implementing, but a progress bar at minimum would be great!
   
   ### Component(s)
   
   R





[I] Could not read encrypted metadata via pq.read_table [arrow]

2024-07-23 Thread via GitHub


heyuqi1970 opened a new issue, #43406:
URL: https://github.com/apache/arrow/issues/43406

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   os: macos 11.7.10 (20G1427)
   python: 3.9.7
   pyarrow: 16.0.0
   
   When I use `pq.read_table` with the `decryption_properties` parameter, I get the following error. However, I can use `pq.ParquetFile` with `decryption_properties` to read the same encrypted file.
   
   ```
   Traceback (most recent call last):
     File "tt.py", line 98, in 
       table = pq.read_table("yellow_cryp.parquet", memory_map=True, decryption_properties=decryption_properties)
     File "/venv/lib/python3.9/site-packages/pyarrow/parquet/core.py", line 1762, in read_table
       dataset = ParquetDataset(
     File "/venv/lib/python3.9/site-packages/pyarrow/parquet/core.py", line 1329, in __init__
       [fragment], schema=schema or fragment.physical_schema,
     File "pyarrow/_dataset.pyx", line 1431, in pyarrow._dataset.Fragment.physical_schema.__get__
     File "pyarrow/error.pxi", line 154, in pyarrow.lib.pyarrow_internal_check_status
     File "pyarrow/error.pxi", line 91, in pyarrow.lib.check_status
   OSError: Could not open Parquet input source 'yellow_cryp.parquet': Could not read encrypted metadata, no decryption found in reader's properties
   ```
   
   
   ### Component(s)
   
   Parquet, Python





[I] [C++] IO: InputStream::Advance will always read from Stream [arrow]

2024-07-24 Thread via GitHub


mapleFU opened a new issue, #43408:
URL: https://github.com/apache/arrow/issues/43408

   ### Describe the enhancement requested
   
   ```c++
   class ARROW_EXPORT InputStream : virtual public FileInterface, virtual public Readable {
    public:
     /// \brief Advance or skip stream indicated number of bytes
     /// \param[in] nbytes the number to move forward
     /// \return Status
     Status Advance(int64_t nbytes);
   ```
   
   ```c++
   Status InputStream::Advance(int64_t nbytes) { return Read(nbytes).status(); }
   ```
   
   `Advance` always calls `Read`, since it is not a virtual function that a seekable subclass could override.
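
   The design point can be sketched in Python for brevity (illustrative only, not Arrow code): a non-virtual `advance` always pays for a read, whereas an overridable one lets a seekable stream skip cheaply.

```python
import io


class InputStream:
    """Illustrative base stream whose advance() reads and discards bytes,
    mirroring the non-virtual Advance described in the issue."""

    def __init__(self, raw):
        self.raw = raw

    def read(self, nbytes):
        return self.raw.read(nbytes)

    def advance(self, nbytes):
        # Always materializes nbytes, even if the source could just seek.
        self.read(nbytes)


class SeekableInputStream(InputStream):
    def advance(self, nbytes):
        # An overridable advance lets a seekable source skip without copying.
        self.raw.seek(nbytes, io.SEEK_CUR)


# Both streams end up at the same position, but only the base class copies bytes.
s1 = InputStream(io.BytesIO(b"abcdef"))
s1.advance(3)
assert s1.read(2) == b"de"

s2 = SeekableInputStream(io.BytesIO(b"abcdef"))
s2.advance(3)
assert s2.read(2) == b"de"
```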
   
   ### Component(s)
   
   C++





Re: [I] [Java][Benchmarking] Java benchmarks are broken when running with Java 17+ [arrow]

2024-07-24 Thread via GitHub


danepitkin closed issue #43394: [Java][Benchmarking] Java benchmarks are broken 
when running with Java 17+
URL: https://github.com/apache/arrow/issues/43394





Re: [I] [R] r-arrow cannot be compiled with clang [arrow]

2024-07-24 Thread via GitHub


assignUser closed issue #43398: [R] r-arrow cannot be compiled with clang
URL: https://github.com/apache/arrow/issues/43398





[I] [Python]: Support PyCapsule Interface Objects as input in more places [arrow]

2024-07-24 Thread via GitHub


kylebarron opened a new issue, #43410:
URL: https://github.com/apache/arrow/issues/43410

   ### Describe the enhancement requested
   
   Now that the PyCapsule Interface is starting to gain more traction 
(https://github.com/apache/arrow/issues/39195), I think it would be great if 
some of pyarrow's functional APIs accepted any PyCapsule Interface object, and 
not _just_ pyarrow objects. 
   
   Do people have opinions on what functions should or should not check for 
these objects? I'd argue that file format writers should check for them, 
because it's only a couple lines of code, and the input stream will be fully 
iterated over regardless. E.g. looking at the Parquet writer: the high level 
API doesn't currently accept a `RecordBatchReader` either, so support for both 
can come at the same time.
   
   ```py
   from dataclasses import dataclass
   from typing import Any

   import pyarrow as pa
   import pyarrow.parquet as pq


   @dataclass
   class ArrowCStream:
       obj: Any

       def __arrow_c_stream__(self, requested_schema=None):
           return self.obj.__arrow_c_stream__(requested_schema=requested_schema)


   table = pa.table({"a": [1, 2, 3, 4]})
   pq.write_table(table, "test.parquet")  # works

   reader = pa.RecordBatchReader.from_stream(table)
   pq.write_table(reader, "test.parquet")  # fails
   pq.write_table(ArrowCStream(table), "test.parquet")  # fails
   ```
   
   I'd argue that the writer should be generalized to accept any object with an 
`__arrow_c_stream__` dunder, and to ensure the stream is not materialized as a 
table.
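
   A minimal duck-typing sketch of the proposed generalization (a hypothetical helper, not pyarrow's actual API): instead of `isinstance` checks, a functional API can accept anything that exposes the `__arrow_c_stream__` dunder.

```python
def resolve_arrow_stream(obj, requested_schema=None):
    """Return the Arrow C stream capsule from any object implementing the
    PyCapsule stream protocol; raise TypeError otherwise."""
    stream_fn = getattr(obj, "__arrow_c_stream__", None)
    if stream_fn is None:
        raise TypeError(
            f"{type(obj).__name__} does not export an Arrow C stream "
            "(__arrow_c_stream__)"
        )
    return stream_fn(requested_schema=requested_schema)


# Any wrapper that forwards the dunder is accepted, pyarrow-native or not:
class FakeStream:
    def __arrow_c_stream__(self, requested_schema=None):
        return "capsule"  # a real implementation returns a PyCapsule


assert resolve_arrow_stream(FakeStream()) == "capsule"
```

   A writer built on such a helper could then consume the stream batch by batch without materializing a table.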
   
   ### Component(s)
   
   Python





[I] [Java][Benchmarking] Java benchmarks are still broken when running with Java 17+ [arrow]

2024-07-24 Thread via GitHub


danepitkin opened a new issue, #43412:
URL: https://github.com/apache/arrow/issues/43412

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   We didn't test this thoroughly enough: it turns out that the merged PR https://github.com/apache/arrow/pull/43395 does not fully fix the Java benchmarks.
   
   ### Component(s)
   
   Java





[I] [C++][Compute] Invalid memory access when resizing row table [arrow]

2024-07-24 Thread via GitHub


zanmato1984 opened a new issue, #43414:
URL: https://github.com/apache/arrow/issues/43414

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   When resizing the underlying buffer for the var-length content of the row 
table, we do:
   
https://github.com/apache/arrow/blob/674e221f41c602c8f71c7a2c8e53e7c7c11b1ede/cpp/src/arrow/compute/row/row_internal.cc#L296-L299
   
   It treats the second buffer (row content if the row table is fixed-length, or offsets otherwise) as an offsets buffer regardless of the fixed-length-ness. The fixed-length-ness is checked afterwards, in which case resizing the var-length buffer is unnecessary and the function returns early.
   
   But unconditionally treating the second buffer as offsets is problematic because, not least, it can be smaller than an offsets buffer requires. Consider a row table containing only one `uint8` column with `1`-byte alignment: there will be `1` byte per row, less than the `4` bytes per row that an offsets buffer requires, so offset accesses run past the buffer boundary.
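
   The sizing mismatch can be illustrated with simple arithmetic (a sketch, not Arrow code; the row count is arbitrary, the widths come from the example above):

```python
num_rows = 8                 # arbitrary row count for illustration
fixed_width_bytes = 1        # one uint8 column, 1-byte alignment
offset_width_bytes = 4       # 32-bit offsets

# What the fixed-length row table actually allocates for its second buffer:
allocated = num_rows * fixed_width_bytes

# What treating that buffer as an offsets buffer would touch:
required_as_offsets = num_rows * offset_width_bytes

# The unconditional offsets treatment reads past the allocation:
assert allocated < required_as_offsets
```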
   
   I have a repro case locally and will send it out as a unit test with my fix PR.
   
   ### Component(s)
   
   C++





[I] [CI][C++] Vcpkg failures building some wheels [arrow]

2024-07-25 Thread via GitHub


raulcd opened a new issue, #43416:
URL: https://github.com/apache/arrow/issues/43416

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   Some wheels have been failing for some days due to vcpkg failures:
   
   
[wheel-macos-big-sur-cp310-arm64](https://github.com/ursacomputing/crossbow/actions/runs/10073275167/job/27846965111)
   
[wheel-macos-big-sur-cp311-arm64](https://github.com/ursacomputing/crossbow/actions/runs/10073273602/job/27846957637)
   
[wheel-macos-big-sur-cp312-arm64](https://github.com/ursacomputing/crossbow/actions/runs/10073274108/job/27846960846)
   
[wheel-macos-big-sur-cp38-arm64](https://github.com/ursacomputing/crossbow/actions/runs/10073275488/job/27846966986)
   
[wheel-macos-big-sur-cp39-arm64](https://github.com/ursacomputing/crossbow/actions/runs/10073275367/job/27846966545)
   
[wheel-macos-catalina-cp310-amd64](https://github.com/ursacomputing/crossbow/actions/runs/10073273694/job/27846958347)
   
[wheel-macos-catalina-cp311-amd64](https://github.com/ursacomputing/crossbow/actions/runs/10073273570/job/27846957447)
   
[wheel-macos-catalina-cp312-amd64](https://github.com/ursacomputing/crossbow/actions/runs/10073274381/job/27846961100)
   
[wheel-macos-catalina-cp38-amd64](https://github.com/ursacomputing/crossbow/actions/runs/10073273679/job/27846958339)
   
[wheel-macos-catalina-cp39-amd64](https://github.com/ursacomputing/crossbow/actions/runs/10073275056/job/27846964797)
   
[wheel-manylinux-2014-cp312-arm64](https://github.com/ursacomputing/crossbow/actions/runs/10073274856/job/27846963508)
   
[wheel-manylinux-2014-cp38-arm64](https://github.com/ursacomputing/crossbow/actions/runs/10073273811/job/27846959167)
   
   An example of a failure:
   
   ```

/Users/runner/work/crossbow/crossbow/arrow/ci/vcpkg/arm64-osx-static-release.cmake:
 info: loaded overlay triplet from here
   -- Downloading 
https://github.com/abseil/abseil-cpp/archive/20240116.2.tar.gz -> 
abseil-abseil-cpp-20240116.2.tar.gz...
   -- Extracting source 
/Users/runner/work/crossbow/crossbow/vcpkg/downloads/abseil-abseil-cpp-20240116.2.tar.gz
   -- Using source at 
/Users/runner/work/crossbow/crossbow/vcpkg/buildtrees/abseil/src/20240116.2-eaa4a5f5c0.clean
   -- Found external ninja('1.12.1').
   -- Configuring arm64-osx-static-release-rel
   CMake Error at scripts/cmake/vcpkg_execute_required_process.cmake:112 
(message):
   Command failed: /opt/homebrew/Cellar/cmake/3.30.0/bin/cmake 
/Users/runner/work/crossbow/crossbow/vcpkg/buildtrees/abseil/src/20240116.2-eaa4a5f5c0.clean
 -G Ninja -DCMAKE_BUILD_TYPE=Release 
-DCMAKE_INSTALL_PREFIX=/Users/runner/work/crossbow/crossbow/vcpkg/packages/abseil_arm64-osx-static-release
 -DFETCHCONTENT_FULLY_DISCONNECTED=ON -DABSL_PROPAGATE_CXX_STD=ON 
-DCMAKE_MAKE_PROGRAM=/opt/homebrew/bin/ninja -DCMAKE_SYSTEM_NAME=Darwin 
-DBUILD_SHARED_LIBS=OFF 
-DVCPKG_CHAINLOAD_TOOLCHAIN_FILE=/Users/runner/work/crossbow/crossbow/vcpkg/scripts/toolchains/osx.cmake
 -DVCPKG_TARGET_TRIPLET=arm64-osx-static-release -DVCPKG_SET_CHARSET_FLAG=ON 
-DVCPKG_PLATFORM_TOOLSET=external -DCMAKE_EXPORT_NO_PACKAGE_REGISTRY=ON 
-DCMAKE_FIND_PACKAGE_NO_PACKAGE_REGISTRY=ON 
-DCMAKE_FIND_PACKAGE_NO_SYSTEM_PACKAGE_REGISTRY=ON 
-DCMAKE_INSTALL_SYSTEM_RUNTIME_LIBS_SKIP=TRUE -DCMAKE_VERBOSE_MAKEFILE=ON 
-DVCPKG_APPLOCAL_DEPS=OFF 
-DCMAKE_TOOLCHAIN_FILE=/Users/runner/work/crossbow/crossbow/vcpkg/scripts/buildsystems/vcpkg.cmake -DCMAKE_ERROR_ON_ABSOLUTE_INSTALL_DESTINATION=ON 
-DVCPKG_CXX_FLAGS= -DVCPKG_CXX_FLAGS_RELEASE= -DVCPKG_CXX_FLAGS_DEBUG= 
-DVCPKG_C_FLAGS= -DVCPKG_C_FLAGS_RELEASE= -DVCPKG_C_FLAGS_DEBUG= 
-DVCPKG_CRT_LINKAGE=dynamic -DVCPKG_LINKER_FLAGS= -DVCPKG_LINKER_FLAGS_RELEASE= 
-DVCPKG_LINKER_FLAGS_DEBUG= -DVCPKG_TARGET_ARCHITECTURE=arm64 
-DCMAKE_INSTALL_LIBDIR:STRING=lib -DCMAKE_INSTALL_BINDIR:STRING=bin 
-D_VCPKG_ROOT_DIR=/Users/runner/work/crossbow/crossbow/vcpkg 
-D_VCPKG_INSTALLED_DIR=/Users/runner/work/crossbow/crossbow/vcpkg/installed 
-DVCPKG_MANIFEST_INSTALL=OFF -DCMAKE_OSX_DEPLOYMENT_TARGET=11.0 
-DCMAKE_OSX_ARCHITECTURES=arm64
   Working Directory: 
/Users/runner/work/crossbow/crossbow/vcpkg/buildtrees/abseil/arm64-osx-static-release-rel
   Error code: 1
   See logs for more information:
 
/Users/runner/work/crossbow/crossbow/vcpkg/buildtrees/abseil/config-arm64-osx-static-release-rel-CMakeCache.txt.log
 
/Users/runner/work/crossbow/crossbow/vcpkg/buildtrees/abseil/config-arm64-osx-static-release-rel-out.log
 
/Users/runner/work/crossbow/crossbow/vcpkg/buildtrees/abseil/config-arm64-osx-static-release-rel-err.log
   
   Call Stack (most recent call first):
 installed/arm64-osx/share/vcpkg-cmake/vcpkg_cmake_configure.cmake:280 
(vcpkg_execute_required_process)
 ports/abseil/portfile.cmake:26 (vcpkg_cmake_configure)
 scripts/ports.cmake:175 (include)
   
   
   error: building abseil:arm64-osx-static-release failed with: BUILD_FAILED
Elapsed time to handle abseil:arm64-osx-
   ```

Re: [I] [C++] Use signed offset type in row table related structures [arrow]

2024-07-25 Thread via GitHub


zanmato1984 closed issue #40020: [C++] Use signed offset type in row table 
related structures
URL: https://github.com/apache/arrow/issues/40020





Re: [I] [Java][Benchmarking] Java benchmarks are still broken when running with Java 17+ [arrow]

2024-07-25 Thread via GitHub


danepitkin closed issue #43412: [Java][Benchmarking] Java benchmarks are still 
broken when running with Java 17+
URL: https://github.com/apache/arrow/issues/43412





Re: [I] [Java] Add support for JDK version cross testing [arrow]

2024-07-25 Thread via GitHub


danepitkin closed issue #43380: [Java] Add support for JDK version cross testing
URL: https://github.com/apache/arrow/issues/43380





Re: [I] [C++] Tests for the 'take' function don't exercise kernels handling chunked arrays very well [arrow]

2024-07-25 Thread via GitHub


felipecrv closed issue #43291: [C++] Tests for the 'take' function don't 
exercise kernels handling chunked arrays very well
URL: https://github.com/apache/arrow/issues/43291





Re: [I] [Python] Test FlightStreamReader iterator [arrow]

2024-07-25 Thread via GitHub


danepitkin closed issue #42085: [Python] Test FlightStreamReader iterator
URL: https://github.com/apache/arrow/issues/42085





Re: [I] [Swift] Add StructArray to ArrowReader [arrow]

2024-07-25 Thread via GitHub


kou closed issue #43169: [Swift] Add StructArray to ArrowReader
URL: https://github.com/apache/arrow/issues/43169





[I] go/adbc/driver/flightsql: long delay between createPreparedStatement and getFlightInfoPreparedStatement [arrow-adbc]

2024-07-25 Thread via GitHub


aiguofer opened a new issue, #2040:
URL: https://github.com/apache/arrow-adbc/issues/2040

   ### What happened?
   
   We have a Python API that uses the ADBC driver to execute queries against 
our Java Arrow Flight SQL server. The Python server uses FastAPI + Strawberry, 
and when we receive a request to execute a query, we spin up a background 
thread to handle the execution against the AFS server. Multiple threads on the 
same pod could be executing queries against the AFS server at any given moment. 
   
   We recently noticed some issues with hanging queries, and when looking at 
our DataDog traces, we notice that there is almost a 30 minute difference 
between the `createPreparedStatement` request and the 
`getFlightInfoPreparedStatement` request.
   
   My initial guess is that this could be related to having multiple requests 
at the same time through the ADBC driver, but I don't have enough context about 
how the bindings between Go and Python work.
   
   Is there anything that jumps out at you? Is there anything we could do to help debug this? Here are pictures of the traces:
   
   ![Screenshot 2024-07-25 at 2 53 31 PM](https://github.com/user-attachments/assets/690bb99e-8a6a-484d-8a68-5b9fbfed1012)
   ![Screenshot 2024-07-25 at 2 53 09 PM](https://github.com/user-attachments/assets/c8cbb194-8f3d-4c5e-a0f0-84d2d4016f49)
   
   
   ### Stack Trace
   
   _No response_
   
   ### How can we reproduce the bug?
   
   _No response_
   
   ### Environment/Setup
   
   Python 3.12
   ADBC FlightSQL driver 1.0.0
   ADBC driver manager 1.0.0





[I] Use Amazon KMS for encryption having error: "OSError: Incorrect key to columns mapping in column keys property:" [arrow]

2024-07-25 Thread via GitHub


ChanTheDataExplorer opened a new issue, #43426:
URL: https://github.com/apache/arrow/issues/43426

   ### Describe the usage question you have. Please include as many useful 
details as  possible.
   
   
   Hi, I'm trying to use columnar encryption but am having trouble making it work. Below is the sample code:
   
   ```
   KMS_KEY_ARN = 'arn:aws:kms:ap-southeast-1:643458469770:key/4c9195a3-bb54-40d3-b199-9f9bf6ea9dcf'
   FOOTER_KEY_NAME = "footer_key"
   COL_KEY_NAME = "column_key"

   table = pa.Table.from_pydict({
       'a': ['hello'],
       'b': ['goodbye'],
       'c': ['womp']
   })

   encryption_config = pe.EncryptionConfiguration(
       footer_key=KMS_KEY_ARN,
       column_keys={
           KMS_KEY_ARN: ["a", "b", "c"],
       },
       encryption_algorithm="AES_GCM_V1",
       cache_lifetime=timedelta(minutes=5.0),
       data_key_length_bits=256
   )

   kms_connection_config = pe.KmsConnectionConfig(
       custom_kms_conf={
           FOOTER_KEY_NAME: FOOTER_KEY_ARN,
           COL_KEY_NAME: FOOTER_KEY_ARN,
       }
   )

   crypto_factory = pe.CryptoFactory(kms_factory)

   file_encryption_properties = crypto_factory.file_encryption_properties(
       kms_connection_config, encryption_config)

   with pq.ParquetWriter(path, table.schema,
                         encryption_properties=file_encryption_properties) as writer:
       writer.write_table(table)
   ```
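
   One plausible cause (an assumption, since the full error context isn't shown): `footer_key` and the keys of `column_keys` are master-key *identifiers* that the KMS client must be able to resolve, so with `custom_kms_conf` they should match the names configured there (`footer_key`/`column_key` above) rather than the raw ARN. A pure-Python sketch of that consistency requirement:

```python
# Hypothetical illustration: every key id referenced by the encryption config
# must be resolvable by the KMS configuration. The placeholder ARN stands in
# for the real one.
KMS_KEY_ARN = "arn:aws:kms:ap-southeast-1:EXAMPLE:key/EXAMPLE"
FOOTER_KEY_NAME = "footer_key"
COL_KEY_NAME = "column_key"

encryption_config_kwargs = {
    "footer_key": FOOTER_KEY_NAME,                   # key name, not the ARN
    "column_keys": {COL_KEY_NAME: ["a", "b", "c"]},  # key name -> columns
}
custom_kms_conf = {
    FOOTER_KEY_NAME: KMS_KEY_ARN,  # key name -> key material/identifier
    COL_KEY_NAME: KMS_KEY_ARN,
}

# Consistency check: all key ids used by the config exist in the KMS conf.
used = {encryption_config_kwargs["footer_key"], *encryption_config_kwargs["column_keys"]}
assert used <= set(custom_kms_conf)
```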
   
   ### Component(s)
   
   Python





Re: [I] [CI] Add wheels and java-jars to vcpkg group tasks [arrow]

2024-07-25 Thread via GitHub


assignUser closed issue #43418: [CI] Add wheels and java-jars to vcpkg group 
tasks
URL: https://github.com/apache/arrow/issues/43418





[I] Snowflake adbc_ingest reverting back to CSV uploading [arrow-adbc]

2024-07-25 Thread via GitHub


davlee1972 opened a new issue, #2041:
URL: https://github.com/apache/arrow-adbc/issues/2041

   ### What happened?
   
   I'll have more time to debug this next week, but adbc_ingest() is creating CSV files with Snowflake.
   
   I'm using the latest 1.1.0 ADBC drivers.
   
   Even when I convert my CSV files to Parquet and try calling adbc_ingest(), it is still sending data to Snowflake in CSV format.
   
   I'll have to downgrade my drivers for further testing.
   
   
![image](https://github.com/user-attachments/assets/58ef95d9-a5be-4b82-9ead-67e0775a2ce3)
   
   
   ### Stack Trace
   
   _No response_
   
   ### How can we reproduce the bug?
   
   _No response_
   
   ### Environment/Setup
   
   _No response_





[I] [C++][Parquet] Deprecate ColumnChunk::file_offset field [arrow]

2024-07-25 Thread via GitHub


mapleFU opened a new issue, #43427:
URL: https://github.com/apache/arrow/issues/43427

   ### Describe the enhancement requested
   
   https://github.com/apache/parquet-format/pull/440
   
   ### Component(s)
   
   C++, Parquet





Re: [I] [JS] Rows will intermittently return BigInt data, instead of the expected strings [arrow]

2024-07-26 Thread via GitHub


Vectorrent closed issue #43275: [JS] Rows will intermittently return BigInt 
data, instead of the expected strings
URL: https://github.com/apache/arrow/issues/43275





[I] [C++][FlightRPC] Flight UCX build is failing [arrow]

2024-07-26 Thread via GitHub


felipecrv opened a new issue, #43429:
URL: https://github.com/apache/arrow/issues/43429

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   ```
   ~/code/arrow/cpp/src/arrow/flight/transport/ucx/ucx_server.cc:261:33: error: 
implicit conversion loses integer precision: 'unsigned long' to 'socklen_t' 
(aka 'unsigned int') [-Werror,-Wshorten-64-to-32]
 params.sockaddr.addrlen = addrlen;
 ~ ^~~
   ~/code/arrow/cpp/src/arrow/flight/transport/ucx/ucx_server.cc:379:40: error: 
no member named 'DoSerializeToString' in 'arrow::flight::FlightInfo'; did you 
mean 'SerializeToString'?
   SERVER_RETURN_NOT_OK(driver, info->DoSerializeToString(&response));
  ^~~
  SerializeToString
   ~/code/arrow/cpp/src/arrow/flight/transport/ucx/ucx_server.cc:55:26: note: 
expanded from macro 'SERVER_RETURN_NOT_OK'
   ::arrow::Status s = (status);
\
^
   ~/code/arrow/cpp/src/arrow/flight/types.h:540:17: note: 'SerializeToString' 
declared here
 arrow::Status SerializeToString(std::string* out) const;
   ^
   ~/code/arrow/cpp/src/arrow/flight/transport/ucx/ucx_server.cc:400:40: error: 
no member named 'DoSerializeToString' in 'arrow::flight::PollInfo'; did you 
mean 'SerializeToString'?
   SERVER_RETURN_NOT_OK(driver, info->DoSerializeToString(&response));
  ^~~
  SerializeToString
   ~/code/arrow/cpp/src/arrow/flight/transport/ucx/ucx_server.cc:55:26: note: 
expanded from macro 'SERVER_RETURN_NOT_OK'
   ::arrow::Status s = (status);
\
^
   ~/code/arrow/cpp/src/arrow/flight/types.h:619:17: note: 'SerializeToString' 
declared here
 arrow::Status SerializeToString(std::string* out) const;
   ^
   3 errors generated.
   ```
   
   ### Component(s)
   
   C++, FlightRPC





Re: [I] [Java][CI] Java-Jars CI is Failing with a linking error on macOS [arrow]

2024-07-26 Thread via GitHub


danepitkin closed issue #43377: [Java][CI] Java-Jars CI is Failing with a 
linking error on macOS
URL: https://github.com/apache/arrow/issues/43377





[I] [Java][Packaging] java-jars failing on maven module [arrow]

2024-07-26 Thread via GitHub


danepitkin opened a new issue, #43432:
URL: https://github.com/apache/arrow/issues/43432

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   java-jars job is failing: 
https://github.com/ursacomputing/crossbow/actions/runs/10103282092/job/27945082133
   
   ```
   [FATAL] Non-readable POM 
/Users/runner/work/crossbow/crossbow/arrow/java/maven: 
/Users/runner/work/crossbow/crossbow/arrow/java/maven (No such file or 
directory) @ 
   ```
   
   ### Component(s)
   
   Java





Re: [I] [JS] Build warning due to missing arrow2csv.cjs file in bin [arrow]

2024-07-26 Thread via GitHub


trxcllnt closed issue #42229: [JS] Build warning due to missing arrow2csv.cjs 
file in bin
URL: https://github.com/apache/arrow/issues/42229








Re: [I] [JS] Build fails in node v22 due to outdated `esm` package [arrow]

2024-07-26 Thread via GitHub


trxcllnt closed issue #43340: [JS] Build fails in node v22 due to outdated 
`esm` package
URL: https://github.com/apache/arrow/issues/43340








Re: [I] [JS] `arrow2csv` bin entry has wrong extension [arrow]

2024-07-26 Thread via GitHub


trxcllnt closed issue #43341: [JS] `arrow2csv` bin entry has wrong extension
URL: https://github.com/apache/arrow/issues/43341








Re: [I] [JS] When I install any apache-arrow version beyond 14 it throws error [arrow]

2024-07-26 Thread via GitHub


trxcllnt closed issue #41649: [JS] When I install any apache-arrow version 
beyond 14 it throws error
URL: https://github.com/apache/arrow/issues/41649





[I] [CI] Building C++ libraries on Ubuntu aarch64 job takes 3+ hours to complete in java-jars [arrow]

2024-07-26 Thread via GitHub


danepitkin opened a new issue, #43434:
URL: https://github.com/apache/arrow/issues/43434

   ### Describe the enhancement requested
   
   Can we speed this up? Other OS/architectures take 10-30min.
   
   See java-jars job in crossbow: 
https://github.com/ursacomputing/crossbow/actions/runs/9817940199/job/27109855273
   
   ### Component(s)
   
   Continuous Integration





[I] Failure to read valid file [arrow-julia]

2024-07-26 Thread via GitHub


adienes opened a new issue, #511:
URL: https://github.com/apache/arrow-julia/issues/511

   Both `pyarrow` and `polars` can read this table, but Arrow.jl fails to read it:
   
[mwe.arrow.zip](https://github.com/user-attachments/files/16395769/mwe.arrow.zip)
   
   
   ```
   julia> Arrow.Table("mwe.arrow")
   
   1-element ExceptionStack:
   TaskFailedException
   
   nested task error: MethodError: no method matching init(::Nothing, 
::Vector{UInt8}, ::Int64)
   
   Closest candidates are:
 init(::Type{T}, ::Vector{UInt8}, ::Integer) where 
T<:Union{Arrow.FlatBuffers.Struct, Arrow.FlatBuffers.Table}
  @ Arrow ~/.julia/packages/Arrow/5pHqZ/src/FlatBuffers/table.jl:43
   
   Stacktrace:
[1] getproperty(x::Arrow.Flatbuf.Field, field::Symbol)
  @ Arrow.Flatbuf 
~/.julia/packages/Arrow/5pHqZ/src/metadata/Schema.jl:542
[2] build(field::Arrow.Flatbuf.Field, batch::Arrow.Batch, 
rb::Arrow.Flatbuf.RecordBatch, de::Dict{Int64, Arrow.DictEncoding}, 
nodeidx::Int64, bufferidx::Int64, convert::Bool)
  @ Arrow ~/.julia/packages/Arrow/5pHqZ/src/table.jl:668
[3] iterate(x::Arrow.VectorIterator, ::Tuple{Int64, Int64, Int64})
  @ Arrow ~/.julia/packages/Arrow/5pHqZ/src/table.jl:629
[4] copyto!(dest::Vector{Any}, src::Arrow.VectorIterator)
  @ Base ./abstractarray.jl:948
[5] _collect
  @ ./array.jl:765 [inlined]
[6] collect
  @ ./array.jl:759 [inlined]
[7] macro expansion
  @ ~/.julia/packages/Arrow/5pHqZ/src/table.jl:526 [inlined]
[8] (::Arrow.var"#102#108"{Bool, Channel{Any}, 
ConcurrentUtilities.OrderedSynchronizer, Dict{Int64, Arrow.DictEncoding}, 
Arrow.Batch, Int64})()
  @ Arrow 
~/.julia/packages/ConcurrentUtilities/QOkoO/src/ConcurrentUtilities.jl:48
   Stacktrace:
[1] sync_end(c::Channel{Any})
  @ Base ./task.jl:448
[2] macro expansion
  @ ./task.jl:480 [inlined]
[3] Arrow.Table(blobs::Vector{Arrow.ArrowBlob}; convert::Bool)
  @ Arrow ~/.julia/packages/Arrow/5pHqZ/src/table.jl:441
[4] Table
  @ ~/.julia/packages/Arrow/5pHqZ/src/table.jl:415 [inlined]
[5] Table
  @ ~/.julia/packages/Arrow/5pHqZ/src/table.jl:407 [inlined]
[6] Arrow.Table(input::String)
  @ Arrow ~/.julia/packages/Arrow/5pHqZ/src/table.jl:407
[7] top-level scope
  @ REPL[4]:1
   ```





[I] [Python] `pa.Table.from_pylist` support list of tuples? [arrow]

2024-07-26 Thread via GitHub


alanhdu opened a new issue, #43435:
URL: https://github.com/apache/arrow/issues/43435

   ### Describe the enhancement requested
   
   I have a function that returns an iterator of tuples and would like to turn that into a pyarrow table. I have the column names separately and would like to use PyArrow's type inference for the actual types.
   
   I can sort of get what I want with something like:
   
   ```python
   import pandas as pd
   
   pa.Table.from_pandas(
   pd.DataFrame.from_records(tuples, columns=column_names)
   )
   ```
   
   But this doesn't quite work, since Pandas will cast nullable integers to 
floats.
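   One dependency-free workaround, sketched under the assumption that per-column type inference via `pa.table`/`Table.from_pydict` is acceptable (the names and data below are made up): transpose the tuples into named columns first, so `None` stays a null instead of forcing a float cast the way pandas does.

```python
rows = [(1, "x"), (2, "y"), (None, "z")]  # hypothetical iterator of tuples
column_names = ["id", "label"]

# Transpose row tuples into columns keyed by name; PyArrow can then
# infer one type per column from the Python lists.
columns = {name: list(col) for name, col in zip(column_names, zip(*rows))}
assert columns == {"id": [1, 2, None], "label": ["x", "y", "z"]}

# With pyarrow installed, this would build the table:
# import pyarrow as pa
# table = pa.table(columns)
```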
   
   ### Component(s)
   
   Python





[I] [Java] Do not publish protobuf based on filename [arrow]

2024-07-26 Thread via GitHub


danepitkin opened a new issue, #43437:
URL: https://github.com/apache/arrow/issues/43437

   ### Describe the enhancement requested
   
   After migrating to Java 11, we see this warning:
   
   ```
   [WARNING] * Required filename-based automodules detected: 
[protobuf-java-3.25.1.jar, protobuf-java-util-3.25.1.jar]. Please don't publish 
this project to a public artifact repository! *
   ```
   
   We should upgrade to a version of protobuf that includes the 
`Automatic-Module-Name`, which prevents downstream projects from experiencing 
dependency version conflicts.
   
   ### Component(s)
   
   Java





Re: [I] [C++][FlightRPC] Flight UCX build is failing [arrow]

2024-07-26 Thread via GitHub


felipecrv closed issue #43429: [C++][FlightRPC] Flight UCX build is failing
URL: https://github.com/apache/arrow/issues/43429





Re: [I] [C++] vendored abseil fails to build with gcc-13 [arrow]

2024-07-26 Thread via GitHub


assignUser closed issue #43228: [C++] vendored abseil fails to build with gcc-13
URL: https://github.com/apache/arrow/issues/43228





[I] Unable to filter a factor column in a Dataset using `%in%` [arrow]

2024-07-26 Thread via GitHub


spencerpease opened a new issue, #43440:
URL: https://github.com/apache/arrow/issues/43440

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   Hello, 
   
   Is it possible to filter a factor using `%in%` in an Arrow Dataset?
   
   I naively expected that if I save an arrow IPC file with a factor column, I would then be able to filter that column using `%in%` when the file is loaded as an Arrow Dataset. Instead, I get the error `Type error: Array type doesn't match type of values set: string vs dictionary`. Arrow does seem aware of factors, though, since I can filter that same column using `==` or `!=`, and collecting the dataset without filtering returns a factor. I was able to recreate this error on both Windows and Linux; please see the attached reprex for details.
   
Thank you in advance for your help!
   
   ``` r
   # Create a simple data.frame and save as an arrow IPC file
   temp_file <- tempfile()
   d1 <- data.frame(x = factor(c("a", "b", "c")))
   arrow::write_feather(d1, temp_file)
   
   # Filtering using == (or !=) works
   d2 <- arrow::open_dataset(temp_file, format = "arrow") |>
 dplyr::filter(x == "a") |>
 dplyr::collect()
   
   # Filtering using %in% does not work (for single or multiple values)
   d3 <- arrow::open_dataset(temp_file, format = "arrow") |>
 dplyr::filter(x %in% "a") |>
 dplyr::collect()
   #> Error in `compute.arrow_dplyr_query()`:
   #> ! Type error: Array type doesn't match type of values set: string vs 
dictionary
   
   # Collecting the dataset before filtering also works and returns a factor
   d4 <- arrow::open_dataset(temp_file, format = "arrow") |>
 dplyr::collect() |>
 dplyr::filter(x %in% c("a"))
   
   is.factor(d4$x)
   #> [1] TRUE
   ```
   
   Created on 2024-07-26 with [reprex 
v2.1.1](https://reprex.tidyverse.org)
   
   
   
   
   
   Session info
   
   
   ``` r
   sessionInfo()
   #> R version 4.4.1 (2024-06-14 ucrt)
   #> Platform: x86_64-w64-mingw32/x64
   #> Running under: Windows 11 x64 (build 22631)
   #> 
   #> Matrix products: default
   #> 
   #> 
   #> locale:
   #> [1] LC_COLLATE=English_United States.utf8 
   #> [2] LC_CTYPE=English_United States.utf8   
   #> [3] LC_MONETARY=English_United States.utf8
   #> [4] LC_NUMERIC=C  
   #> [5] LC_TIME=English_United States.utf8
   #> 
   #> time zone: America/Los_Angeles
   #> tzcode source: internal
   #> 
   #> attached base packages:
   #> [1] stats graphics  grDevices utils datasets  methods   base 
   #> 
   #> loaded via a namespace (and not attached):
   #>  [1] vctrs_0.6.5   cli_3.6.3 knitr_1.48rlang_1.1.4
  
   #>  [5] xfun_0.45 purrr_1.0.2   generics_0.1.3
assertthat_0.2.1 
   #>  [9] glue_1.7.0bit_4.0.5 htmltools_0.5.8.1 fansi_1.0.6
  
   #> [13] rmarkdown_2.27tibble_3.2.1  evaluate_0.24.0   tzdb_0.4.0 
  
   #> [17] fastmap_1.2.0 yaml_2.3.9lifecycle_1.0.4   compiler_4.4.1 
  
   #> [21] dplyr_1.1.4   fs_1.6.4  pkgconfig_2.0.3   
rstudioapi_0.16.0
   #> [25] digest_0.6.36 R6_2.5.1  utf8_1.2.4reprex_2.1.1   
  
   #> [29] tidyselect_1.2.1  pillar_1.9.0  magrittr_2.0.3tools_4.4.1
  
   #> [33] withr_3.0.0   bit64_4.0.5   arrow_16.1.0
   ```
   
   
   
   ### Component(s)
   
   R





Re: [I] [C++] concatenate nested namespace in c++17 style with clang-tidy [arrow]

2024-07-26 Thread via GitHub


IndifferentArea closed issue #43421: [C++] concatenate nested namespace in 
c++17 style with clang-tidy
URL: https://github.com/apache/arrow/issues/43421





[I] Mixing RLE_DICTIONARY and other column encodings in pyarrow parquet [arrow]

2024-07-26 Thread via GitHub


bkief opened a new issue, #43442:
URL: https://github.com/apache/arrow/issues/43442

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   The ValueError at 
https://github.com/apache/arrow/blob/aaeff72dd9cb4658913fde3d176416be9a93ebe0/python/pyarrow/_parquet.pyx#L1360-L1367
 is raised whenever any column is explicitly given a dictionary encoding. This makes it impossible to mix a dictionary-encoded column with something like `DELTA_BINARY_PACKED`. I understand this is to prevent duplicating `use_dictionary`. Would it be okay to move this ValueError to the calling function instead? Does anything at the C++ level prevent this? 
https://github.com/apache/arrow/blob/aaeff72dd9cb4658913fde3d176416be9a93ebe0/python/pyarrow/_parquet.pyx#L1971-L1972
 
   
   
   
   ### Component(s)
   
   Parquet, Python





[I] The flight.NewRecordWriter parameter is ambiguous [arrow]

2024-07-26 Thread via GitHub


mac-zhenfang opened a new issue, #43443:
URL: https://github.com/apache/arrow/issues/43443

   ### Describe the usage question you have. Please include as many useful details as possible.
   
   
   The signature `func NewRecordWriter(w DataStreamWriter, opts ...ipc.Option) *Writer` indicates that the DataStreamWriter is the only required parameter and everything else is optional. But in:

   ```go
   func (w *Writer) start() error {
       w.started = true

       w.mapper.ImportSchema(w.schema)
       w.lastWrittenDicts = make(map[int64]arrow.Array)

       // write out schema payloads
       ps := payloadFromSchema(w.schema, w.mem, &w.mapper)
       defer ps.Release()

       for _, data := range ps {
           err := w.pw.WritePayload(data)
           if err != nil {
               return err
           }
       }

       return nil
   }
   ```

   w.schema also looks like a required parameter: if it is nil, the writer reports `arrow/ipc: unknown error while writing: runtime error: invalid memory address or nil pointer dereference`.

   The request is to take the schema from the RecordBatch instead of from an input option.
   
   ### Component(s)
   
   Go





[I] [C++] Benchmark Arrow BinaryViewBuilder [arrow]

2024-07-26 Thread via GitHub


mapleFU opened a new issue, #43444:
URL: https://github.com/apache/arrow/issues/43444

   ### Describe the enhancement requested
   
   BinaryViewBuilder needs a benchmark.
   
   ### Component(s)
   
   Benchmarking, C++





Re: [I] Partitioned variable does not read in as the correct type [arrow]

2024-07-27 Thread via GitHub


thisisnic closed issue #43303: Partitioned variable does not read in as the 
correct type
URL: https://github.com/apache/arrow/issues/43303





Re: [I] [Python] Could not read encrypted metadata via pq.read_table [arrow]

2024-07-27 Thread via GitHub


heyuqi1970 closed issue #43406: [Python] Could not read encrypted metadata via 
pq.read_table
URL: https://github.com/apache/arrow/issues/43406





Re: [I] [R] String columns read lazily from readr error when transferred to an arrow table [arrow]

2024-07-27 Thread via GitHub


jonkeane closed issue #43349: [R] String columns read lazily from readr error 
when transferred to an arrow table
URL: https://github.com/apache/arrow/issues/43349





[I] [C++][grpc] 0-length buffers sent to grpc can cause indefinite hangs on MacOS/iOS [arrow]

2024-07-28 Thread via GitHub


ziglerari opened a new issue, #43447:
URL: https://github.com/apache/arrow/issues/43447

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   A [gRPC issue](https://github.com/grpc/grpc/pull/37255) has been identified 
where transmitting buffers of zero length leads to a persistent hang on 
MacOS/iOS platforms.
   
   Such zero-length buffers may arise, for example, in the context of using the 
Array structure alongside a validity bitmap, as outlined in the [Arrow 
Spec](https://arrow.apache.org/docs/format/Columnar.html#validity-bitmaps).
   
   Essentially, when every element within an Array is valid (i.e., not null), 
it's possible to represent this state with a null validity bitmap, indicating 
that all elements are valid. This scenario is realized through the use of a 
null buffer, as demonstrated here:
   
https://github.com/apache/arrow/blob/187197c369058f7d1377c1b161c469a9e4542caf/cpp/src/arrow/ipc/writer.cc#L165-L179
   
   
   The relevant sections of code from both transport mechanisms are provided 
below for reference:
   
https://github.com/apache/arrow/blob/187197c369058f7d1377c1b161c469a9e4542caf/cpp/src/arrow/flight/transport/grpc/serialization_internal.cc#L283-L287
   
   
https://github.com/apache/arrow/blob/187197c369058f7d1377c1b161c469a9e4542caf/cpp/src/arrow/flight/transport/ucx/ucx_internal.cc#L590-L591
   
   Upon examining the UCX transport's approach, it's evident that a precaution 
is taken to avoid sending zero-length buffers. This strategy appears prudent, 
as it eliminates the need to forward non-transmittable buffers to the transport 
layer, potentially offering a solution to the issue observed with gRPC.
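   The UCX-style precaution boils down to filtering out empty buffers before they are handed to the transport. A minimal language-agnostic sketch of that guard (the function name is hypothetical, and a real implementation must of course keep the IPC body offsets consistent):

```python
def transmittable(buffers):
    # An all-valid array's validity bitmap may be represented by an
    # absent (None) or zero-length buffer; skipping such buffers avoids
    # handing the transport a 0-length write, which is what the linked
    # gRPC issue reports hanging on macOS/iOS.
    return [b for b in buffers if b is not None and len(b) > 0]

assert transmittable([b"data", b"", None]) == [b"data"]
```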
   
   ### Component(s)
   
   C++, FlightRPC





[I] [CI] Conan-* crossbow jobs need credentials to docker hub (or to not attempt to push there) [arrow]

2024-07-28 Thread via GitHub


jonkeane opened a new issue, #43449:
URL: https://github.com/apache/arrow/issues/43449

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   Both `conan-minimal` and `conan-maximal` have been failing for a long time (minimal for 116 days; the maximal one has no successful runs reported on the [crossbow report](http://crossbow.voltrondata.com)).
   
   They both seem to fail with:
   
   ```
   The push refers to repository [docker.io/conanio/gcc10]
   3b2ed178cc9f: Preparing
   5f70bf18a086: Preparing
   a7c9350b994b: Preparing
   28da0445c449: Preparing
   28da0445c449: Layer already exists
   a7c9350b994b: Layer already exists
   3b2ed178cc9f: Layer already exists
   5f70bf18a086: Layer already exists
   errors:
   denied: requested access to the resource is denied
   unauthorized: authentication required
   
   Traceback (most recent call last):
 File 
"/home/runner/work/crossbow/crossbow/arrow/dev/archery/archery/docker/core.py", 
line 224, in _execute_docker
   result = Docker().run(*args, **kwargs)
^
 File 
"/home/runner/work/crossbow/crossbow/arrow/dev/archery/archery/utils/command.py",
 line 78, in run
   return subprocess.run(invocation, **kwargs)
  
 File 
"/opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/subprocess.py", line 
571, in run
   raise CalledProcessError(retcode, process.args,
   subprocess.CalledProcessError: Command '['docker', 'push', 
'conanio/gcc10:1.62.0']' returned non-zero exit status 1.
   
   During handling of the above exception, another exception occurred:
   
   Traceback (most recent call last):
 File "/opt/hostedtoolcache/Python/3.12.4/x64/bin/archery", line 8, in 

   sys.exit(archery())
^
 File 
"/opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/site-packages/click/core.py",
 line 1157, in __call__
   return self.main(*args, **kwargs)
  ^^
 File 
"/opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/site-packages/click/core.py",
 line 1078, in main
   rv = self.invoke(ctx)

 File 
"/opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/site-packages/click/core.py",
 line 1688, in invoke
   return _process_result(sub_ctx.command.invoke(sub_ctx))
  ^^^
 File 
"/opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/site-packages/click/core.py",
 line 1688, in invoke
   return _process_result(sub_ctx.command.invoke(sub_ctx))
  ^^^
 File 
"/opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/site-packages/click/core.py",
 line 1434, in invoke
   return ctx.invoke(self.callback, **ctx.params)
  ^^^
 File 
"/opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/site-packages/click/core.py",
 line 783, in invoke
   return __callback(*args, **kwargs)
  ^^^
 File 
"/opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/site-packages/click/decorators.py",
 line 45, in new_func
   return f(get_current_context().obj, *args, **kwargs)
  ^
 File 
"/home/runner/work/crossbow/crossbow/arrow/dev/archery/archery/docker/cli.py", 
line 259, in docker_compose_push
   compose.push(image, user=user, ***
 File 
"/home/runner/work/crossbow/crossbow/arrow/dev/archery/archery/docker/core.py", 
line 447, in push
   _push(service)
 File 
"/home/runner/work/crossbow/crossbow/arrow/dev/archery/archery/docker/core.py", 
line 430, in _push
   return self._execute_docker('push', service['image'])
  ^^
 File 
"/home/runner/work/crossbow/crossbow/arrow/dev/archery/archery/docker/core.py", 
line 227, in _execute_docker
   raise RuntimeError(
   RuntimeError: docker push conanio/gcc10:1.62.0 exited with non-zero exit 
code 1
   ```
   
   [recent 
log](https://github.com/ursacomputing/crossbow/actions/runs/10122212172/job/27994040083).
 
   
   @kou It looks like you were working on those most recently -- do you have 
the credentials we need? Or know if we can disable the uploading? Or possibly 
remove those tasks from Crossbow?
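The failure chain in the trace, where `subprocess.run` raises a `CalledProcessError` that archery re-raises as a `RuntimeError` naming the full command, can be sketched with the standard library alone (the helper name `run_checked` is illustrative, not part of archery's API):

```python
import subprocess

def run_checked(invocation):
    """Run a command, converting a non-zero exit status into a RuntimeError
    that names the full command, mirroring the trace above."""
    try:
        return subprocess.run(invocation, check=True)
    except subprocess.CalledProcessError as exc:
        raise RuntimeError(
            f"{' '.join(invocation)} exited with non-zero exit code "
            f"{exc.returncode}"
        ) from exc

# e.g. run_checked(["docker", "push", "conanio/gcc10:1.62.0"])
```

The `from exc` chaining is what produces the "During handling of the above exception, another exception occurred" section in the log.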
   
   ### Component(s)
   
   Continuous Integration


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] [R] RStudio crash [arrow]

2024-07-28 Thread via GitHub


jonkeane closed issue #43241: [R] RStudio crash
URL: https://github.com/apache/arrow/issues/43241


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[I] [Format] Add Opaque canonical extension type [arrow]

2024-07-28 Thread via GitHub


lidavidm opened a new issue, #43453:
URL: https://github.com/apache/arrow/issues/43453

   ### Describe the enhancement requested
   
   As proposed on the mailing list: 
https://lists.apache.org/thread/4pykofrzvkl7dwsnzys8rwnq2owfnt43
   
   ### Component(s)
   
   Format


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[I] [C++][Python] Add Opaque canonical extension type [arrow]

2024-07-28 Thread via GitHub


lidavidm opened a new issue, #43454:
URL: https://github.com/apache/arrow/issues/43454

   ### Describe the enhancement requested
   
   Part of #43453.
   
   ### Component(s)
   
   C++, Python


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[I] [Go] Add Opaque canonical extension type [arrow]

2024-07-28 Thread via GitHub


lidavidm opened a new issue, #43455:
URL: https://github.com/apache/arrow/issues/43455

   ### Describe the enhancement requested
   
   Part of https://github.com/apache/arrow/issues/43453
   
   ### Component(s)
   
   Go


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[I] [Java] Add Opaque canonical extension type [arrow]

2024-07-28 Thread via GitHub


lidavidm opened a new issue, #43456:
URL: https://github.com/apache/arrow/issues/43456

   ### Describe the enhancement requested
   
   Part of https://github.com/apache/arrow/issues/43453
   
   ### Component(s)
   
   Java


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] The Go flightsql driver doesn't handle scanning LargeString or LargeBinary types [arrow]

2024-07-28 Thread via GitHub


phillipleblanc closed issue #43403: The Go flightsql driver doesn't handle 
scanning LargeString or LargeBinary types
URL: https://github.com/apache/arrow/issues/43403


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] [Java] Do not publish protobuf based on filename [arrow]

2024-07-28 Thread via GitHub


lidavidm closed issue #43437: [Java] Do not publish protobuf based on filename
URL: https://github.com/apache/arrow/issues/43437


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] [Java][Packaging] java-jars failing on maven module [arrow]

2024-07-28 Thread via GitHub


lidavidm closed issue #43432: [Java][Packaging] java-jars failing on maven 
module
URL: https://github.com/apache/arrow/issues/43432


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] [Java] Upgrade JNI version [arrow]

2024-07-28 Thread via GitHub


lidavidm closed issue #43425: [Java] Upgrade JNI version
URL: https://github.com/apache/arrow/issues/43425


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] [C++][grpc] Sending 0-length buffers to gRPC can result in indefinite hangs on MacOS/iOS platforms [arrow]

2024-07-28 Thread via GitHub


lidavidm closed issue #43447: [C++][grpc] Sending 0-length buffers to gRPC can 
result in indefinite hangs on MacOS/iOS platforms
URL: https://github.com/apache/arrow/issues/43447


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[I] [C++][Gandiva] Attribute mismatch error with unity build [arrow]

2024-07-28 Thread via GitHub


kou opened a new issue, #43463:
URL: https://github.com/apache/arrow/issues/43463

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   ```diff
   [678/708] Building CXX object 
src/gandiva/precompiled/CMakeFiles/gandiva-precompiled-test.dir/Unity/unity_0_cxx.cxx.o
   FAILED: 
src/gandiva/precompiled/CMakeFiles/gandiva-precompiled-test.dir/Unity/unity_0_cxx.cxx.o
 
   /opt/homebrew/bin/ccache /Library/Developer/CommandLineTools/usr/bin/c++ 
-DARROW_EXTRA_ERROR_CONTEXT -DARROW_HAVE_NEON -DARROW_STATIC 
-DARROW_WITH_TIMING_TESTS -DGANDIVA_STATIC -DGANDIVA_UNIT_TEST=1 
-I/Users/kou/work/cpp/arrow/cpp.build/src -I/Users/kou/work/cpp/arrow/cpp/src 
-I/Users/kou/work/cpp/arrow/cpp/src/generated -isystem 
/Users/kou/work/cpp/arrow/cpp/thirdparty/flatbuffers/include -isystem 
/Users/kou/work/cpp/arrow/cpp.build/_deps/googletest-src/googletest/include 
-isystem /Users/kou/work/cpp/arrow/cpp.build/_deps/googletest-src/googletest 
-isystem 
/Users/kou/work/cpp/arrow/cpp.build/_deps/googletest-src/googlemock/include 
-isystem /Users/kou/work/cpp/arrow/cpp.build/_deps/googletest-src/googlemock 
-isystem /opt/homebrew/include -fno-aligned-new  -Qunused-arguments 
-fcolor-diagnostics  -Wall -Wextra -Wdocumentation -DARROW_WARN_DOCUMENTATION 
-Wshorten-64-to-32 -Wno-missing-braces -Wno-unused-parameter 
-Wno-constant-logical-operand -Wno-return-stack-address -Wdate-time 
-Wno-unknown-warning-option -Wno-pass-failed -march=armv8-a  -g -Werror -O0 -ggdb 
 -std=c++17 -arch arm64 -isysroot 
/Library/Developer/CommandLineTools/SDKs/MacOSX14.0.sdk -fPIE -MD -MT 
src/gandiva/precompiled/CMakeFiles/gandiva-precompiled-test.dir/Unity/unity_0_cxx.cxx.o
 -MF 
src/gandiva/precompiled/CMakeFiles/gandiva-precompiled-test.dir/Unity/unity_0_cxx.cxx.o.d
 -o 
src/gandiva/precompiled/CMakeFiles/gandiva-precompiled-test.dir/Unity/unity_0_cxx.cxx.o
 -c 
/Users/kou/work/cpp/arrow/cpp.build/src/gandiva/precompiled/CMakeFiles/gandiva-precompiled-test.dir/Unity/unity_0_cxx.cxx
   In file included from 
/Users/kou/work/cpp/arrow/cpp.build/src/gandiva/precompiled/CMakeFiles/gandiva-precompiled-test.dir/Unity/unity_0_cxx.cxx:7:
   In file included from 
/Users/kou/work/cpp/arrow/cpp/src/gandiva/precompiled/bitmap_test.cc:19:
   In file included from 
/Users/kou/work/cpp/arrow/cpp/src/gandiva/precompiled/types.h:22:
   /Users/kou/work/cpp/arrow/cpp/src/gandiva/gdv_function_stubs.h:77:1: error: 
attribute declaration must precede definition [-Werror,-Wignored-attributes]
   GANDIVA_EXPORT
   ^
   /Users/kou/work/cpp/arrow/cpp/src/gandiva/visibility.h:39:39: note: expanded 
from macro 'GANDIVA_EXPORT'
   #define GANDIVA_EXPORT __attribute__((visibility("default")))
 ^
   /Users/kou/work/cpp/arrow/cpp/src/gandiva/context_helper.cc:63:6: note: 
previous definition is here
   void gdv_fn_context_set_error_msg(int64_t context_ptr, char const* err_msg) {
^
   In file included from 
/Users/kou/work/cpp/arrow/cpp.build/src/gandiva/precompiled/CMakeFiles/gandiva-precompiled-test.dir/Unity/unity_0_cxx.cxx:7:
   In file included from 
/Users/kou/work/cpp/arrow/cpp/src/gandiva/precompiled/bitmap_test.cc:19:
   In file included from 
/Users/kou/work/cpp/arrow/cpp/src/gandiva/precompiled/types.h:22:
   /Users/kou/work/cpp/arrow/cpp/src/gandiva/gdv_function_stubs.h:80:1: error: 
attribute declaration must precede definition [-Werror,-Wignored-attributes]
   GANDIVA_EXPORT
   ^
   /Users/kou/work/cpp/arrow/cpp/src/gandiva/visibility.h:39:39: note: expanded 
from macro 'GANDIVA_EXPORT'
   #define GANDIVA_EXPORT __attribute__((visibility("default")))
 ^
   /Users/kou/work/cpp/arrow/cpp/src/gandiva/context_helper.cc:68:10: note: 
previous definition is here
   uint8_t* gdv_fn_context_arena_malloc(int64_t context_ptr, int32_t size) {
^
   2 errors generated.
   ```
   
   
`src/gandiva/precompiled/CMakeFiles/gandiva-precompiled-test.dir/Unity/unity_0_cxx.cxx`:
   
   ```cpp
   /* generated by CMake */
   
   /* NOLINTNEXTLINE(bugprone-suspicious-include,misc-include-cleaner) */
   #include "/Users/kou/work/cpp/arrow/cpp/src/gandiva/context_helper.cc"
   
   /* NOLINTNEXTLINE(bugprone-suspicious-include,misc-include-cleaner) */
   #include 
"/Users/kou/work/cpp/arrow/cpp/src/gandiva/precompiled/bitmap_test.cc"
   
   /* NOLINTNEXTLINE(bugprone-suspicious-include,misc-include-cleaner) */
   #include "/Users/kou/work/cpp/arrow/cpp/src/gandiva/precompiled/bitmap.cc"
   
   /* NOLINTNEXTLINE(bugprone-suspicious-include,misc-include-cleaner) */
   #include 
"/Users/kou/work/cpp/arrow/cpp/src/gandiva/precompiled/epoch_time_point_test.cc"
   
   /* NOLINTNEXTLINE(bugprone-suspicious-include,misc-include-cleaner) */
   #include "/Users/kou/work/cpp/arrow/cpp/src/gandiva/precompiled/time_test.cc"
   
   /* NOLINTNEXTLINE(bugprone-suspicious-include,misc-include-cleaner) */
   #include "/Users/kou/work/cpp/arrow/cpp/src/gandiva/pr

[I] [Packaging][C++] Fail to build bundled ORC with the official LZ4 CMake package on Debian GNU/Linux trixie [arrow]

2024-07-29 Thread via GitHub


kou opened a new issue, #43467:
URL: https://github.com/apache/arrow/issues/43467

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   Recently, Debian GNU/Linux trixie has started providing an LZ4 CMake package 
based on the official CMake build system.
   
   Our LZ4 detection code doesn't work with it:
   
   
https://github.com/ursacomputing/crossbow/actions/runs/10130290903/job/28011414872#step:8:3691
   
   ```text
   -- Building Apache ORC from source
   CMake Error at cmake_modules/ThirdpartyToolchain.cmake:4512 
(get_target_property):
 get_target_property() called with non-existent target "LZ4::lz4".
   Call Stack (most recent call first):
 cmake_modules/ThirdpartyToolchain.cmake:208 (build_orc)
 cmake_modules/ThirdpartyToolchain.cmake:304 (build_dependency)
 cmake_modules/ThirdpartyToolchain.cmake:4698 (resolve_dependency)
 CMakeLists.txt:544 (include)
   ```
   
   This is because the `LZ4::lz4` target isn't provided by LZ4 1.9.4. 
   
   ### Component(s)
   
   C++, Packaging


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] [CI]: Temporarily turn off conda jobs that are failing [arrow]

2024-07-29 Thread via GitHub


raulcd closed issue #43450: [CI]: Temporarily turn off conda jobs that are 
failing
URL: https://github.com/apache/arrow/issues/43450


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[I] Change the default CompressionCodec.Factory to leverage compression support transparently [arrow]

2024-07-29 Thread via GitHub


ccciudatu opened a new issue, #43469:
URL: https://github.com/apache/arrow/issues/43469

   ### Describe the enhancement requested
   
   Application code is currently required to choose upfront between handling 
compressed vs. uncompressed data by specifying one of the two (mutually 
exclusive) `CompressionCodec.Factory` implementations: 
`NoCompressionCodec.Factory` and `CommonsCompressionCodecFactory`.
   
   While this is totally acceptable (or even required) for the write path (e.g. 
`ArrowWriter`), it makes it really tedious to support compression on the read 
path, as it's not reasonable to choose between handling 
_uncompressed-data-only_ and _compressed-data-only_ when writing (e.g.) a 
client app for Arrow Flight.
   As already reported in https://github.com/apache/arrow/issues/41457, the 
Java FlightClient currently fails with the following error when trying to 
decode a compressed stream:
   
   ```
   java.lang.IllegalArgumentException: Please add arrow-compression module to 
use CommonsCompressionCodecFactory for LZ4_FRAME
at 
org.apache.arrow.vector.compression.NoCompressionCodec$Factory.createCodec(NoCompressionCodec.java:63)
at 
org.apache.arrow.vector.compression.CompressionCodec$Factory$1.createCodec(CompressionCodec.java:91)
at org.apache.arrow.vector.VectorLoader.load(VectorLoader.java:79)
at org.apache.arrow.flight.FlightStream.next(FlightStream.java:275)
   ```
   The `FlightStream` class does not explicitly pass a compression codec 
factory when creating a `VectorLoader`, which then uses the default 
`NoCompressionCodec.Factory`. Changing the default to 
`CommonsCompressionCodecFactory` is not an option because:
   
   1. `CommonsCompressionCodecFactory` does not support uncompressed data
   2. `arrow-compression` is not a dependency for `arrow-vector`
   
   Instead of challenging these two design decisions, the proposed solution 
(upcoming PR) is to make the default `CompressionCodec.Factory` use a 
`ServiceLoader` to gather all the available implementations and combine them to 
support as many `CodecType`s as possible, falling back to the `NO_COMPRESSION` 
codec type (i.e. the same default as today).
   
   The arrow-compression module would then act as a service provider, so that 
whenever it's present in the module- (or class-) path, it will transparently 
fill in the gaps of the default factory.
   As a side note, this is in fact the literal meaning of the above error 
message (_"Please add arrow-compression module to use 
CommonsCompressionCodecFactory"_), so we can assume this was the original 
intention.
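The proposed behavior, trying every discovered factory and falling back to no-compression, can be sketched generically as follows (Python is used for illustration only; the actual proposal relies on Java's `ServiceLoader`, and every name in this sketch is hypothetical):

```python
class NoCompressionCodec:
    """Pass-through codec: the fallback when no real factory matches."""
    codec_type = "NO_COMPRESSION"

    def compress(self, buf):
        return buf

    def decompress(self, buf):
        return buf


class CombiningCodecFactory:
    """Delegate codec creation to a list of discovered factories (as a
    service loader would supply them), falling back to no-compression."""

    def __init__(self, factories):
        self._factories = list(factories)

    def create_codec(self, codec_type):
        # First factory that supports the requested codec type wins.
        for factory in self._factories:
            codec = factory.create_codec(codec_type)
            if codec is not None:
                return codec
        # Same default behavior as today when nothing else is available.
        if codec_type == "NO_COMPRESSION":
            return NoCompressionCodec()
        raise ValueError(
            f"no available factory supports codec type {codec_type!r}"
        )
```

With this shape, adding the compression module to the classpath simply contributes one more factory to the list, and readers of uncompressed data are unaffected.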
   
   
   ### Component(s)
   
   FlightRPC, Java


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] go/adbc/driver/flightsql: Default Value (10 MB) For adbc.snowflake.rpc.ingest_target_file_size Not Used In 1.1.0 [arrow-adbc]

2024-07-29 Thread via GitHub


zeroshade closed issue #1997: go/adbc/driver/flightsql: Default Value (10 MB) 
For adbc.snowflake.rpc.ingest_target_file_size Not Used In 1.1.0
URL: https://github.com/apache/arrow-adbc/issues/1997


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[I] Restrict direct access to `sun.misc.Unsafe` [arrow]

2024-07-29 Thread via GitHub


laurentgo opened a new issue, #43479:
URL: https://github.com/apache/arrow/issues/43479

   ### Describe the enhancement requested
   
   `sun.misc.Unsafe` is a Java internal class only accessible to classes loaded 
by the boot classloader, unless one uses reflection to bypass this restriction.
   
   `org.apache.arrow.memory.util.MemoryUtil` makes it available as a public 
field to any Java class, which opens a Pandora's box. As a first step towards 
switching from `sun.misc.Unsafe` to safer memory access methods (which may 
become a requirement at some point, as discussed in [JEP 
471](https://openjdk.org/jeps/471)), remove direct access to the 
`sun.misc.Unsafe` instance.
   
   ### Component(s)
   
   Java


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] [CI] Conan-* crossbow jobs need credentials to docker hub (or to not attempt to push there) [arrow]

2024-07-29 Thread via GitHub


assignUser closed issue #43449: [CI] Conan-* crossbow jobs need credentials to 
docker hub (or to not attempt to push there)
URL: https://github.com/apache/arrow/issues/43449


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] [C++] Test linkage error when googletest 1.15.0 is installed system wide despite bundling [arrow]

2024-07-29 Thread via GitHub


assignUser closed issue #43400: [C++] Test linkage error when googletest 1.15.0 
is installed system wide despite bundling
URL: https://github.com/apache/arrow/issues/43400


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] [Java] Java Dataset API ScanOptions expansion [arrow]

2024-07-29 Thread via GitHub


lidavidm closed issue #28866: [Java] Java Dataset API ScanOptions expansion
URL: https://github.com/apache/arrow/issues/28866


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] c: fix include paths for adbc.h [arrow-adbc]

2024-07-29 Thread via GitHub


lidavidm closed issue #1150: c: fix include paths for adbc.h
URL: https://github.com/apache/arrow-adbc/issues/1150


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[I] [JAVA] [C++] Support more CsvFragmentScanOptions in JNI call [arrow]

2024-07-30 Thread via GitHub


jinchengchenghh opened a new issue, #43483:
URL: https://github.com/apache/arrow/issues/43483

   ### Describe the enhancement requested
   
   Support most of the options by mapping them to the corresponding C++ struct.
   
   ### Component(s)
   
   C++


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] [Python] Expose methods to get the device and memory_manager on the pyarrow.cuda.Context class [arrow]

2024-07-30 Thread via GitHub


jorisvandenbossche closed issue #43391: [Python] Expose methods to get the 
device and memory_manager on the pyarrow.cuda.Context class
URL: https://github.com/apache/arrow/issues/43391


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[I] SEGFAULT in test_udf_via_substrait when run in CPython debug build [arrow]

2024-07-30 Thread via GitHub


lysnikolaou opened a new issue, #43487:
URL: https://github.com/apache/arrow/issues/43487

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   Hey everyone! 👋 
   
   I'm trying to build arrow and PyArrow with a debug build of CPython and run 
the test suite, but I keep running into a segmentation fault. The crash happens 
[in `test_udf_via_substrait` 
here](https://github.com/apache/arrow/blob/main/python/pyarrow/tests/test_substrait.py#L440),
 and I can reproduce it using both 3.13.0b4+ and 3.12.4.
   
   The source of the segmentation fault is the `Py_INCREF` that's happening 
[here](https://github.com/apache/arrow/blob/main/python/pyarrow/src/arrow/python/udf.cc#L48).
 Because this is under a debug build, `Py_INCREF` tries to access the thread 
state to increase the aggregate reference count.
   
   
   Stripped stack trace to `Py_INCREF` call
   
   ```
   libpython3.13td.dylib!reftotal_add 
(/Users/user/.pyenv/sources/3.13t-dev-debug/Python-3.13-dev/Objects/object.c:84)
   libpython3.13td.dylib!_Py_INCREF_IncRefTotal 
(/Users/user/.pyenv/sources/3.13t-dev-debug/Python-3.13-dev/Objects/object.c:231)
   libarrow_python.dylib!Py_INCREF(_object*) 
(/Users/user/.pyenv/versions/3.13t-dev-debug/include/python3.13td/object.h:835)
   libarrow_python.dylib!arrow::py::(anonymous 
namespace)::PythonUdfKernelState::PythonUdfKernelState(std::__1::shared_ptr)
 (/Users/user/repos/python/arrow/python/pyarrow/src/arrow/python/udf.cc:48)
   libarrow_python.dylib!arrow::py::(anonymous 
namespace)::PythonUdfKernelState::PythonUdfKernelState(std::__1::shared_ptr)
 (/Users/user/repos/python/arrow/python/pyarrow/src/arrow/python/udf.cc:47)
   libarrow_python.dylib!std::__1::__unique_if::__unique_single 
std::__1::make_unique[abi:ue170006]&>(std::__1::shared_ptr&)
 
(/Library/Developer/CommandLineTools/SDKs/MacOSX14.4.sdk/usr/include/c++/v1/__memory/unique_ptr.h:689)
   libarrow_python.dylib!arrow::py::(anonymous 
namespace)::PythonUdfKernelInit::operator()(arrow::compute::KernelContext*, 
arrow::compute::KernelInitArgs const&) 
(/Users/user/repos/python/arrow/python/pyarrow/src/arrow/python/udf.cc:78)
   libarrow_python.dylib!decltype(std::declval()(std::declval(),
 std::declval())) 
std::__1::__invoke[abi:ue170006](arrow::py::(anonymous 
namespace)::PythonUdfKernelInit&, arrow::compute::KernelContext*&&, 
arrow::compute::KernelInitArgs const&) 
(/Library/Developer/CommandLineTools/SDKs/MacOSX14.4.sdk/usr/include/c++/v1/__type_traits/invoke.h:340)
   
libarrow_python.dylib!arrow::Result>> 
std::__1::__invoke_void_return_wrapper>>, 
false>::__call[abi:ue170006](arrow::py::(anonymous 
namespace)::PythonUdfKernelInit&, arrow::compute::KernelContext*&&, 
arrow::compute::KernelInitArgs const&) 
(/Library/Developer/CommandLineTools/SDKs/MacOSX14.4.sdk/usr/include/c++/v1/__type_traits/invoke.h:407)
   
libarrow_python.dylib!std::__1::__function::__alloc_func, 
arrow::Result>> 
(arrow::compute::KernelContext*, arrow::compute::KernelInitArgs 
const&)>::operator()[abi:ue170006](arrow::compute::KernelContext*&&, 
arrow::compute::KernelInitArgs const&) 
(/Library/Developer/CommandLineTools/SDKs/MacOSX14.4.sdk/usr/include/c++/v1/__functional/function.h:193)
   libarrow_python.dylib!std::__1::__function::__func, 
arrow::Result>> 
(arrow::compute::KernelContext*, arrow::compute::KernelInitArgs 
const&)>::operator()(arrow::compute::KernelContext*&&, 
arrow::compute::KernelInitArgs const&) 
(/Library/Developer/CommandLineTools/SDKs/MacOSX14.4.sdk/usr/include/c++/v1/__functional/function.h:364)
   
libarrow.1800.0.0.dylib!std::__1::__function::__value_func>> 
(arrow::compute::KernelContext*, arrow::compute::KernelInitArgs 
const&)>::operator()[abi:ue170006](arrow::compute::KernelContext*&&, 
arrow::compute::KernelInitArgs const&) const 
(/Library/Developer/CommandLineTools/SDKs/MacOSX14.4.sdk/usr/include/c++/v1/__functional/function.h:518)
   
libarrow.1800.0.0.dylib!std::__1::function>> 
(arrow::compute::KernelContext*, arrow::compute::KernelInitArgs 
const&)>::operator()(arrow::compute::KernelContext*, 
arrow::compute::KernelInitArgs const&) const 
(/Library/Developer/CommandLineTools/SDKs/MacOSX14.4.sdk/usr/include/c++/v1/__functional/function.h:1169)
   libarrow.1800.0.0.dylib!arrow::compute::(anonymous 
namespace)::BindNonRecursive(arrow::compute::Expression::Call, bool, 
arrow::compute::ExecContext*)::$_21::operator()() const 
(/Users/user/repos/python/arrow/cpp/src/arrow/compute/expression.cc:544)
   libarrow.1800.0.0.dylib!arrow::compute::(anonymous 
namespace)::BindNonRecursive(arrow::compute::Expression::Call, bool, 
arrow::compute::ExecContext*) 
(/Users/user/repos/python/arrow/cpp/src/arrow/compute/expression.cc:560)
   libarrow.1800.0.0.dylib!arrow::Result 
arrow::compute::(anonymous 
namespace)::BindImpl(arrow::compute::Expression, arrow::Schema 
const&, arrow::compute::ExecContext*) 
(/Users/user/repos/python/arrow/cpp/src/arrow/compute/expression.cc:628)
   libarrow.1800.0

Re: [I] Snowflake adbc_ingest reverting back to CSV uploading [arrow-adbc]

2024-07-30 Thread via GitHub


davlee1972 closed issue #2041: Snowflake adbc_ingest reverting back to CSV 
uploading
URL: https://github.com/apache/arrow-adbc/issues/2041


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[I] Performance questions: What is the best way to upsert and in general? (Postgres) [arrow-adbc]

2024-07-30 Thread via GitHub


avm19 opened a new issue, #2046:
URL: https://github.com/apache/arrow-adbc/issues/2046

   ### What would you like help with?
   
   - Why is `executemany()` much slower than `adbc_ingest()`?
   - What is the best way and most performant way to pass data with a complex 
query/operation?
   - Is there anything I am doing wrong?
   
   --
   
   I want to insert and update records in a table using the Python API of 
adbc_driver_postgresql; let's say I have 10k rows:
   ```python
   import pyarrow as pa
   a = pa.array(range(10_000))
   table =  pa.Table.from_arrays(arrays=[a,a,a], names=['col1', 'col2', 'col3'])
   ```
   
   I noticed that `executemany()` is much slower than `adbc_ingest()` for 
ingesting data. Let's say I have 10k rows:
   
   ```python
   # 0.1 s
   with adbc_driver_postgresql.dbapi.connect(uri) as conn:
   with conn.cursor() as cursor:
   cursor.execute("TRUNCATE TABLE test_table;")
   cursor.adbc_ingest('test_table', table, mode="replace")
   cursor.execute('ALTER TABLE test_table ADD PRIMARY KEY ("col1");')
   conn.commit()
   ```
   
   as compared to 
   
   ```python
   # ~7.5 sec
   with adbc_driver_postgresql.dbapi.connect(uri) as conn:
   with conn.cursor() as cursor:
   cursor.execute("TRUNCATE TABLE test_table;")
   query = 'INSERT INTO test_table ("col1", "col2", "col3") VALUES ($1, 
$2, $3);'
   cursor.executemany(query, table)   
   conn.commit()
   ```
   
   I don't mind using `adbc_ingest()` to populate my database, but later in its 
lifecycle I need to upsert records and more.  For example, I need to do 
something like:
   
   ```python
   # ~7.5 sec
   query = (
   'INSERT INTO test_table ("col1", "col2", "col3") VALUES ($1, $2, $3)'
   'ON CONFLICT ("col1") DO UPDATE SET "col2" = EXCLUDED."col2", "col3" = 
0;'
   )
   with adbc_driver_postgresql.dbapi.connect(uri) as conn:
   with conn.cursor() as cursor:
cursor.executemany(query, table)
   ```
   
   which is too slow. Apparently `executemany()` is extremely inefficient for 
this ask. What is the cause of such a poor performance? What is the bottleneck?
   
   The same outcome could be achieved much faster by first ingesting data into 
a temporary table and then making Postgres run a more complex operation from it 
rather than from input:
   
   ```python
   # 0.2s
   with adbc_driver_postgresql.dbapi.connect(uri) as conn:
   with conn.cursor() as cursor:
   cursor.adbc_ingest('test_table2', table, mode="replace")
   query = (
   'INSERT INTO test_table ("col1", "col2", "col3")\n'
   'SELECT "col1", "col2", "col3" FROM test_table2\n'
   'ON CONFLICT ("col1") DO UPDATE SET "col2" = EXCLUDED."col2", 
"col3" = 0;'
   )
   cursor.execute(query)
   conn.commit()
   ```
   
   This approach gives reasonable performance, but is this how one is supposed 
to do it? Is there anything that can be easily improved? I do not know much 
about Postgres's backend operation and what optimisations it does for 
ingestion, but I suspect it is not best practice to create temporary 
tables (which are not even TEMPORARY) when we just want to stream data.
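The staging-table pattern above generalizes: the merge statement can be generated mechanically from the column names. A minimal sketch using only the standard library (the function `build_upsert_sql` is illustrative, not part of the ADBC API):

```python
def build_upsert_sql(target, staging, columns, key):
    """Build a PostgreSQL INSERT ... ON CONFLICT ... DO UPDATE statement that
    merges rows from a staging table into a target table, updating every
    non-key column from the incoming row."""
    col_list = ", ".join(f'"{c}"' for c in columns)
    updates = ", ".join(
        f'"{c}" = EXCLUDED."{c}"' for c in columns if c != key
    )
    return (
        f'INSERT INTO {target} ({col_list})\n'
        f'SELECT {col_list} FROM {staging}\n'
        f'ON CONFLICT ("{key}") DO UPDATE SET {updates};'
    )
```

Paired with `adbc_ingest` into the staging table, this reproduces the fast upsert shown above for any column set, at the cost of one extra table per batch.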


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] Restrict direct access to `sun.misc.Unsafe` [arrow]

2024-07-30 Thread via GitHub


danepitkin closed issue #43479: Restrict direct access to `sun.misc.Unsafe`
URL: https://github.com/apache/arrow/issues/43479


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] Change the default CompressionCodec.Factory to leverage compression support transparently [arrow]

2024-07-30 Thread via GitHub


danepitkin closed issue #43469: Change the default CompressionCodec.Factory to 
leverage compression support transparently
URL: https://github.com/apache/arrow/issues/43469


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[I] [Java] Add example using CompressionCodec [arrow-cookbook]

2024-07-30 Thread via GitHub


danepitkin opened a new issue, #354:
URL: https://github.com/apache/arrow-cookbook/issues/354

   The CompressionCodec now uses a ServiceLoader to load all available options. 
The default is NoCompressionCodec. See 
https://github.com/apache/arrow/pull/43471


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] [Packaging][C++] Fail to build bundled ORC with the official LZ4 CMake package on Debian GNU/Linux trixie [arrow]

2024-07-30 Thread via GitHub


assignUser closed issue #43467: [Packaging][C++] Fail to build bundled ORC with 
the official LZ4 CMake package on Debian GNU/Linux trixie
URL: https://github.com/apache/arrow/issues/43467


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[I] [C++][Java]When i use DatasetFileWriter::write method write a file, I can't specify a file name [arrow]

2024-07-30 Thread via GitHub


shouriken opened a new issue, #43489:
URL: https://github.com/apache/arrow/issues/43489

   ### Describe the usage question you have. Please include as many useful
details as possible.
   
   
   I use this static method to write a Parquet file to the filesystem. I pass
an empty partition array so everything goes into a single file, and I pass
"test.parquet" as the `baseNameTemplate` argument to specify the file name,
but this leads to an error: `basename_template did not contain '{i}'`.
   ```java
   public static void write(BufferAllocator allocator, ArrowReader reader, FileFormat format, String uri,
                            String[] partitionColumns, int maxPartitions, String baseNameTemplate) {
       try (final ArrowArrayStream stream = ArrowArrayStream.allocateNew(allocator)) {
           Data.exportArrayStream(allocator, reader, stream);
           JniWrapper.get().writeFromScannerToFile(stream.memoryAddress(),
                   format.id(), uri, partitionColumns, maxPartitions, baseNameTemplate);
       }
   }
   ```
   Reading the C++ JNI code and the C++ dataset code,
`FileSystemDatasetWriteOptions::basename_template` does not seem to support
specifying a file name without `{i}`:
   ```cpp
   /// \brief Options for writing a dataset.
   struct ARROW_DS_EXPORT FileSystemDatasetWriteOptions {
     // ...
     /// Template string used to generate fragment basenames.
     /// {i} will be replaced by an auto incremented integer.
     std::string basename_template;
     // ...
   };
   ```
   Is there any way to specify the file name when there is no partitioning?
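   For reference, the check that produces this error can be modelled in a few lines. This is a simplified stand-in for Arrow's validation, not its actual code; it only shows why a template without `{i}` is rejected and how the counter is substituted:

```python
def validate_basename_template(template: str) -> None:
    # Mirrors the dataset writer's requirement: the template must contain the
    # literal token "{i}", which is replaced by an auto-incremented integer
    # so that multiple fragments never collide on one file name.
    if "{i}" not in template:
        raise ValueError("basename_template did not contain '{i}'")

def fragment_name(template: str, index: int) -> str:
    validate_basename_template(template)
    return template.replace("{i}", str(index))

print(fragment_name("test-{i}.parquet", 0))  # test-0.parquet

try:
    fragment_name("test.parquet", 0)
except ValueError as e:
    print(e)  # basename_template did not contain '{i}'
```

   With an empty partition array everything lands in a single fragment, so a template such as `"test-{i}.parquet"` yields just `test-0.parquet`; renaming that single file afterwards is the usual workaround.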
   
   ### Component(s)
   
   Java


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] Use Amazon KMS for encryption having error: "OSError: Incorrect key to columns mapping in column keys property:" [arrow]

2024-07-31 Thread via GitHub


channingdata closed issue #43426: Use Amazon KMS for encryption having error: 
"OSError: Incorrect key to columns mapping in column keys property:"
URL: https://github.com/apache/arrow/issues/43426


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[I] [Java] `new RootAllocator()` occasionally report error `java.lang.NoSuchFieldError: chunkSize` [arrow]

2024-07-31 Thread via GitHub


xinyiZzz opened a new issue, #43491:
URL: https://github.com/apache/arrow/issues/43491

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   Hi,
   
   `new RootAllocator()` occasionally reports an error. I looked through other
issues and guess that the versions of `Netty` and `Arrow` are incompatible;
could this be the reason for the error?
   
   Arrow version: 15.0.2
   Netty version: 4.1.104.Final
   
   I am trying to upgrade Arrow 17.0.0 to see if it can solve the problem.
   
   Thanks.
   
   ```
   java.lang.NoSuchFieldError: chunkSize
   at io.netty.buffer.PooledByteBufAllocatorL$InnerAllocator.<init>(PooledByteBufAllocatorL.java:153) ~[lakesoul-io-java-2.6.1-shaded.jar:4.1.104.Final]
   at io.netty.buffer.PooledByteBufAllocatorL.<init>(PooledByteBufAllocatorL.java:49) ~[lakesoul-io-java-2.6.1-shaded.jar:4.1.104.Final]
   at org.apache.arrow.memory.NettyAllocationManager.<clinit>(NettyAllocationManager.java:51) ~[arrow-memory-netty-15.0.2.jar:15.0.2]
   at org.apache.arrow.memory.DefaultAllocationManagerFactory.<clinit>(DefaultAllocationManagerFactory.java:26) ~[arrow-memory-netty-15.0.2.jar:15.0.2]
   at java.lang.Class.forName0(Native Method) ~[?:?]
   at java.lang.Class.forName(Class.java:375) ~[?:?]
   at org.apache.arrow.memory.DefaultAllocationManagerOption.getFactory(DefaultAllocationManagerOption.java:108) ~[arrow-memory-core-15.0.2.jar:15.0.2]
   at org.apache.arrow.memory.DefaultAllocationManagerOption.getDefaultAllocationManagerFactory(DefaultAllocationManagerOption.java:98) ~[arrow-memory-core-15.0.2.jar:15.0.2]
   at org.apache.arrow.memory.BaseAllocator$Config.getAllocationManagerFactory(BaseAllocator.java:773) ~[arrow-memory-core-15.0.2.jar:15.0.2]
   at org.apache.arrow.memory.ImmutableConfig.access$801(ImmutableConfig.java:24) ~[arrow-memory-core-15.0.2.jar:15.0.2]
   at org.apache.arrow.memory.ImmutableConfig$InitShim.getAllocationManagerFactory(ImmutableConfig.java:83) ~[arrow-memory-core-15.0.2.jar:15.0.2]
   at org.apache.arrow.memory.ImmutableConfig.<init>(ImmutableConfig.java:47) ~[arrow-memory-core-15.0.2.jar:15.0.2]
   at org.apache.arrow.memory.ImmutableConfig.<init>(ImmutableConfig.java:24) ~[arrow-memory-core-15.0.2.jar:15.0.2]
   at org.apache.arrow.memory.ImmutableConfig$Builder.build(ImmutableConfig.java:485) ~[arrow-memory-core-15.0.2.jar:15.0.2]
   at org.apache.arrow.memory.BaseAllocator.<init>(BaseAllocator.java:62) ~[arrow-memory-core-15.0.2.jar:15.0.2]
   at org.apache.doris.service.arrowflight.DorisFlightSqlService.<init>(DorisFlightSqlService.java:47) ~[doris-fe.jar:1.2-SNAPSHOT]
   at org.apache.doris.qe.QeService.start(QeService.java:68) ~[doris-fe.jar:1.2-SNAPSHOT]
   at org.apache.doris.DorisFE.start(DorisFE.java:213) ~[doris-fe.jar:1.2-SNAPSHOT]
   at org.apache.doris.DorisFE.main(DorisFE.java:95) ~[doris-fe.jar:1.2-SNAPSHOT]
   ```
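   A frequent cause of a `NoSuchFieldError` like this is two different Netty versions on the classpath, for example one bundled inside a shaded jar (the `lakesoul-io-java` frames above) and another pulled in by `arrow-memory-netty`. As a hedged sketch, one way to force a single Netty version in Maven is to import the Netty BOM; the version shown is simply the one from this report, not a verified compatibility recommendation:

```xml
<dependencyManagement>
  <dependencies>
    <!-- Import the Netty BOM so every io.netty artifact resolves to the
         same version that arrow-memory-netty expects at runtime. -->
    <dependency>
      <groupId>io.netty</groupId>
      <artifactId>netty-bom</artifactId>
      <version>4.1.104.Final</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>
```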
   
   
   
   ### Component(s)
   
   Java


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[I] [C++] Thirdparty: bump lz4 to 1.10.0 [arrow]

2024-07-31 Thread via GitHub


mapleFU opened a new issue, #43492:
URL: https://github.com/apache/arrow/issues/43492

   ### Describe the enhancement requested
   
   It seems to bring performance improvements:
https://github.com/lz4/lz4/releases/tag/v1.10.0
   
   
   ### Component(s)
   
   C++


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[I] Inline parent validity bitmap into child validity bitmap [arrow]

2024-07-31 Thread via GitHub


takaaki7 opened a new issue, #43494:
URL: https://github.com/apache/arrow/issues/43494

   ### Describe the enhancement requested
   
   Currently, when the struct itself is null, the child's validity is unknown,
so a client must compute the AND of those bitmaps to know the child's
validity.
   
   I think inlining the parent validity into the child validity would be more
effective for query engines.
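   The AND described above is cheap but not free. As a minimal sketch of what a consumer currently has to do, assuming validity bitmaps stored LSB-first within each byte as in the Arrow format:

```python
def combined_validity(parent: bytes, child: bytes) -> bytes:
    # A child value is valid only if both the struct slot and the child slot
    # are valid, so consumers must AND the two bitmaps byte by byte.
    return bytes(p & c for p, c in zip(parent, child))

def is_valid(bitmap: bytes, i: int) -> bool:
    # Arrow validity bitmaps are LSB-first within each byte.
    return bool(bitmap[i // 8] & (1 << (i % 8)))

parent = bytes([0b00000101])  # struct rows 0 and 2 are non-null
child  = bytes([0b00000110])  # child field rows 1 and 2 are non-null
merged = combined_validity(parent, child)
print([is_valid(merged, i) for i in range(3)])  # [False, False, True]
```

   Inlining the parent bitmap into the child bitmap at construction time would let engines skip this merge on every read, at the cost of rewriting child bitmaps whenever the parent's nullness changes.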
   
   
   ### Component(s)
   
   Format


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] [Java] `new RootAllocator()` occasionally report error `java.lang.NoSuchFieldError: chunkSize` [arrow]

2024-07-31 Thread via GitHub


vibhatha closed issue #43491: [Java] `new RootAllocator()` occasionally report
error `java.lang.NoSuchFieldError: chunkSize`
URL: https://github.com/apache/arrow/issues/43491


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[I] Dataset with Filename partitioning sometimes loses files on write [arrow]

2024-07-31 Thread via GitHub


rafal-c opened a new issue, #43496:
URL: https://github.com/apache/arrow/issues/43496

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   Consider a simple program (code below) which creates a Table, turns it into 
a Dataset and writes the Dataset with Filename partitioning to a directory 
`/tmp/dataset`. Let's call it `myprogram`. Now if you run `myprogram` and look 
into /tmp/dataset repeatedly, this is what you may see:
   
   ```bash
   ➜  ./myprogram && ls /tmp/dataset
   2019_part0.parquet  2020_part0.parquet  2021_part0.parquet  
2022_part0.parquet
   ➜  ./myprogram && ls /tmp/dataset
   2019_part0.parquet  2020_part0.parquet  2021_part0.parquet  
2022_part0.parquet
   ➜  ./myprogram && ls /tmp/dataset
   2019_part0.parquet  2020_part0.parquet  2021_part0.parquet  
2022_part0.parquet
   ➜  ./myprogram && ls /tmp/dataset
   2020_part0.parquet  2021_part0.parquet  2022_part0.parquet
   ➜  ./myprogram && ls /tmp/dataset
   2019_part0.parquet  2020_part0.parquet  2021_part0.parquet  
2022_part0.parquet
   ➜  ./myprogram && ls /tmp/dataset
   2019_part0.parquet  2020_part0.parquet  2021_part0.parquet  
2022_part0.parquet
   ➜  ./myprogram && ls /tmp/dataset
   2020_part0.parquet  2021_part0.parquet  2022_part0.parquet
   ➜  ./myprogram && ls /tmp/dataset
   2019_part0.parquet  2020_part0.parquet  2021_part0.parquet  
2022_part0.parquet
   ➜  ./myprogram && ls /tmp/dataset
   2019_part0.parquet  2021_part0.parquet  2022_part0.parquet
   ➜  ./myprogram && ls /tmp/dataset
   2020_part0.parquet  2021_part0.parquet  2022_part0.parquet
   ```
   So for some reason it randomly skips parts of the dataset on write. This is
not specific to Parquet, and it happens on all major platforms
(Linux/Windows/macOS) with Arrow 16.0.0.
   
   Here is the full code to reproduce:
   
   ```cpp
   #include <arrow/api.h>
   #include <arrow/dataset/api.h>
   #include <arrow/filesystem/api.h>
   
   arrow::Result<std::shared_ptr<arrow::Table>> makeTable() {
       using arrow::field;
       auto schema = arrow::schema({field("a", arrow::int64()), field("year", arrow::int64())});
   
       std::vector<std::shared_ptr<arrow::Array>> arrays(2);
       arrow::NumericBuilder<arrow::Int64Type> builder;
   
       ARROW_RETURN_NOT_OK(builder.AppendValues({5, 2, 4, 100, 2, 4}));
       ARROW_RETURN_NOT_OK(builder.Finish(&arrays[0]));
   
       builder.Reset();
   
       ARROW_RETURN_NOT_OK(builder.AppendValues({2019, 2020, 2021, 2021, 2022, 2022}));
       ARROW_RETURN_NOT_OK(builder.Finish(&arrays[1]));
   
       return arrow::Table::Make(schema, arrays);
   }
   
   int main() {
       namespace ds = arrow::dataset;
   
       // Create an Arrow Table
       auto table = makeTable().ValueOrDie();
   
       auto dataset = std::make_shared<ds::InMemoryDataset>(table);
   
       auto scanner_builder = dataset->NewScan().ValueOrDie();
       auto scanner = scanner_builder->Finish().ValueOrDie();
   
       // The partition schema determines which fields are part of the partitioning.
       auto partition_schema = arrow::schema({arrow::field("year", arrow::int64())});
       auto partitioning = std::make_shared<ds::FilenamePartitioning>(partition_schema);
   
       // We'll write Parquet files.
       auto format = std::make_shared<ds::ParquetFileFormat>();
   
       ds::FileSystemDatasetWriteOptions write_options;
       write_options.file_write_options = format->DefaultWriteOptions();
       write_options.existing_data_behavior = ds::ExistingDataBehavior::kDeleteMatchingPartitions;
       write_options.filesystem = std::make_shared<arrow::fs::LocalFileSystem>();
       write_options.base_dir = "/tmp/dataset";
       write_options.partitioning = partitioning;
       write_options.basename_template = "part{i}.parquet";
   
       return ds::FileSystemDataset::Write(write_options, scanner) != arrow::Status::OK();
   }
   ```
   
   ### Component(s)
   
   C++


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] [Python][ppc64le][cuda] pytest segfault - test_cuda.py/test_foreign_buffer [arrow]

2024-07-31 Thread via GitHub


jorisvandenbossche closed issue #31432: [Python][ppc64le][cuda] pytest segfault 
- test_cuda.py/test_foreign_buffer
URL: https://github.com/apache/arrow/issues/31432


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] [Python] StructArray.from_array() should accept a type (in addition to names or fields) [arrow]

2024-07-31 Thread via GitHub


AlenkaF closed issue #42014: [Python] StructArray.from_array() should accept a 
type (in addition to names or fields)
URL: https://github.com/apache/arrow/issues/42014


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[I] Reading partial data/first block hangs on some cloud filesystems [arrow]

2024-07-31 Thread via GitHub


dberenbaum opened a new issue, #43497:
URL: https://github.com/apache/arrow/issues/43497

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   Take the following example using a publicly available dataset:
   
   ```python
   import gcsfs
   from pyarrow.dataset import dataset
   
   # without fsspec filesystem, get segmentation fault
   fs = None 
   # with fsspec filesystem, hangs and never finishes
   # fs = gcsfs.GCSFileSystem()
   
   uri = "gs://datachain-demo/laion-aesthetics-csv/laion_aesthetics_1024_33M_1.csv"
   ds = dataset(uri, format="csv", filesystem=fs)
   print(ds.head(5))
   ```
   
   As noted in the comments, depending on which filesystem is passed, it will 
either hang indefinitely or hit a segmentation fault. Strangely, s3 paths work 
(don't hang or fail) with the pyarrow filesystem but hang with the fsspec s3fs 
filesystem.
   
   Other findings:
   - Similar operations like `ds.take()` and `next(ds.to_batches())` have the 
same behavior as `ds.head()`
   - `ds.head(use_threads=False)` completes successfully with any filesystem 
but takes much longer
   
   ### Component(s)
   
   Python


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] [C++] Benchmark Arrow BinaryViewBuilder [arrow]

2024-07-31 Thread via GitHub


felipecrv closed issue #43444: [C++] Benchmark Arrow BinaryViewBuilder
URL: https://github.com/apache/arrow/issues/43444


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] The flight.NewRecordWriter parameter is ambiguous [arrow]

2024-07-31 Thread via GitHub


joellubi closed issue #43443: The flight.NewRecordWriter parameter is ambiguous 
 
URL: https://github.com/apache/arrow/issues/43443


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[I] [CI] Crossbow report shouldn't include jobs that aren't configured / run recently [arrow]

2024-07-31 Thread via GitHub


jonkeane opened a new issue, #43499:
URL: https://github.com/apache/arrow/issues/43499

   ### Describe the enhancement requested
   
   In #43451 we turned off a bunch of jobs that were constantly red, but I was
surprised to see that they are still [showing up on the crossbow nightly
report](http://crossbow.voltrondata.com). Would it be possible to update that
report so that jobs that haven't run in the last 2-3 days aren't in the list
of builds shown?
   
   For example, for `conda-win-x64-cuda-py3` the last run was three days ago,
but it is still listed in the table and appears to have run with the most
recent group and failed.
   
   cc @boshek  
   
   ### Component(s)
   
   Continuous Integration


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[I] [R][CI] Bump dev docs CI job from ubuntu 20.04 [arrow]

2024-07-31 Thread via GitHub


jonkeane opened a new issue, #43500:
URL: https://github.com/apache/arrow/issues/43500

   ### Describe the enhancement requested
   
   Ubuntu 20.04 is quite old, let's use something more modern
   
   ### Component(s)
   
   Continuous Integration, R


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[I] [JAVA] Fix Java JNI / AMD64 manylinux2014 Java JNI test not test dataset module [arrow]

2024-07-31 Thread via GitHub


jinchengchenghh opened a new issue, #43502:
URL: https://github.com/apache/arrow/issues/43502

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   https://github.com/apache/arrow/pull/41646#issuecomment-2259855172
   
   ### Component(s)
   
   Java


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[I] [JS] bump `command-line-usage` for security [arrow]

2024-07-31 Thread via GitHub


bombard1004 opened a new issue, #43505:
URL: https://github.com/apache/arrow/issues/43505

   ### Describe the enhancement requested
   
   Npm package `apache-arrow` depends on `command-line-usage`. A [security
vulnerability](https://github.com/advisories/GHSA-28mc-g557-92m7) was
discovered in one of the dependencies of `command-line-usage`, and a patch has
been released. However, `apache-arrow` pins `command-line-usage` to an exact
version, which prevents the security patch from being applied.
   
   Therefore, the version of `command-line-usage` needs to be updated.
   
   ### Component(s)
   
   JavaScript


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


