vc77 opened a new issue, #47577:
URL: https://github.com/apache/arrow/issues/47577
### Describe the usage question you have. Please include as many useful details as possible.
I'm hoping someone with experience in the Arrow build system can help with a
persistent compilation issue.
I have successfully built and installed the Arrow C++ libraries from the git
main branch into a local, isolated directory (~/arrow-install). The C++ build
stage completes without any errors.
The problem occurs when I then try to build the pyarrow library against that
local C++ installation. The Python build fails with a C++ compiler error.
Environment:
- OS: Arch Linux
- Python: 3.9.18 (managed via pyenv)
- Compiler: GCC
- Arrow Version: Cloned from git main
- Dependencies: All dependencies, including NumPy C headers (numpy[dev]), are installed.
Process:
Part 1: Successful C++ Build
The C++ libraries were built and installed to ~/arrow-install using the
following cmake command. This process finished successfully.

    cmake .. \
        -DCMAKE_INSTALL_PREFIX=~/arrow-install \
        -DPython3_EXECUTABLE=$(pyenv which python) \
        -DARROW_PYTHON=ON \
        -DARROW_COMPUTE=ON \
        -DARROW_BUILD_SHARED=ON \
        -DARROW_DEPENDENCY_SOURCE=BUNDLED
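
The command above does not set any Parquet-related options explicitly, so it may be worth confirming what the configured build tree actually enabled. The following is a minimal sketch, not a definitive diagnostic. It assumes the build directory is `cpp/build` (a hypothetical path, overridable via `BUILD_DIR`) and that `ARROW_PARQUET` and `PARQUET_REQUIRE_ENCRYPTION` are the relevant options to inspect. The helper name `cmake_cache_value` is made up for illustration.

```shell
# Print the value of one CMake cache entry. Cache lines look like NAME:TYPE=VALUE.
# Prints "<unset>" when the entry is not present in the cache file.
# The function body runs in a subshell so its variables do not leak.
cmake_cache_value() (
    cache_file=$1
    name=$2
    line=$(grep "^${name}:" "$cache_file" | head -n 1)
    if [ -z "$line" ]; then
        echo "<unset>"
    else
        # Strip everything up to and including the first "=".
        printf '%s\n' "${line#*=}"
    fi
)

# Hypothetical build directory; adjust to wherever `cmake ..` was run.
BUILD_DIR=${BUILD_DIR:-cpp/build}

if [ -f "$BUILD_DIR/CMakeCache.txt" ]; then
    for opt in ARROW_PARQUET PARQUET_REQUIRE_ENCRYPTION; do
        printf '%s=%s\n' "$opt" "$(cmake_cache_value "$BUILD_DIR/CMakeCache.txt" "$opt")"
    done
fi
```

If either option turns out to be off or unset, the new install prefix may not contain the Parquet headers at all, which would be one possible reason for the Python build to pick them up from somewhere else.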
Part 2: Failed Python Build
I then attempt to build pyarrow using the following script, which is what
produced the attached log file:

    # Set environment to point to the clean C++ install
    export ARROW_HOME=~/arrow-install
    export PKG_CONFIG_PATH=~/arrow-install/lib/pkgconfig:$PKG_CONFIG_PATH
    export LD_LIBRARY_PATH=~/arrow-install/lib:$LD_LIBRARY_PATH

    # Navigate to Python source and clean
    cd /path/to/arrow/python
    git clean -fdx .

    # Install build dependencies
    $(pyenv which python) -m pip install --upgrade pip setuptools wheel cython "numpy<2"
    $(pyenv which python) -m pip install -r requirements-build.txt

    # Run the build and capture output
    $(pyenv which python) setup.py build_ext --inplace > ~/pyarrow_build_log.txt 2>&1
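
Because the script exports `PKG_CONFIG_PATH`, one sanity check on the environment is whether `pkg-config` resolves the `arrow` and `parquet` modules to the intended prefix. The sketch below assumes `pkg-config` is installed and that both `.pc` files define a `prefix` variable. The helper `same_dir` is a made-up name. This only checks the environment and says nothing about how the pyarrow build itself locates Arrow.

```shell
# Return 0 when both arguments resolve to the same physical directory
# (symlinks resolved); return 1 if they differ or either does not exist.
same_dir() (
    a=$(cd "$1" 2>/dev/null && pwd -P) || exit 1
    b=$(cd "$2" 2>/dev/null && pwd -P) || exit 1
    [ "$a" = "$b" ]
)

# Falls back to the install location used in this report if ARROW_HOME is unset.
ARROW_HOME=${ARROW_HOME:-$HOME/arrow-install}

if command -v pkg-config >/dev/null 2>&1; then
    for module in arrow parquet; do
        prefix=$(pkg-config --variable=prefix "$module" 2>/dev/null) || prefix=""
        if [ -z "$prefix" ]; then
            echo "$module: not found by pkg-config"
        elif same_dir "$prefix" "$ARROW_HOME"; then
            echo "$module: resolves to ARROW_HOME ($prefix)"
        else
            echo "$module: resolves to $prefix, NOT $ARROW_HOME"
        fi
    done
fi
```

A module that resolves outside `ARROW_HOME`, or a `parquet` module that is missing from the new prefix but present elsewhere, would point toward a second installation on the system.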
Result & Analysis:
The setup.py build_ext command fails during C++ compilation. The error
appears to be a signature mismatch in parquet_encryption.h related to
KmsClient's use of std::string vs arrow::util::SecureString. This suggests the
build process is somehow finding and using conflicting headers from an old
installation, despite the environment variables pointing to the new, clean
installation at ~/arrow-install.
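
The stale-header hypothesis can be tested directly by looking for the same header under several install prefixes. The sketch below assumes the relevant header is `parquet/encryption/kms_client.h` and that `/usr` and `/usr/local` are the likely locations of an older installation. Both are guesses, and `find_header_copies` is a hypothetical helper name.

```shell
# Print the full path of a header (given relative to include/) for every
# prefix in which it exists. Prints nothing when no copy is found.
find_header_copies() (
    header=$1
    shift
    for prefix in "$@"; do
        if [ -f "$prefix/include/$header" ]; then
            printf '%s\n' "$prefix/include/$header"
        fi
    done
)

# Candidate prefixes: the new install first, then common system locations.
find_header_copies parquet/encryption/kms_client.h \
    "${ARROW_HOME:-$HOME/arrow-install}" /usr /usr/local
```

If a copy shows up under a system prefix but not under `ARROW_HOME`, that would be consistent with the compiler using headers from a different Arrow installation than the one just built.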
The attached log is the complete, verbose output from the failing setup.py
build_ext command.
Full Python Build Log:
https://gist.github.com/vc77/a5f9c8db58ffe33b1a928dd28ada9c4c
I'm stuck as to why the build isn't exclusively using the headers and
libraries from the specified ARROW_HOME. Any insight would be greatly
appreciated.
Thank you.
### Component(s)
C++