Re: [I] [Python] Support for the free-threaded build of CPython 3.13 [arrow]

2024-10-13 Thread via GitHub


raulcd closed issue #43536: [Python] Support for the free-threaded build of 
CPython 3.13
URL: https://github.com/apache/arrow/issues/43536


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] [Python][Packaging] Support Python 3.13 and upload wheels [arrow]

2024-10-13 Thread via GitHub


raulcd closed issue #43519: [Python][Packaging] Support Python 3.13 and upload 
wheels
URL: https://github.com/apache/arrow/issues/43519





[I] [C++][Compute] "Scatter" vector functions [arrow]

2024-10-13 Thread via GitHub


zanmato1984 opened a new issue, #44393:
URL: https://github.com/apache/arrow/issues/44393

   ### Describe the enhancement requested
   
   We discussed the solution for #41094 and concluded that the "special form" 
is the way to go. Comment 
https://github.com/apache/arrow/issues/41094#issuecomment-2087716483 gives a 
thorough description of how special forms work.
   
   To summarize: a special form evaluates some of its subexpressions under 
masks obtained from its other subexpressions. For example, in 
`if cond then expr1 else expr2`, the result of `cond` is the mask that controls 
which rows go to `expr1` and which go to `expr2`. Another example is logical 
`and`/`or`: each subexpression contributes to the mask used to evaluate the 
remaining subexpressions (boolean short-circuit).
   
   One way to implement special forms is for **every** expression to execute 
its kernel selectively, respecting a selection vector (which rows the kernel 
should execute on) or an equivalent boolean mask. Unfortunately this isn't 
practical, because we can't afford to change every (scalar) compute function to 
support a selection vector/mask all at once. So we must take an adaptive 
approach that allows functions to be selection vector/mask agnostic. To do so, 
a special form should 1) take the rows specific to each branch; 2) invoke the 
function of each branch on its group of rows; 3) combine the results of all the 
branches by scattering each row to its original position in the input.
   
   So far we have the vector functions `filter`/`take` to do 1), but there 
isn't a handy utility to do 3).
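
The three steps above can be sketched in plain Python. This is only an 
illustration under stated assumptions, not Arrow's API: `scatter` and `if_else` 
are hypothetical names, lists stand in for arrays, and `None` stands in for 
null. A row whose condition is null is left null in the output.

```python
from typing import Callable, List, Optional, Sequence


def scatter(values: Sequence, indices: Sequence[int], length: int) -> list:
    """Write values[i] to slot indices[i] of a new list with `length` slots.

    Slots that no index points at stay None (standing in for null).
    """
    out = [None] * length
    for value, index in zip(values, indices):
        out[index] = value
    return out


def if_else(
    cond: Sequence[Optional[bool]],
    then_fn: Callable[[list], list],
    else_fn: Callable[[list], list],
    rows: Sequence,
) -> list:
    """Evaluate then_fn/else_fn only on the rows selected by cond."""
    # 1) Take the rows specific to each branch (a selection vector per branch).
    then_idx = [i for i, c in enumerate(cond) if c is True]
    else_idx = [i for i, c in enumerate(cond) if c is False]

    # 2) Invoke each branch function on its own group of rows. The functions
    #    never see a mask, so they stay selection-agnostic.
    then_out = then_fn([rows[i] for i in then_idx])
    else_out = else_fn([rows[i] for i in else_idx])

    # 3) Scatter each result back to its original position in the input.
    out = scatter(then_out, then_idx, len(rows))
    for value, index in zip(else_out, else_idx):
        out[index] = value
    return out


rows = [4, 0, 9, 0]
cond = [x != 0 for x in rows]
# The division branch is never invoked on the zero rows.
result = if_else(cond, lambda xs: [100 // x for x in xs],
                 lambda xs: [-1 for _ in xs], rows)
print(result)  # [25, -1, 11, -1]
```

The division branch only ever receives the non-zero rows, which is the point 
of the masked evaluation: a function that would fail on some rows is safe as 
long as the mask keeps those rows away from it.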
   
   ### Component(s)
   
   C++





[I] Proposal to implement ADBC driver for Apache Cassandra [arrow-adbc]

2024-10-13 Thread via GitHub


SChakravorti21 opened a new issue, #2245:
URL: https://github.com/apache/arrow-adbc/issues/2245

   ### What feature or improvement would you like to see?
   
   There isn't an existing ADBC driver for Cassandra as far as I can tell, and 
it would be great to have one! I'm interested in starting this effort as I have 
experience getting Arrow data to/from Cassandra, and have a little experience 
working on an ADBC driver for a different database 
([comdb2](https://github.com/bloomberg/comdb2)). I met @zeroshade at Community 
Over Code recently, who inspired me to start the discussion around creating a 
Cassandra driver :)
   
   Some initial thoughts:
   
   - **Choice of language**
   
 - I'm personally most familiar with the Cassandra C/C++ driver as well as 
Arrow C++. However, if there's good reason to implement the driver in a 
different language, I'm open to that and happy to get up to speed.
   
 - Matt explained that it would be better to use nanoarrow rather than 
Arrow C++ as the latter is a heavy dependency and can complicate 
building/deploying drivers. Using nanoarrow sounds like a good idea to me.
   
   - **Implementation considerations**
   
 - Cassandra currently does not offer any native mechanism for 
fetching/ingesting data in Arrow format, so we would likely have to 
implement row ↔ column transposition on the client side (in the driver).
 
 - The Cassandra Query Language (CQL) can be thought of as an extremely 
limited subset of SQL. [This StackOverflow 
answer](https://stackoverflow.com/a/19140553) is a good overview of the general 
limitations. I figure this shouldn't matter as far as implementing an ADBC 
driver is concerned, but thought it was worth mentioning in case I'm wrong.
   
 - Matt also mentioned that there is now an ADBC [driver 
framework](https://github.com/apache/arrow-adbc/blob/main/c/driver/framework). 
I don't see any reason not to use this. If we find any gaps in the framework 
while implementing the driver, I'm happy to help fill them in.
   
   - **First step(s)**
   
 - Matt mentioned that, before implementing anything, it would be good to 
stand up a Cassandra node/cluster in CI so that others can also play around 
with and contribute to the driver.
   
 - I suppose the next step would be to configure the build system to pull 
in the necessary dependencies (like the Cassandra C/C++ driver).
 
 - ... Start implementing the driver along with integration tests?
   
   I'd love to hear any other considerations for implementing this ADBC driver 
and/or recommendations on getting started!
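
As a rough illustration of the row ↔ column transposition mentioned under 
"Implementation considerations", here is a minimal sketch in plain Python. It 
does not use the Cassandra driver, nanoarrow, or the ADBC driver framework; 
`rows_to_columns` and `columns_to_rows` are hypothetical helper names, tuples 
stand in for fetched rows, and `None` stands in for a null cell.

```python
from typing import Dict, List, Sequence, Tuple


def rows_to_columns(rows: Sequence[Tuple], names: Sequence[str]) -> Dict[str, list]:
    """Transpose row tuples into one list per column (the fetch direction)."""
    columns: Dict[str, list] = {name: [] for name in names}
    for row in rows:
        if len(row) != len(names):
            raise ValueError("row width does not match the number of columns")
        for name, value in zip(names, row):
            columns[name].append(value)
    return columns


def columns_to_rows(columns: Dict[str, list]) -> List[Tuple]:
    """Transpose equal-length columns back into row tuples (the ingest direction)."""
    return list(zip(*columns.values()))


fetched = [(1, "a"), (2, None), (3, "c")]
cols = rows_to_columns(fetched, ["id", "name"])
print(cols)                   # {'id': [1, 2, 3], 'name': ['a', None, 'c']}
print(columns_to_rows(cols))  # [(1, 'a'), (2, None), (3, 'c')]
```

A real driver would append into typed column builders with validity bitmaps 
rather than Python lists, but the loop structure is the same.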





[I] The CI pipeline for AMD64 Ubuntu 22.04 C++ failed. [arrow]

2024-10-13 Thread via GitHub


ripplehang opened a new issue, #44396:
URL: https://github.com/apache/arrow/issues/44396

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   
https://github.com/apache/arrow/actions/runs/11304742428/job/31457900241?pr=43601
   Run sudo apt update
 sudo apt update
 sudo apt install -y --no-install-recommends python3 python3-dev python3-pip
 python3 -m pip install -U pip
 shell: /usr/bin/bash -e {0}
 env:
   ARCHERY_DEBUG: 1
   ARROW_ENABLE_TIMING_TESTS: OFF
   DOCKER_VOLUME_PREFIX: .docker/
   ARCH: amd64
   ARROW_SIMD_LEVEL: 
   CLANG_TOOLS: 14
   LLVM: 14
   UBUNTU: 22.04
   
   WARNING: apt does not have a stable CLI interface. Use with caution in 
scripts.
   
   Hit:1 http://azure.archive.ubuntu.com/ubuntu noble InRelease
   Get:2 http://azure.archive.ubuntu.com/ubuntu noble-updates InRelease [126 kB]
   
   Hit:3 http://azure.archive.ubuntu.com/ubuntu noble-backports InRelease
   Get:4 http://azure.archive.ubuntu.com/ubuntu noble-security InRelease [126 kB]
   Hit:5 https://packages.microsoft.com/repos/azure-cli noble InRelease
   Hit:6 https://packages.microsoft.com/ubuntu/24.04/prod noble InRelease
   Get:7 http://azure.archive.ubuntu.com/ubuntu noble-updates/main amd64 Packages [542 kB]
   Get:8 http://azure.archive.ubuntu.com/ubuntu noble-updates/main Translation-en [133 kB]
   Get:9 http://azure.archive.ubuntu.com/ubuntu noble-updates/main amd64 c-n-f Metadata [9048 B]
   Get:10 http://azure.archive.ubuntu.com/ubuntu noble-updates/universe amd64 Packages [386 kB]
   Get:11 http://azure.archive.ubuntu.com/ubuntu noble-updates/universe Translation-en [160 kB]
   Get:12 http://azure.archive.ubuntu.com/ubuntu noble-updates/universe amd64 c-n-f Metadata [15.0 kB]
   Get:13 http://azure.archive.ubuntu.com/ubuntu noble-security/main amd64 Packages [384 kB]
   Get:14 http://azure.archive.ubuntu.com/ubuntu noble-security/main Translation-en [84.6 kB]
   Get:15 http://azure.archive.ubuntu.com/ubuntu noble-security/main amd64 c-n-f Metadata [4708 B]
   Get:16 http://azure.archive.ubuntu.com/ubuntu noble-security/universe amd64 Packages [278 kB]
   Get:17 http://azure.archive.ubuntu.com/ubuntu noble-security/universe Translation-en [117 kB]
   Get:18 http://azure.archive.ubuntu.com/ubuntu noble-security/universe amd64 c-n-f Metadata [10.4 kB]
   Fetched 2377 kB in 0s (5788 kB/s)
   Reading package lists...
   Building dependency tree...
   Reading state information...
   32 packages can be upgraded. Run 'apt list --upgradable' to see them.
   
   WARNING: apt does not have a stable CLI interface. Use with caution in 
scripts.
   
   Reading package lists...
   Building dependency tree...
   Reading state information...
   python3 is already the newest version (3.12.3-0ubuntu2).
   python3-dev is already the newest version (3.12.3-0ubuntu2).
   python3-pip is already the newest version (24.0+dfsg-1ubuntu1).
   0 upgraded, 0 newly installed, 0 to remove and 32 not upgraded.
   error: externally-managed-environment
   
   × This environment is externally managed
   ╰─> To install Python packages system-wide, try apt install
   python3-xyz, where xyz is the package you are trying to
   install.
   
   If you wish to install a non-Debian-packaged Python package,
   create a virtual environment using python3 -m venv path/to/venv.
   Then use path/to/venv/bin/python and path/to/venv/bin/pip. Make
   sure you have python3-full installed.
   
   If you wish to install a non-Debian packaged Python application,
   it may be easiest to use pipx install xyz, which will manage a
   virtual environment for you. Make sure you have pipx installed.
   
   See /usr/share/doc/python3.12/README.venv for more information.
   
   note: If you believe this is a mistake, please contact your Python 
installation or OS distribution provider. You can override this, at the risk of 
breaking your Python installation or OS, by passing --break-system-packages.
   hint: See PEP 668 for the detailed specification.
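
For context on the error above: PEP 668 lets a distribution mark its Python 
installation as externally managed by placing an `EXTERNALLY-MANAGED` file in 
the standard-library directory, and installers are expected to refuse to 
install into such an environment unless they are running inside a virtual 
environment. The sketch below checks for that condition. The function names 
are hypothetical, and the check is a simplification of what pip does (it 
ignores `--break-system-packages`, for example).

```python
import os
import sys
import sysconfig


def in_virtual_environment() -> bool:
    """True when the interpreter runs inside a venv-style virtual environment."""
    return sys.prefix != sys.base_prefix


def externally_managed_marker() -> str:
    """Path where PEP 668 says the marker file lives for this interpreter."""
    return os.path.join(sysconfig.get_path("stdlib"), "EXTERNALLY-MANAGED")


def install_would_be_refused() -> bool:
    """Approximate the PEP 668 rule: marker present and not in a virtual env."""
    return not in_virtual_environment() and os.path.isfile(externally_managed_marker())


if install_would_be_refused():
    print("externally managed: create a virtual environment with "
          "'python3 -m venv path/to/venv' and use its pip instead")
else:
    print("no PEP 668 restriction detected for this interpreter")
```

The runner image in the log reports Ubuntu "noble" with Python 3.12, which 
ships this marker, so `python3 -m pip install -U pip` against the system 
interpreter is rejected there.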
   Error: Proces