jorisvandenbossche opened a new issue, #43511: URL: https://github.com/apache/arrow/issues/43511
For https://github.com/apache/arrow/issues/41665 (implemented for Array in https://github.com/apache/arrow/issues/42112 / https://github.com/apache/arrow/pull/42113), we currently use the following assertion to check whether the data is on the CPU (and thus supports the operation in question, which accesses the data's address):

https://github.com/apache/arrow/blob/d4d92e4896d8108aef25c6ef199e87890d027b22/python/pyarrow/array.pxi#L2035-L2037

This checks explicitly for the CPU device allocation type. However, this means that, for example, data with a CUDA_HOST device type, which is actually accessible from the CPU, will trigger this error:

```python
import numpy as np
import pyarrow as pa
from pyarrow import cuda

# create Array with CudaHost buffer
buf = cuda.new_host_buffer(5*8)
np.frombuffer(buf, dtype=np.int64)[:] = range(5)
arr = pa.Array.from_buffers(pa.int64(), 5, [None, buf])

# inspect the array
>>> arr
<pyarrow.lib.Int64Array object at 0x7f24b6e02e00>
[
  0,
  1,
  2,
  3,
  4
]
>>> arr.device_type
<DeviceAllocationType.CUDA_HOST: 3>

# calling a method that checks _assert_cpu errors
>>> arr.sum()
...
NotImplementedError: Implemented only for data on CPU device

# but the underlying buffer itself "is_cpu"
>>> arr.buffers()[1]
<pyarrow.Buffer address=0x7f24c1600400 size=80 is_cpu=True is_mutable=True>
>>> arr.buffers()[1].is_cpu
True
>>> arr.buffers()[1].device_type
<DeviceAllocationType.CUDA_HOST: 3>
```

At the buffer level we have this `is_cpu` attribute available, but currently at the Array level we only have `device_type`. We could add the CUDA_HOST device allocation type explicitly to the check above, but ideally we would use something more general?

(cc @danepitkin)
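To make the "something more general" idea concrete, here is a rough sketch of a check based on the buffers' `is_cpu` flag instead of the array's device type. This is only an illustration, not what pyarrow does today: `all_buffers_cpu_accessible`, `FakeBuffer` and `FakeArray` are hypothetical names, and the two stand-in classes exist only so the snippet runs without a CUDA device. The helper itself relies on nothing beyond `buffers()` and `is_cpu`, both of which appear in the example above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FakeBuffer:
    """Hypothetical stand-in for pyarrow.Buffer; only models `is_cpu`."""
    is_cpu: bool


@dataclass
class FakeArray:
    """Hypothetical stand-in for pyarrow.Array; only models `buffers()`."""
    _buffers: List[Optional[FakeBuffer]]

    def buffers(self) -> List[Optional[FakeBuffer]]:
        return self._buffers


def all_buffers_cpu_accessible(arr) -> bool:
    """Hypothetical helper: True if every non-null buffer reports is_cpu."""
    # A missing buffer (e.g. no validity bitmap) shows up as None and is skipped.
    return all(buf is None or buf.is_cpu for buf in arr.buffers())


# Mirrors the CUDA_HOST case above: no validity buffer, host-accessible data.
cuda_host_like = FakeArray([None, FakeBuffer(is_cpu=True)])
# Mirrors device-only memory, where the buffer does not report is_cpu.
cuda_device_like = FakeArray([None, FakeBuffer(is_cpu=False)])

print(all_buffers_cpu_accessible(cuda_host_like))    # True
print(all_buffers_cpu_accessible(cuda_device_like))  # False
```

A check along these lines would accept CUDA_HOST data without listing that device type explicitly, at the cost of walking the buffers on every call. Whether that cost is acceptable, and whether the flag should instead be exposed once at the Array level, is an open question.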
