wiredfool opened a new issue, #43831:
URL: https://github.com/apache/arrow/issues/43831

   ### Describe the usage question you have. Please include as many useful 
details as  possible.
   
   
   I'm implementing support for the Arrow PyCapsule Protocol in Pillow, as 
referenced here: https://github.com/python-pillow/Pillow/issues/8329, 
implementation here: https://github.com/python-pillow/Pillow/pull/8330
   
   There are a couple of implementation questions that arise from it: 
   
   Internally, we store images as a binary chunk, in full raster lines up to 
16MB. Above that, the images overflow to the next chunk. There's a variable 
amount of dead space between the end of the last scan line up to the 16mb 
point. So for the simple, small image case, we can just point at this memory as 
the array buffer. 
   
   Is an `__arrow_c_stream__` the best way to implement what would effectively 
be chunked arrays?  Is there a way in the protocol to fall back from the 
`__arrow_c_array__` to the stream on err/null? For our purposes, a stream is 
likely as lightweight to provide as an array. 
   
   Is there a preferred array representation of Image raster data?  There are a 
few possible, but I'd like to provide something that looks vaguely like a 
standard. FWIW, at the moment, the numpy array interface does return a shaped 
array, so the dimensions of the image are available. 
     - Flat array `arr[(y*(width)+x)*4 + channel]`
     - or Fixed Pixel array `arr[y*(width)+x][channel]`? 
     - Would it make sense to embed this into a set of FixedArrays that are a 
line length, `arr[y][x][channel]`?
   
   ### Component(s)
   
   Python


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to