metsw24-max opened a new issue, #50161:
URL: https://github.com/apache/arrow/issues/50161

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   `ReadSparseCSFIndex` (cpp/src/arrow/ipc/reader.cc) sizes its 
`indptr_data`/`indices_data` vectors from the tensor shape (`ndim - 1` and 
`ndim`) but fills them by looping over the flatbuffer-supplied 
`indptrBuffers()->size()`/`indicesBuffers()->size()`, which are independent 
fields and never checked against `ndim`. A crafted IPC SparseTensor CSF message 
with more buffer entries than dimensions writes `shared_ptr<Buffer>` elements 
past the end of those vectors (heap out-of-bounds write). `ndim == 0` is also 
accepted; the `ndim - 1` size then wraps around and builds a vector of size 
`SIZE_MAX`.
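
As a rough illustration of the missing validation, the sketch below checks the two buffer counts against `ndim` before anything is written. It is a standalone model, not Arrow code: `CsfCounts` and `ValidateCsfCounts` are hypothetical names, and the real fix would presumably return an `arrow::Status` from inside `ReadSparseCSFIndex` rather than an optional string.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical stand-in for the three counts involved; not an Arrow type.
struct CsfCounts {
  int64_t ndim;        // taken from the tensor shape
  size_t num_indptr;   // models indptrBuffers()->size()
  size_t num_indices;  // models indicesBuffers()->size()
};

// Returns an error message if the counts are inconsistent, std::nullopt otherwise.
std::optional<std::string> ValidateCsfCounts(const CsfCounts& c) {
  // Reject ndim < 1 first so that ndim - 1 below cannot wrap around.
  if (c.ndim < 1) {
    return std::string("CSF index requires ndim >= 1");
  }
  const size_t ndim = static_cast<size_t>(c.ndim);
  if (c.num_indptr != ndim - 1) {
    return std::string("indptrBuffers count must equal ndim - 1");
  }
  if (c.num_indices != ndim) {
    return std::string("indicesBuffers count must equal ndim");
  }
  return std::nullopt;
}
```

With a check of this kind in place, the fill loops could only run over counts that already match the vector sizes.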
   
   The payload path is guarded by `CheckSparseTensorBodyBufferCount`, but the 
file/stream path is not, so this is reachable from the public 
`ReadSparseTensor(io::InputStream*)` API on untrusted bytes.
   
   `GetSparseCSFIndexMetadata` (cpp/src/arrow/ipc/metadata_internal.cc) has the 
same kind of flaw: it loops over `axisOrder()->size()` while indexing 
`indicesBuffers()->Get(i)` without checking that the two lengths match. The 
result is an out-of-bounds read reachable from both paths.
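
A minimal sketch of the corresponding length check follows, again as a standalone model rather than Arrow code. `BufferMeta` and `CollectIndicesSizes` are hypothetical, and what the loop does with each entry (copying a length) is only illustrative; the point is that the loop bound and the indexed container are compared before the loop runs.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for one flatbuffer buffer entry; not an Arrow type.
struct BufferMeta {
  int64_t offset;
  int64_t length;
};

// Collects one buffer length per axis into `out`.
// Returns false and leaves `out` untouched if the two lists differ in length.
bool CollectIndicesSizes(const std::vector<int32_t>& axis_order,
                         const std::vector<BufferMeta>& indices_buffers,
                         std::vector<int64_t>* out) {
  // The loop below is bounded by axis_order but indexes indices_buffers,
  // so the two sizes must agree before iterating.
  if (axis_order.size() != indices_buffers.size()) {
    return false;
  }
  std::vector<int64_t> sizes;
  sizes.reserve(axis_order.size());
  for (size_t i = 0; i < axis_order.size(); ++i) {
    sizes.push_back(indices_buffers[i].length);
  }
  *out = std::move(sizes);
  return true;
}
```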
   
   ### Component(s)
   
   C++

