metsw24-max opened a new issue, #50161: URL: https://github.com/apache/arrow/issues/50161
### Describe the bug, including details regarding any error messages, version, and platform.

`ReadSparseCSFIndex` (cpp/src/arrow/ipc/reader.cc) sizes its `indptr_data`/`indices_data` vectors from the tensor shape (`ndim - 1` and `ndim`) but fills them by looping over the flatbuffer-supplied `indptrBuffers()->size()`/`indicesBuffers()->size()`. These are independent fields and are never checked against `ndim`. A crafted IPC SparseTensor CSF message with more buffer entries than dimensions writes `shared_ptr<Buffer>` elements past the end of those vectors (heap out-of-bounds write). `ndim == 0` is also accepted, in which case the `ndim - 1` sizing builds a vector of size `SIZE_MAX`.

The payload path is guarded by `CheckSparseTensorBodyBufferCount`, but the file/stream path is not, so this is reachable from the public `ReadSparseTensor(io::InputStream*)` API on untrusted bytes.

`GetSparseCSFIndexMetadata` (cpp/src/arrow/ipc/metadata_internal.cc) has the same pattern: it loops over `axisOrder()->size()` while indexing `indicesBuffers()->Get(i)` without checking that the two lengths match. This is an out-of-bounds read reachable from both paths.

### Component(s)

C++

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
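
The missing consistency checks described above could look roughly like the following. This is a minimal standalone sketch, not the Arrow implementation or a proposed patch: `CsfCounts` and `ValidateCsfCounts` are hypothetical names, the counts are assumed to have already been read from the flatbuffer metadata, and rejecting `ndim == 0` outright is an assumption about the desired behavior.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the element counts taken from the message:
// the tensor rank and the sizes of the flatbuffer vectors.
struct CsfCounts {
  std::size_t ndim;                 // from the tensor shape
  std::size_t num_indptr_buffers;   // indptrBuffers()->size()
  std::size_t num_indices_buffers;  // indicesBuffers()->size()
  std::size_t axis_order_size;      // axisOrder()->size()
};

// Returns an empty string when the counts are mutually consistent,
// otherwise a short description of the first mismatch found.
std::string ValidateCsfCounts(const CsfCounts& c) {
  // Reject ndim == 0 first so that ndim - 1 below cannot wrap around.
  if (c.ndim == 0) {
    return "CSF index requires at least one dimension";
  }
  // The destination vectors are sized ndim - 1 and ndim, so the loops
  // that fill them must not run for more entries than that.
  if (c.num_indptr_buffers != c.ndim - 1) {
    return "indptrBuffers length does not match ndim - 1";
  }
  if (c.num_indices_buffers != c.ndim) {
    return "indicesBuffers length does not match ndim";
  }
  // The metadata loop indexes indicesBuffers by an axisOrder-sized
  // counter, so the two lengths must agree.
  if (c.axis_order_size != c.num_indices_buffers) {
    return "axisOrder length does not match indicesBuffers length";
  }
  return "";
}
```

In a real fix, a check of this kind would presumably run before either vector is filled and return an invalid-data error to the caller; where exactly it belongs in the file/stream path is left open here.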
