mderoy opened a new issue, #43304: URL: https://github.com/apache/arrow/issues/43304
### Describe the bug, including details regarding any error messages, version, and platform.

There appears to be an issue populating cache entries that allows a null value to be inserted.

```
#3  <signal handler called>
#4  0x00007f80cb354fb3 in arrow::Buffer::Buffer (size=34443, offset=0, parent=<synthetic pointer>..., this=0x4737e60) at /home/mkderoy/ws/ws2/main/obj/cc/generic/x86_64-generic-linux-gnu/include/c++/9.2.0/bits/shared_ptr_base.h:1012
#5  __gnu_cxx::new_allocator<arrow::Buffer>::construct<arrow::Buffer, std::shared_ptr<arrow::Buffer> const&, long const&, long const&> (this=<optimized out>, __p=0x4737e60) at /home/mkderoy/ws/ws2/main/obj/cc/generic/x86_64-generic-linux-gnu/include/c++/9.2.0/ext/new_allocator.h:147
#6  std::allocator_traits<std::allocator<arrow::Buffer> >::construct<arrow::Buffer, std::shared_ptr<arrow::Buffer> const&, long const&, long const&> (__a=..., __p=0x4737e60) at /home/mkderoy/ws/ws2/main/obj/cc/generic/x86_64-generic-linux-gnu/include/c++/9.2.0/bits/alloc_traits.h:484
#7  std::_Sp_counted_ptr_inplace<arrow::Buffer, std::allocator<arrow::Buffer>, (__gnu_cxx::_Lock_policy)2>::_Sp_counted_ptr_inplace<std::shared_ptr<arrow::Buffer> const&, long const&, long const&> (__a=..., this=0x4737e50) at /home/mkderoy/ws/ws2/main/obj/cc/generic/x86_64-generic-linux-gnu/include/c++/9.2.0/bits/shared_ptr_base.h:548
#8  std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<arrow::Buffer, std::allocator<arrow::Buffer>, std::shared_ptr<arrow::Buffer> const&, long const&, long const&> (__a=..., __p=<optimized out>, this=<optimized out>) at /home/mkderoy/ws/ws2/main/obj/cc/generic/x86_64-generic-linux-gnu/include/c++/9.2.0/bits/shared_ptr_base.h:679
#9  std::__shared_ptr<arrow::Buffer, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<arrow::Buffer>, std::shared_ptr<arrow::Buffer> const&, long const&, long const&> (__tag=..., this=<optimized out>) at /home/mkderoy/ws/ws2/main/obj/cc/generic/x86_64-generic-linux-gnu/include/c++/9.2.0/bits/shared_ptr_base.h:1344
#10 std::shared_ptr<arrow::Buffer>::shared_ptr<std::allocator<arrow::Buffer>, std::shared_ptr<arrow::Buffer> const&, long const&, long const&> (__tag=..., this=<optimized out>) at /home/mkderoy/ws/ws2/main/obj/cc/generic/x86_64-generic-linux-gnu/include/c++/9.2.0/bits/shared_ptr.h:359
#11 std::allocate_shared<arrow::Buffer, std::allocator<arrow::Buffer>, std::shared_ptr<arrow::Buffer> const&, long const&, long const&> (__a=...) at /home/mkderoy/ws/ws2/main/obj/cc/generic/x86_64-generic-linux-gnu/include/c++/9.2.0/bits/shared_ptr.h:702
#12 std::make_shared<arrow::Buffer, std::shared_ptr<arrow::Buffer> const&, long const&, long const&> () at /home/mkderoy/ws/ws2/main/obj/cc/generic/x86_64-generic-linux-gnu/include/c++/9.2.0/bits/shared_ptr.h:718
#13 arrow::SliceBuffer (length=34443, offset=0, buffer=<synthetic pointer>...) at /home/mkderoy/ws/arrow/cpp/src/arrow/buffer.h:329
#14 arrow::io::internal::ReadRangeCache::Impl::Read (this=<optimized out>, range=...) at /home/mkderoy/ws/arrow/cpp/src/arrow/io/caching.cc:221
#15 0x00007f80cb351d6e in arrow::io::internal::ReadRangeCache::Read (this=<optimized out>, range=...) at /home/mkderoy/ws/ws2/main/obj/cc/generic/x86_64-generic-linux-gnu/include/c++/9.2.0/bits/unique_ptr.h:352
#16 0x00007f80cd266d6c in parquet::SerializedRowGroup::GetColumnPageReader (this=0x4665940, i=4) at /home/mkderoy/ws/arrow/cpp/src/parquet/file_reader.cc:211
#17 0x00007f80cd25fc5d in parquet::RowGroupReader::Column (this=0x4749540, i=i@entry=4) at /home/mkderoy/ws/ws2/main/obj/cc/generic/x86_64-generic-linux-gnu/include/c++/9.2.0/bits/unique_ptr.h:352
```

When looking at the buffer retrieved from the future in frame 13 (the `SliceBuffer` call at line 221 of caching.cc, `return SliceBuffer(std::move(buf), range.offset - it->range.offset, range.length);`), you'll see a shared pointer to 0x0:

```
$52 = {
  <std::__shared_ptr<arrow::Buffer, (__gnu_cxx::_Lock_policy)2>> = {
    <std::__shared_ptr_access<arrow::Buffer, (__gnu_cxx::_Lock_policy)2, false, false>> = {<No data fields>},
    members of std::__shared_ptr<arrow::Buffer, (__gnu_cxx::_Lock_policy)2>:
    _M_ptr = 0x0,
    _M_refcount = {
      _M_pi = 0x0
    }
  }, <No data fields>}
```

Our application uses the Parquet library directly. It prebuffers the relevant columns in the row group, waits for prebuffering to finish, and then opens column readers for those columns. While opening these column readers we sometimes encounter a null cache entry. At this point we've already called WhenBuffered() and the row group should be completely prebuffered. The future in question is not marked failed.

Cache options: `$57 = { static kDefaultIdealBandwidthUtilizationFrac = 0.90000000000000002, static kDefaultMaxIdealRequestSizeMib = 64, hole_size_limit = 8192, range_size_limit = 33554432, lazy = false, prefetch_limit = 0 }`

I do not think it will be easy for someone to reproduce our problem. It only happens after several hours with lots of parallel jobs. By posting this I'm mainly hoping for some pointers for debugging and fixing the issue.

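One possible debugging aid would be a null check between taking the buffer out of the finished cache entry and slicing it, so that the crash becomes a reportable error carrying the requested range. The following is only a minimal sketch of that idea with stand-in types. `FakeBuffer`, `FinishedEntry`, `ReadResult`, and `CheckedSlice` are hypothetical names for illustration; they are not Arrow's `Buffer`, `Future`, or `ReadRangeCache` code.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>

// Stand-in for a byte buffer. This is NOT arrow::Buffer.
struct FakeBuffer {
  std::shared_ptr<FakeBuffer> parent;  // keeps the sliced-from buffer alive
  std::int64_t offset = 0;
  std::int64_t size = 0;
};

// Stand-in for a completed cache entry: the read has finished, and
// `failed` records whether it finished with an error.
struct FinishedEntry {
  bool failed;
  std::shared_ptr<FakeBuffer> value;
};

// Either a slice or a diagnostic message.
struct ReadResult {
  std::shared_ptr<FakeBuffer> buffer;  // null on error
  std::string error;                   // empty on success
  bool ok() const { return buffer != nullptr; }
};

// Hypothetical checked slice: validates the entry before touching the buffer.
ReadResult CheckedSlice(FinishedEntry entry, std::int64_t offset,
                        std::int64_t length) {
  assert(offset >= 0 && length >= 0);
  if (entry.failed) {
    return {nullptr, "cache entry finished with an error"};
  }
  // The case from the report: the entry is not failed, yet holds no buffer.
  if (entry.value == nullptr) {
    return {nullptr, "cache entry is not failed but holds a null buffer (offset=" +
                         std::to_string(offset) +
                         ", length=" + std::to_string(length) + ")"};
  }
  if (offset + length > entry.value->size) {
    return {nullptr, "requested slice exceeds the cached buffer"};
  }
  auto slice = std::make_shared<FakeBuffer>();
  slice->parent = std::move(entry.value);
  slice->offset = offset;
  slice->size = length;
  return {std::move(slice), ""};
}
```

With a guard of this shape, calling `CheckedSlice(FinishedEntry{false, nullptr}, 0, 34443)` returns an error message instead of dereferencing a null pointer, which would at least show which range and entry were affected when the problem occurs.
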
We're using Arrow 14 currently, but I don't see any relevant commits that may have addressed this in the relevant files (caching.cc, future.cc, etc.).

### Component(s)

C++, Parquet