mderoy opened a new issue, #43304:
URL: https://github.com/apache/arrow/issues/43304
### Describe the bug, including details regarding any error messages, version, and platform.
There appears to be an issue when populating cache entries that allows
a null value to be inserted.
```
#3 <signal handler called>
#4 0x00007f80cb354fb3 in arrow::Buffer::Buffer (size=34443, offset=0,
parent=<synthetic pointer>..., this=0x4737e60)
at
/home/mkderoy/ws/ws2/main/obj/cc/generic/x86_64-generic-linux-gnu/include/c++/9.2.0/bits/shared_ptr_base.h:1012
#5 __gnu_cxx::new_allocator<arrow::Buffer>::construct<arrow::Buffer,
std::shared_ptr<arrow::Buffer> const&, long const&, long const&>
(this=<optimized out>, __p=0x4737e60)
at
/home/mkderoy/ws/ws2/main/obj/cc/generic/x86_64-generic-linux-gnu/include/c++/9.2.0/ext/new_allocator.h:147
#6 std::allocator_traits<std::allocator<arrow::Buffer>
>::construct<arrow::Buffer, std::shared_ptr<arrow::Buffer> const&, long const&,
long const&> (__a=..., __p=0x4737e60)
at
/home/mkderoy/ws/ws2/main/obj/cc/generic/x86_64-generic-linux-gnu/include/c++/9.2.0/bits/alloc_traits.h:484
#7 std::_Sp_counted_ptr_inplace<arrow::Buffer,
std::allocator<arrow::Buffer>,
(__gnu_cxx::_Lock_policy)2>::_Sp_counted_ptr_inplace<std::shared_ptr<arrow::Buffer>
const&, long const&, long const&> (__a=..., this=0x4737e50) at
/home/mkderoy/ws/ws2/main/obj/cc/generic/x86_64-generic-linux-gnu/include/c++/9.2.0/bits/shared_ptr_base.h:548
#8
std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<arrow::Buffer,
std::allocator<arrow::Buffer>, std::shared_ptr<arrow::Buffer> const&, long
const&, long const&> (__a=...,
__p=<optimized out>, this=<optimized out>) at
/home/mkderoy/ws/ws2/main/obj/cc/generic/x86_64-generic-linux-gnu/include/c++/9.2.0/bits/shared_ptr_base.h:679
#9 std::__shared_ptr<arrow::Buffer,
(__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<arrow::Buffer>,
std::shared_ptr<arrow::Buffer> const&, long const&, long const&> (__tag=...,
this=<optimized out>) at
/home/mkderoy/ws/ws2/main/obj/cc/generic/x86_64-generic-linux-gnu/include/c++/9.2.0/bits/shared_ptr_base.h:1344
#10
std::shared_ptr<arrow::Buffer>::shared_ptr<std::allocator<arrow::Buffer>,
std::shared_ptr<arrow::Buffer> const&, long const&, long const&> (__tag=...,
this=<optimized out>)
at
/home/mkderoy/ws/ws2/main/obj/cc/generic/x86_64-generic-linux-gnu/include/c++/9.2.0/bits/shared_ptr.h:359
#11 std::allocate_shared<arrow::Buffer, std::allocator<arrow::Buffer>,
std::shared_ptr<arrow::Buffer> const&, long const&, long const&> (__a=...)
at
/home/mkderoy/ws/ws2/main/obj/cc/generic/x86_64-generic-linux-gnu/include/c++/9.2.0/bits/shared_ptr.h:702
#12 std::make_shared<arrow::Buffer, std::shared_ptr<arrow::Buffer> const&,
long const&, long const&> ()
at
/home/mkderoy/ws/ws2/main/obj/cc/generic/x86_64-generic-linux-gnu/include/c++/9.2.0/bits/shared_ptr.h:718
#13 arrow::SliceBuffer (length=34443, offset=0, buffer=<synthetic
pointer>...) at /home/mkderoy/ws/arrow/cpp/src/arrow/buffer.h:329
#14 arrow::io::internal::ReadRangeCache::Impl::Read (this=<optimized out>,
range=...) at /home/mkderoy/ws/arrow/cpp/src/arrow/io/caching.cc:221
#15 0x00007f80cb351d6e in arrow::io::internal::ReadRangeCache::Read
(this=<optimized out>, range=...)
at
/home/mkderoy/ws/ws2/main/obj/cc/generic/x86_64-generic-linux-gnu/include/c++/9.2.0/bits/unique_ptr.h:352
#16 0x00007f80cd266d6c in parquet::SerializedRowGroup::GetColumnPageReader
(this=0x4665940, i=4) at
/home/mkderoy/ws/arrow/cpp/src/parquet/file_reader.cc:211
#17 0x00007f80cd25fc5d in parquet::RowGroupReader::Column (this=0x4749540,
i=i@entry=4)
at
/home/mkderoy/ws/ws2/main/obj/cc/generic/x86_64-generic-linux-gnu/include/c++/9.2.0/bits/unique_ptr.h:352
```
When looking at the buffer retrieved from the future in frame 13 (the call at
line 221 of caching.cc)
`return SliceBuffer(std::move(buf), range.offset - it->range.offset, range.length);`
you'll see a shared pointer to 0x0:
```
$52 = {
<std::__shared_ptr<arrow::Buffer, (__gnu_cxx::_Lock_policy)2>> = {
<std::__shared_ptr_access<arrow::Buffer, (__gnu_cxx::_Lock_policy)2,
false, false>> = {<No data fields>},
members of std::__shared_ptr<arrow::Buffer, (__gnu_cxx::_Lock_policy)2>:
_M_ptr = 0x0,
_M_refcount = {
_M_pi = 0x0
}
}, <No data fields>}
```
Our application uses the Parquet library directly. It prebuffers the relevant
columns in the row group, waits for prebuffering to complete, then opens column
readers for those columns. While opening these column readers we sometimes
encounter a null cache entry. At this point we've already called
WhenBuffered() and the row group should be completely prebuffered.
The future in question is not marked failed.
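As a debugging aid, one option is to check the resolved buffer for null before slicing it, so the failure surfaces as an error with context instead of a crash. The following is a minimal sketch of that idea using only the standard library. `Buffer`, `CacheEntry`, `ReadOutcome`, `MakeResolvedEntry`, and `ReadChecked` are hypothetical stand-ins invented for illustration. They are not Arrow types or functions, and the `wait` callback only stands in for waiting on the entry's future.

```cpp
#include <cstdint>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a byte buffer (not arrow::Buffer).
struct Buffer {
  std::vector<std::uint8_t> bytes;
};

// Hypothetical stand-in for a cache entry: the file offset the entry starts
// at, plus a callback that stands in for waiting on the entry's future.
struct CacheEntry {
  std::int64_t offset;
  std::function<std::shared_ptr<Buffer>()> wait;
};

// Result of a checked read: either a slice or an error message.
struct ReadOutcome {
  std::shared_ptr<Buffer> slice;  // null on failure
  std::string error;              // empty on success
};

// Builds an entry whose "future" has already resolved to `buf`.
CacheEntry MakeResolvedEntry(std::int64_t offset, std::shared_ptr<Buffer> buf) {
  return CacheEntry{offset, [buf = std::move(buf)]() { return buf; }};
}

// Resolves the entry and slices [offset, offset + length) out of it, but
// reports a null buffer or an out-of-bounds request instead of dereferencing.
ReadOutcome ReadChecked(const CacheEntry& entry, std::int64_t offset,
                        std::int64_t length) {
  std::shared_ptr<Buffer> buf = entry.wait();
  if (!buf) {
    return {nullptr, "cache entry at offset " + std::to_string(entry.offset) +
                         " resolved to a null buffer"};
  }
  const std::int64_t start = offset - entry.offset;
  const std::int64_t size = static_cast<std::int64_t>(buf->bytes.size());
  if (start < 0 || length < 0 || start + length > size) {
    return {nullptr, "requested range is outside the cached entry"};
  }
  auto slice = std::make_shared<Buffer>();
  slice->bytes.assign(buf->bytes.begin() + start,
                      buf->bytes.begin() + start + length);
  return {slice, ""};
}
```

Logging the entry offset together with the requested range at the point of failure may help narrow down which prebuffered range ended up with the null value.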
Cache options:
$57 = {
static kDefaultIdealBandwidthUtilizationFrac = 0.90000000000000002,
static kDefaultMaxIdealRequestSizeMib = 64,
hole_size_limit = 8192,
range_size_limit = 33554432,
lazy = false,
prefetch_limit = 0
}
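For readers unfamiliar with these settings: `hole_size_limit` and `range_size_limit` control how nearby read ranges are combined into larger cache entries, which is why a requested range can start at a different offset than the entry that serves it. The sketch below is a simplified illustration of that kind of coalescing, written from the documented meaning of the two options. `ByteRange` and `CoalesceSketch` are hypothetical names, and this is not Arrow's actual implementation.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a read range (offset and length in bytes).
struct ByteRange {
  std::int64_t offset;
  std::int64_t length;
};

// Simplified coalescing: sort ranges by offset, then merge a range into the
// previous one when the gap between them is at most hole_size_limit and the
// merged range would be no larger than range_size_limit.
std::vector<ByteRange> CoalesceSketch(std::vector<ByteRange> ranges,
                                      std::int64_t hole_size_limit,
                                      std::int64_t range_size_limit) {
  std::sort(ranges.begin(), ranges.end(),
            [](const ByteRange& a, const ByteRange& b) {
              return a.offset < b.offset;
            });
  std::vector<ByteRange> out;
  for (const ByteRange& r : ranges) {
    if (r.length <= 0) continue;  // skip empty ranges
    if (!out.empty()) {
      ByteRange& last = out.back();
      const std::int64_t last_end = last.offset + last.length;
      const std::int64_t hole = r.offset - last_end;  // negative if overlapping
      const std::int64_t merged_end = std::max(last_end, r.offset + r.length);
      if (hole <= hole_size_limit &&
          merged_end - last.offset <= range_size_limit) {
        last.length = merged_end - last.offset;  // extend the previous range
        continue;
      }
    }
    out.push_back(r);
  }
  return out;
}
```

With the values shown above (8192 and 33554432), two column chunks separated by a gap of a few kilobytes would share one entry in this sketch, while chunks further apart would get separate entries.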
I do not think it will be easy for someone to repro our problem. It only
happens after several hours with lots of parallel jobs. By posting this I'm
mainly hoping for some pointers for debugging and fixing the issue. We're
currently using Arrow 14, but I don't see any commits in the relevant files
(caching.cc, future.cc, etc.) that may have addressed this.
### Component(s)
C++, Parquet