For those following along, I’ve submitted an initial PR for this issue.
https://github.com/apache/hbase/pull/5104

On Mon, Mar 13, 2023 at 11:46 AM Bryan Beaudreault <[email protected]>
wrote:

> Thanks everyone for chiming in. I submitted
> https://issues.apache.org/jira/browse/HBASE-27710 and will work on it in
> the near future.
>
> This does make me wonder what overhead checkRefCount causes in off-heap
> cases too, but we just dont see it in profiling because the server does so
> much other work. It has me thinking if there are other optimizations we
> could do there. For example, currently we do checkRefCount for every single
> ByteBuff operation (slice/duplicate/get/etc/etc). A lot of the buffer
> operations happen before creating a Cell for return to user in
> HFileBlock/DBE. I wonder if we could do all the parsing we need in
> HFileBlock/DBE without checkRefCount, do a single checkRefCount call when
> we create the Cell, then enable checkRefCount for all future ByteBuff calls
> before returning a cell to upper levels. This would probably be a larger
> project, so not investigating now. Just an idea i had.
>
> On Mon, Mar 13, 2023 at 2:06 AM Viraj Jasani <[email protected]> wrote:
>
>> +1, as the numbers suggest significant perf improvement.
>>
>>
>> On Thu, Mar 9, 2023 at 9:36 AM Bryan Beaudreault <[email protected]
>> >
>> wrote:
>>
>> > Hey all,
>> >
>> > We have a use-case for HFile.Reader at my company. The use-case in
>> question
>> > is scanning many hfiles for a small subset of cells, so it mostly is
>> > just iterating a large number of HFiles and discarding most of the
>> cells.
>> > We recently upgraded that deployable from super old cdh 5.16 (hbase
>> > 1.2-ish) to hbase 2.5.3. In doing so, we noticed a performance
>> regression
>> > of around 4x. I imagine this regression would also affect
>> > ClientSideRegionScanner.
>> >
>> > We did some profiling and noticed that a large amount of time is spent
>> in
>> > SingleByteBuff.checkRefCnt. It seems like every SingleByteBuff method
>> calls
>> > checkRefCnt and this checks compares a volatile
>> > in AtomicIntegerFieldUpdater in the netty code.
>> >
>> > I believe ref counting is mostly necessary for off-heap buffers, but
>> > on-heap buffers are also wrapped in SingleByteBuff and so also go
>> through
>> > checkRefCnt. We removed the checkRefCnt call, and the regression
>> > disappeared.
>> >
>> > We created a simple test method which just does HFile.createReader,
>> > reader.getScanner(), and then iterates the scanner counting the total
>> > cells. The test reads an hfile with 100M cells and takes  over 1 minute
>> > with checkRefCnt. Removing checkRefCnt brings the runtime down to 20s.
>> >
>> > It's worth noting that the regression is most prominent on java17. It's
>> > slightly less obvious on java11, with runtime being 40s vs 28s.
>> >
>> > Thoughts on updating SingleByteBuff to skip the checkRefCnt call for
>> > on-heap buffers? We can handle this in the constructor, when wrapping an
>> > on-heap buffer here [1].
>> >
>> > [1]
>> >
>> >
>> https://github.com/apache/hbase/blob/master/hbase-common/src/main/java/org/apache/hadoop/hbase/io/ByteBuffAllocator.java#L300
>> >
>>
>

Reply via email to