Hey all,

We have a use-case for HFile.Reader at my company. The use-case in question
is scanning many hfiles for a small subset of cells, so it mostly is
just iterating a large number of HFiles and discarding most of the cells.
We recently upgraded that deployable from super old cdh 5.16 (hbase
1.2-ish) to hbase 2.5.3. In doing so, we noticed a performance regression
of around 4x. I imagine this regression would also affect
ClientSideRegionScanner.

We did some profiling and noticed that a large amount of time is spent in
SingleByteBuff.checkRefCnt. It seems like every SingleByteBuff method calls
checkRefCnt and this checks compares a volatile
in AtomicIntegerFieldUpdater in the netty code.

I believe ref counting is mostly necessary for off-heap buffers, but
on-heap buffers are also wrapped in SingleByteBuff and so also go through
checkRefCnt. We removed the checkRefCnt call, and the regression
disappeared.

We created a simple test method which just does HFile.createReader,
reader.getScanner(), and then iterates the scanner counting the total
cells. The test reads an hfile with 100M cells and takes  over 1 minute
with checkRefCnt. Removing checkRefCnt brings the runtime down to 20s.

It's worth noting that the regression is most prominent on java17. It's
slightly less obvious on java11, with runtime being 40s vs 28s.

Thoughts on updating SingleByteBuff to skip the checkRefCnt call for
on-heap buffers? We can handle this in the constructor, when wrapping an
on-heap buffer here [1].

[1]
https://github.com/apache/hbase/blob/master/hbase-common/src/main/java/org/apache/hadoop/hbase/io/ByteBuffAllocator.java#L300

Reply via email to