Re: [PR] HBASE-29585 Add row-level cache for the get operation [hbase]

via GitHub Sun, 14 Sep 2025 19:12:21 -0700


EungsopYoo commented on PR #7291:
URL: https://github.com/apache/hbase/pull/7291#issuecomment-3290136519


   > > > 3. Why not simply evict the row that got mutated? I guess we cannot 
simply override it in the cache because mutation can happen on individual cells.
   > > 
   > > 
   > > I didn’t fully understand the intention behind your question. Could you 
please explain it in more detail?
   > 
   > Rather than blocking writes to the row cache during updates/bulkload, can 
we simply make the updates evict/override the row from the cache if it's 
already there? For puts, we shouldn't need to worry about barries, if we make 
sure we don't cache the row if it's in the memstore only, but we should to make 
sure to remove it from the row cache because the cache would now be stale. For 
bulkloads, I guess we only need to make sure to evict the rows for affected 
regions after the bulkload has been committed.
   
   When the data exists in both the MemStore and the StoreFiles, we need to 
store it in the row cache to avoid result merging. In that case, due to the 
following issues, a barrier was introduced.
   
   | Thread | Time 1 | Time 2 | Time 3 | Time 4 |
   |-|-|-|-|-|
   | th1 | delete row1 from RowCache | Put row1 to Region | write row1 to 
RowCache | |
   | th2 | | delete row1 from RowCache | Put row1 to Region | write row1 to 
RowCache |
   | th3 | Get for row1 not from RowCache. Good | Get for row1 not from 
RowCache. Good | Get for row1 from stale RowCache. Bad | Get for row1 not from 
RowCache. Good |
   
   It would be more efficient to do as you mentioned when doing a bulkload.
   > > > 4. Are we accepting to have data duplicated over separate caches? I 
don't see any logic to avoid caching a whole block containing a region for a 
Get in the L2 cache, still we'll be cache the row in the row cache. Similarly, 
we might re-cache a row that's in the memstore in the row cache.
   > > 
   > > 
   > > This is in the same L1/L2 context as your comment 2, correct? If so, I 
haven’t considered that aspect yet, but I’ll start thinking about how to handle 
it.
   > > Since the row cache is only enabled when there are at least two HFiles, 
rows that exist only in the MemStore are not cached. However, when there are 
two or more HFiles, rows in MemStore are also added again to the row cache. 
This is an intentional design choice, aimed at bypassing the process of 
generating results via SegmentScanner and StoreFileScanner, and instead serving 
Get requests directly from the cache.
   > 
   > Per other comments, agree it's fine to have the row in the row cache and 
its' block also in the block cache. We need to decide if we want to add blocks 
to the block cache when doing Get, or Get should cache only in the row cache? 
Also, should we avoid caching if the row is the memstore? Could be challenging 
in the current design of caching the whole row, because memstore migh have only 
updates for few cells within a row.
   
   I already answered this in another comment, but I’ll respond here as well. 
   
   I think it’s better to put it into the BlockCache when doing a Get, 
according to the BlockCache setting.
   
   It is more efficient not to create a row cache when the cells to be fetched 
exist only in the MemStore. However, if the cells to be fetched are in both the 
MemStore and the StoreFiles, then creating a row cache is efficient to avoid 
result merging. 
   
   I’ll give some more thought on how we can achieve this.
   
   > > > 5. One problem of adding such small units (a single row) in the cache 
is that we need to keep a map index for each entry. So, the smaller the row in 
size, more rows would fit in the cache, but more key objects would be retained 
in the map. In your tests, assuming the default block cache size of 40% of the 
heap, it would give a 12.8GB of block cache. Have you managed to measure the 
block cache usage by the row cache, in terms of number of rows in the cache, 
byte size of the L1 cache and the total heap usage? Maybe wort collecting a 
heapdump to analyse the map index size in the heap.
   > > 
   > > 
   > > I slightly modified the LruBlockCache code to record the row cache size 
and entry count. The row cache occupies 268.67MB with 338,602 entries. The 
average size of a single row cache entry is 830 bytes. Within the overall 
BlockCache, the row cache accounts for 45% by entry count and 2% by size.
   > > ```
   > > 2025-09-12T09:08:44,112 INFO  [LruBlockCacheStatsExecutor {}] 
hfile.LruBlockCache: totalSize=12.80 GB, usedSize=12.48 GB, freeSize=329.41 MB, 
max=12.80 GB, blockCount=752084, accesses=35942999, hits=27403857, 
hitRatio=76.24%, , cachingAccesses=35942954, cachingHits=27403860, 
cachingHitsRatio=76.24%, evictions=170, evicted=5806436, 
evictedPerRun=34155.50588235294, rowBlockCount=338602, rowBlockSize=268.67 MB
   > > ```
   > 
   > What if more rows get cached, over time, as more gets for different rows 
are executed? It could lead to many rows in the cache, and many more objects in 
the map to index it. In the recent past. we've seen some heap issues when 
having very large file based bucket cache and small compressed blocks. I guess 
we could face similar problems here too.
   
   Okay. Then I’ll take a heap dump and check the size of the map’s index.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] HBASE-29585 Add row-level cache for the get operation [hbase]

Reply via email to