[jira] [Updated] (HBASE-20525) Refactoring the code of read path

Duo Zhang (Jira) Sun, 31 Aug 2025 04:43:34 -0700


     [ 
https://issues.apache.org/jira/browse/HBASE-20525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Duo Zhang updated HBASE-20525:
------------------------------
    Fix Version/s: 4.0.0-alpha-1
                       (was: 3.0.0-beta-2)

> Refactoring the code of read path
> ---------------------------------
>
>                 Key: HBASE-20525
>                 URL: https://issues.apache.org/jira/browse/HBASE-20525
>             Project: HBase
>          Issue Type: Umbrella
>          Components: Scanners
>            Reporter: Duo Zhang
>            Priority: Critical
>             Fix For: 4.0.0-alpha-1
>
>
> The known problems of the current implementation:
> 1. 'Seek or skip' should be decided at the StoreFileScanner level, not in 
> StoreScanner.
> 2. Now that we support creating multiple StoreFileReader instances for a 
> single HFile, we do not need to load the file info and other meta information 
> every time we create a new StoreFileReader instance.
> 3. 'Pread or stream' should be decided at the StoreFileScanner level, not in 
> StoreScanner.
> 4. Make sure that we can return at any point during a scan. Currently, at 
> least in the filterRowKey case, we cannot stop until we reach the next row, no 
> matter how many cells we need to skip...
> 5. We do byte comparisons everywhere we need to know whether there is a row 
> change, a family change, a qualifier change, etc. This is a performance 
> killer.
> Most importantly, the code is far too complicated now and has gotten out of 
> control...
> This should be done before our 3.0.0 release.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
