[
https://issues.apache.org/jira/browse/HBASE-29863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18069273#comment-18069273
]
Hudson commented on HBASE-29863:
--------------------------------
Results for branch branch-3
[build #22 on
builds.a.o|https://ci-hbase.apache.org/job/HBase-Integration-Test/job/branch-3/22/]:
(/) *{color:green}+1 overall{color}*
----
details (if available):
(/) {color:green}+1 client integration test for 3.3.5 {color}
(/) {color:green}+1 client integration test for 3.3.5 with shaded hadoop
client{color}
(/) {color:green}+1 client integration test for 3.3.6 {color}
(/) {color:green}+1 client integration test for 3.3.6 with shaded hadoop
client{color}
(/) {color:green}+1 client integration test for 3.4.0 {color}
(/) {color:green}+1 client integration test for 3.4.0 with shaded hadoop
client{color}
(/) {color:green}+1 client integration test for 3.4.1 {color}
(/) {color:green}+1 client integration test for 3.4.1 with shaded hadoop
client{color}
(/) {color:green}+1 client integration test for 3.4.2 {color}
(/) {color:green}+1 client integration test for 3.4.2 with shaded hadoop
client{color}
(/) {color:green}+1 client integration test for 3.4.3 {color}
(/) {color:green}+1 client integration test for 3.4.3 with shaded hadoop
client{color}
> Add API to KeyValueScanner to retrieve the set of StoreFiles accessed during
> a scan
> -----------------------------------------------------------------------------------
>
> Key: HBASE-29863
> URL: https://issues.apache.org/jira/browse/HBASE-29863
> Project: HBase
> Issue Type: New Feature
> Components: API, regionserver, Scanners
> Reporter: Himanshu Gwalani
> Assignee: Himanshu Gwalani
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0-alpha-1, 2.7.0, 3.0.0-beta-2
>
>
> *Goal:* Introduce a mechanism to track and expose the specific HFiles
> involved in a scan operation.
> *Use-case:* This is essential for client-side validation that the right set
> of files was scanned (when a source of truth is available, for example the
> snapshot data manifest during snapshot-based scans), for debugging
> performance issues, and for analyzing data access patterns.
> *Proposed API:* Add {{Set<Path> getScannerInitializedFiles()}} to the
> {{KeyValueScanner}} interface.
> *Implementation Details*
> * *Capturing list of files when scanner is initialized.*
> ** Leaf Scanners
> *** StoreFileScanner: Returns a singleton set containing the path of the
> associated {{HFile}}.
> *** SnapshotSegmentScanner / CollectionBackedScanner / SegmentScanner:
> Returns empty set.
> ** Composite Scanners
> *** StoreScanner & ReversedStoreScanner: Aggregates files from all active
> {{StoreFileScanners}}
> *** KeyValueHeap & ReversedKeyValueHeap: Aggregates files from its internal
> priority queue of scanners.
> ** Abstract Scanners
> *** NonLazyKeyValueScanner / NonReversedNonLazyKeyValueScanner: Returns
> empty set.
> * *Exposing via RegionScanner & TableSnapshotRecordReader*
> ** RegionScanner: Aggregates files from all underlying StoreScanners
> ** TableSnapshotRecordReader: Proxies the call through
> ClientSideRegionScanner to allow MapReduce jobs to access this for
> snapshot-based scans.
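
The leaf/composite split described above can be sketched as follows. This is a minimal illustration, not the HBase implementation: {{FileTrackingScanner}}, {{SingleFileScanner}}, {{InMemoryScanner}} and {{CompositeScanner}} are hypothetical stand-ins for {{KeyValueScanner}}, {{StoreFileScanner}}, the segment scanners and {{StoreScanner}} / {{KeyValueHeap}}; {{java.nio.file.Path}} is used in place of Hadoop's {{Path}} so the sketch compiles without HBase on the classpath; and the file names are made up. Only the method name {{getScannerInitializedFiles()}} comes from the issue.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class ScannerFilesSketch {

  /** Hypothetical stand-in for KeyValueScanner; only the proposed method is modeled. */
  interface FileTrackingScanner {
    Set<Path> getScannerInitializedFiles();
  }

  /** Leaf scanner over one file (plays the role of StoreFileScanner). */
  static final class SingleFileScanner implements FileTrackingScanner {
    private final Path file;

    SingleFileScanner(Path file) {
      this.file = file;
    }

    @Override
    public Set<Path> getScannerInitializedFiles() {
      // A file-backed leaf reports exactly the one file it reads.
      return Collections.singleton(file);
    }
  }

  /** Leaf scanner over in-memory data (plays the role of SegmentScanner). */
  static final class InMemoryScanner implements FileTrackingScanner {
    @Override
    public Set<Path> getScannerInitializedFiles() {
      // No file backs this scanner, so it reports an empty set.
      return Collections.emptySet();
    }
  }

  /** Composite scanner (plays the role of StoreScanner, KeyValueHeap or RegionScanner). */
  static final class CompositeScanner implements FileTrackingScanner {
    private final List<FileTrackingScanner> children;

    CompositeScanner(List<FileTrackingScanner> children) {
      this.children = new ArrayList<>(children);
    }

    @Override
    public Set<Path> getScannerInitializedFiles() {
      // Union of the children's files; the set removes duplicates.
      Set<Path> files = new LinkedHashSet<>();
      for (FileTrackingScanner child : children) {
        files.addAll(child.getScannerInitializedFiles());
      }
      return Collections.unmodifiableSet(files);
    }
  }

  public static void main(String[] args) {
    // Two "stores", each with file-backed and in-memory scanners.
    FileTrackingScanner storeA = new CompositeScanner(Arrays.asList(
        new SingleFileScanner(Paths.get("cf1", "hfile-a")),
        new InMemoryScanner()));
    FileTrackingScanner storeB = new CompositeScanner(Arrays.asList(
        new SingleFileScanner(Paths.get("cf2", "hfile-b")),
        new SingleFileScanner(Paths.get("cf2", "hfile-c"))));

    // A region-level scanner aggregates across its store scanners.
    FileTrackingScanner region = new CompositeScanner(Arrays.asList(storeA, storeB));
    System.out.println(region.getScannerInitializedFiles());
  }
}
```

A client with a source of truth (for example the file list from a snapshot manifest) could compare that list against the returned set after the scan; how the manifest is read is outside this sketch.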
--
This message was sent by Atlassian Jira
(v8.20.10#820010)