Re: why does the scanner api have only startRow and stopRow and not also a count? was: Improving HBase scanner

TuX RaceR Tue, 04 May 2010 10:18:51 -0700

Thanks a lot Gary: I had missed this one
cheers
TuX


Gary Helmling wrote:

You can always add a PageFilter to your Scan instance to achieve this:
http://hadoop.apache.org/hbase/docs/r0.20.3/api/org/apache/hadoop/hbase/filter/PageFilter.html

Just be aware that you should still count on the client side if you want to
strictly limit to a given size.  Since the filter is applied independently
on each regionserver, the client can still receive back more than the page
size # of items.

--gh
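
The point about counting on the client side can be illustrated with a small, self-contained Java sketch. It does not use the HBase client at all: the nested lists stand in for regions, and `scanRegion`, `scanAllRegions`, and `scanWithClientLimit` are hypothetical names chosen for this illustration. It only assumes, as described above, that each region applies the page limit independently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PageLimitSketch {

    // Hypothetical stand-in for one region's share of a scan: the region
    // applies the page limit on its own and knows nothing about other regions.
    static List<String> scanRegion(List<String> regionRows, int pageSize) {
        int end = Math.min(pageSize, regionRows.size());
        return new ArrayList<>(regionRows.subList(0, end));
    }

    // What the client may receive: up to pageSize rows from EACH region.
    static List<String> scanAllRegions(List<List<String>> regions, int pageSize) {
        List<String> received = new ArrayList<>();
        for (List<String> region : regions) {
            received.addAll(scanRegion(region, pageSize));
        }
        return received;
    }

    // Client-side counting: stop consuming rows once the limit is reached.
    static List<String> scanWithClientLimit(List<List<String>> regions, int pageSize) {
        List<String> page = new ArrayList<>();
        for (String row : scanAllRegions(regions, pageSize)) {
            if (page.size() >= pageSize) {
                break;
            }
            page.add(row);
        }
        return page;
    }

    public static void main(String[] args) {
        List<List<String>> regions = Arrays.asList(
                Arrays.asList("row-a", "row-b", "row-c"),
                Arrays.asList("row-m", "row-n", "row-o"));

        // Two regions, page size 2: the per-region limit alone lets 4 rows through.
        System.out.println(scanAllRegions(regions, 2).size());   // prints 4
        // Counting on the client caps the result at the page size.
        System.out.println(scanWithClientLimit(regions, 2));      // prints [row-a, row-b]
    }
}
```

With the real client, the equivalent would presumably be `scan.setFilter(new PageFilter(pageSize))` combined with a counter in the loop over the `ResultScanner`, breaking out and closing the scanner once the counter reaches the page size.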


On Tue, May 4, 2010 at 12:02 PM, TuX RaceR <[email protected]> wrote:

Hi HBase users,

A question related to the previous one: if we want to limit the amount of
data retrieved by a scanner, can we tell it to stop scanning once a given
number of rows is reached?
If I look at another KV store (Cassandra), the equivalent of the scan API
there uses a KeyRange object, see
http://wiki.apache.org/cassandra/API

Attribute    Type    Default  Required  Description
-----------  ------  -------  --------  ---------------------------------------------------
start_key    string  n/a      N         The first key in the inclusive KeyRange.
end_key      string  n/a      N         The last key in the inclusive KeyRange.
start_token  string  n/a      N         The first token in the exclusive KeyRange.
end_token    string  n/a      N         The last token in the exclusive KeyRange.
count        i32     100      Y         The total number of keys to permit in the KeyRange.


    Would it be useful (performance-wise) to have a 'count' parameter
    too, or would it be useless, being equivalent to ending the scan loop
    on the application side once the desired number of rows is reached?



    Thanks


    TuX
