stoty commented on code in PR #6361: URL: https://github.com/apache/hbase/pull/6361#discussion_r1816016162
########## hbase-client/src/main/java/org/apache/hadoop/hbase/filter/PrefixFilter.java: ########## @@ -69,13 +74,19 @@ public boolean filterRowKey(Cell firstRowCell) { if ((!isReversed() && cmp > 0) || (isReversed() && cmp < 0)) { passedPrefix = true; } - filterRow = (cmp != 0); + filterRow = (!isReversed() && cmp > 0) || (isReversed() && cmp < 0); + provideHint = (!isReversed() && cmp < 0) || (isReversed() && cmp > 0); Review Comment: In the reverse case I think that we should also check if the current cell is bigger than the hinted cell, and if not, then we should not reply with a hint. ########## hbase-server/src/test/java/org/apache/hadoop/hbase/filter/TestFilterList.java: ########## @@ -169,7 +169,7 @@ private void mpOneTest(Filter filterMPONE) throws Exception { assertTrue(filterMPONE.filterRowKey(KeyValueUtil.createFirstOnRow(rowkey))); kv = new KeyValue(rowkey, rowkey, Bytes.toBytes(0), Bytes.toBytes(0)); assertFalse(Filter.ReturnCode.INCLUDE == filterMPONE.filterCell(kv)); - assertFalse(filterMPONE.filterRow()); + assertTrue(filterMPONE.filterRow()); Review Comment: Can you explain this change ? We are calling filterCell for yyy, which is the prefix. Why would that behaviour change ? ########## hbase-client/src/main/java/org/apache/hadoop/hbase/filter/PrefixFilter.java: ########## @@ -50,10 +53,12 @@ public byte[] getPrefix() { @Override public boolean filterRowKey(Cell firstRowCell) { - if (firstRowCell == null || this.prefix == null) return true; - if (filterAllRemaining()) return true; - int length = firstRowCell.getRowLength(); - if (length < prefix.length) return true; + if (firstRowCell == null || this.prefix == null) { + return true; + } + if (filterAllRemaining()) { + return true; + } Review Comment: While logically equvalent, they check for slightly different things, so I think it's better to keep them separate. ########## hbase-client/src/main/java/org/apache/hadoop/hbase/filter/PrefixFilter.java: ########## @@ -142,6 +155,31 @@ boolean areSerializedFieldsEqual(Filter o) { return Bytes.equals(this.getPrefix(), other.getPrefix()); } + @Override + public Cell getNextCellHint(Cell cell) { + byte[] hintBytes; + if (reversed) { + // On reversed scan hint should be the prefix with last byte incremented + hintBytes = increaseLastNonMaxByte(this.prefix); + } else { + // On forward scan hint should be the prefix + hintBytes = prefix; + } + return PrivateCellUtil.createFirstOnRow(hintBytes, 0, (short) hintBytes.length); + } + + private byte[] increaseLastNonMaxByte(byte[] bytes) { + byte[] result = Arrays.copyOf(bytes, bytes.length); + for (int i = bytes.length - 1; i >= 0; i--) { + byte b = bytes[i]; + if (b < Byte.MAX_VALUE) { + result[i] = (byte) (b + 1); + break; + } + } + return result; + } Review Comment: That's OK. We can keep this here, or move it to PrivateCellUtil before commit. ########## hbase-client/src/main/java/org/apache/hadoop/hbase/filter/PrefixFilter.java: ########## @@ -142,6 +155,31 @@ boolean areSerializedFieldsEqual(Filter o) { return Bytes.equals(this.getPrefix(), other.getPrefix()); } + @Override + public Cell getNextCellHint(Cell cell) { + byte[] hintBytes; + if (reversed) { + // On reversed scan hint should be the prefix with last byte incremented + hintBytes = increaseLastNonMaxByte(this.prefix); Review Comment: We should pre-compute this at creation, to avoid re-computing this several times in the corner case where there are a lot of cells between the hint and the first real match, ########## hbase-client/src/main/java/org/apache/hadoop/hbase/filter/PrefixFilter.java: ########## @@ -69,13 +74,19 @@ public boolean filterRowKey(Cell firstRowCell) { if ((!isReversed() && cmp > 0) || (isReversed() && cmp < 0)) { passedPrefix = true; } - filterRow = (cmp != 0); + filterRow = (!isReversed() && cmp > 0) || (isReversed() && cmp < 0); + provideHint = (!isReversed() && cmp < 0) || (isReversed() && cmp > 0); Review Comment: Or rather just add a flag to make sure we only send a hint once, that would be faster. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org