gsmiller commented on code in PR #13559: URL: https://github.com/apache/lucene/pull/13559#discussion_r1697210371
########## lucene/core/src/java/org/apache/lucene/util/BitSet.java: ########## @@ -92,6 +92,12 @@ public void clear() { */ public abstract int nextSetBit(int index); + /** + * Returns the index of the first set bit from start (inclusive) until end (inclusive). {@link Review Comment: I would tend to expect `end` to be exclusive in an API like this. (e.g., think of something like `String#substring(from, to)` in Java). WDYT about changing this? ########## lucene/core/src/java/org/apache/lucene/util/FixedBitSet.java: ########## @@ -291,6 +291,32 @@ public int nextSetBit(int index) { return DocIdSetIterator.NO_MORE_DOCS; } + @Override + public int firstSetBitInRange(int start, int upperBound) { + // Depends on the ghost bits being clear! + assert start >= 0 && start < numBits : "index=" + start + ", numBits=" + numBits; + assert start <= upperBound : "index=" + start + ", upperBound=" + upperBound; + int i = start >> 6; + long word = bits[i] >> start; // skip all the bits to the right of index + + if (word != 0) { + int res = start + Long.numberOfTrailingZeros(word); + return res > upperBound ? DocIdSetIterator.NO_MORE_DOCS : res; + } + + int maxWord = Math.min((upperBound >> 6) + 1, numWords); Review Comment: nit: `maxWord` is slightly confusing naming IMO. I would have expected it to be an _inclusive_ max index but it's _exclusive_. The logic all makes sense to me, but maybe rename to something like `limit` (or change it to inclusive and keep the name)? ########## lucene/core/src/java/org/apache/lucene/util/SparseFixedBitSet.java: ########## @@ -353,14 +355,47 @@ public int nextSetBit(int i) { final long indexBits = index >>> i64 >>> 1; if (indexBits == 0) { // no more bits are set in the current block of 4096 bits, go to the next one - return firstDoc(i4096 + 1); + return firstDoc(i4096 + 1, indices.length); } // there are still set bits i64 += 1 + Long.numberOfTrailingZeros(indexBits); final long bits = bitArray[o]; return (i64 << 6) | Long.numberOfTrailingZeros(bits); } + @Override + public int firstSetBitInRange(int start, int upperBound) { + assert start < length; + assert upperBound >= start; + final int i4096 = start >>> 12; + final long index = indices[i4096]; + final long[] bitArray = this.bits[i4096]; + int i64 = start >>> 6; + int o = Long.bitCount(index & ((1L << i64) - 1)); Review Comment: nit: since you repeat `1L << i64` a couple times here, maybe introduce `final long i64bit = 1L << i64;` (like we do in `#get`) ########## lucene/join/src/java/org/apache/lucene/search/join/BlockJoinSelector.java: ########## @@ -64,14 +64,19 @@ public boolean get(int docID) { return false; } - final int firstChild = parents.prevSetBit(docID - 1) + 1; - for (int child = children.nextSetBit(firstChild); - child < docID; - child = children.nextSetBit(child + 1)) { - if (docsWithValue.get(child)) { - return true; + final int lastPotentialChild = docID - 1; + + final int firstPotentialChild = parents.prevSetBit(lastPotentialChild) + 1; + if (firstPotentialChild < docID) { + for (int child = children.firstSetBitInRange(firstPotentialChild, lastPotentialChild); + child != DocIdSetIterator.NO_MORE_DOCS; + child = children.firstSetBitInRange(child + 1, lastPotentialChild)) { + if (docsWithValue.get(child)) { + return true; + } Review Comment: nit: I would find this just slightly more readable, but I don't have a strong preference. Just a small suggestion. ```suggestion assert firstPotentialChild <= docID; if (firstPotentialChild == docID) { // no children return false; } for (int child = children.firstSetBitInRange(firstPotentialChild, lastPotentialChild); child != DocIdSetIterator.NO_MORE_DOCS; child = children.firstSetBitInRange(child + 1, lastPotentialChild)) { if (docsWithValue.get(child)) { return true; } ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org