Re: [PR] Use Vector API to decode BKD docIds [lucene]

2025-03-14 Thread via GitHub


jpountz commented on PR #14203:
URL: https://github.com/apache/lucene/pull/14203#issuecomment-2724038977

   I have some small concerns:
   - The 512 step is tied to the number of points per leaf. This isn't a big deal at all (postings are similar: their encoding logic is specialized for blocks of 128), but I'd just rather err on a smaller block size than 512, which feels largish.
   - Complexity: the encoding has 3 different sub-encodings: 512, 128 and remainder. Could we have only two?
   
   But my main concern is that I would like to better understand why 512 performs so much better. There must be something that happens with this 512 step that doesn't happen otherwise, such as using different instructions, loop unrolling, better CPU pipelining, or something else. I have some discomfort merging something that is faster without having at least an intuition of why it's faster, so that I can also understand which JVMs and CPUs would enable this speedup. Could pipelining be the reason, since 24 (bits per value) * 32 (step) < 2 * 512 (bit width of SIMD instructions)? But then something like 128 should also perform well, while your benchmark suggests it's still much worse than 512?
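   
   To make that register arithmetic concrete, a throwaway sanity check (the class name and framing here are purely illustrative; the numbers come from the question above):
   
   ```java
   public class StepWidthCheck {
     public static void main(String[] args) {
       int bitsPerValue = 24; // bpv24 encoding
       int step = 32;         // values decoded per inner-loop iteration
       int simdWidth = 512;   // bit width of an AVX-512 register
       int stepBits = bitsPerValue * step; // 768 bits of input per step
       // 768 < 1024: one step's input fits in two 512-bit registers, which is
       // what would make the pipelining hypothesis plausible.
       System.out.println(stepBits + " < " + (2 * simdWidth) + " : " + (stepBits < 2 * simdWidth));
     }
   }
   ```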
   
   




Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-14 Thread via GitHub


jpountz commented on PR #14333:
URL: https://github.com/apache/lucene/pull/14333#issuecomment-2724046501

   I started looking at the code but you would know better: does this new 
encoding make it easier to know the length of leaf blocks while traversing the 
terms index so that we could prefetch the right byte range when doing terms 
dictionary lookups? 
https://github.com/apache/lucene/blob/661dcae3c25cc548a6df251b79b7bfac81c2dba8/lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnumFrame.java#L147-L148
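   
   For context, this is roughly the shape of prefetch that would enable (a sketch, not the PR's code: `IndexInput#prefetch` exists in recent Lucene, but the block file pointer and length here are hypothetical parameters the new encoding would have to expose):
   
   ```java
   import java.io.IOException;
   import org.apache.lucene.store.IndexInput;
   
   // Illustrative only: if the terms index recorded each leaf block's length,
   // the reader could hint the byte range before decoding the block.
   final class LeafBlockPrefetch {
     static void prefetchLeafBlock(IndexInput termsIn, long blockFP, long blockLength)
         throws IOException {
       termsIn.prefetch(blockFP, blockLength);
     }
   }
   ```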





Re: [PR] [DRAFT] Case-insensitive matching over union of strings [lucene]

2025-03-14 Thread via GitHub


dweiss commented on PR #14350:
URL: https://github.com/apache/lucene/pull/14350#issuecomment-2724314718

   Or we can just embrace the fact that it can be a non-minimal NFA and just let it run like that (with NFARunAutomaton).





Re: [PR] Optimize ConcurrentMergeScheduler for Multi-Tenant Indexing [lucene]

2025-03-14 Thread via GitHub


DivyanshIITB commented on PR #14335:
URL: https://github.com/apache/lucene/pull/14335#issuecomment-2724394013

   Just a gentle reminder
   





Re: [PR] [DRAFT] Case-insensitive matching over union of strings [lucene]

2025-03-14 Thread via GitHub


dweiss commented on PR #14350:
URL: https://github.com/apache/lucene/pull/14350#issuecomment-2724062564

   I don't know Unicode as well as Rob, so I can't say what these alternate case folding equivalence classes are... but they definitely don't have a "canonical" representation with regard to Character.toLowerCase. Consider the killer Turkish dotless i, for example:
   
   ```
   public void testCornerCase() throws Exception {
     List<BytesRef> terms = Stream.of("aIb", "aıc")
         .map(s -> {
           int[] lowercased = s.codePoints().map(Character::toLowerCase).toArray();
           return new String(lowercased, 0, lowercased.length);
         })
         .map(LuceneTestCase::newBytesRef)
         .sorted()
         .collect(Collectors.toCollection(ArrayList::new));
     Automaton a = build(terms, false, true);
     System.out.println(a.toDot());
     assertTrue(a.isDeterministic());
   }
   ```
   which yields:
   
![image](https://github.com/user-attachments/assets/917d0c77-4ffc-4a9a-8565-8fddaff583af)
   
   It would take some kind of character normalization filter on both the index 
and automaton building/expansion side for this to work.





Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-14 Thread via GitHub


gf2121 commented on code in PR #14333:
URL: https://github.com/apache/lucene/pull/14333#discussion_r1994867386


##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/Trie.java:
##
@@ -0,0 +1,486 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.codecs.lucene90.blocktree;
+
+import java.io.IOException;
+import java.util.ArrayDeque;
+import java.util.Deque;
+import java.util.LinkedList;
+import java.util.List;
+import java.util.ListIterator;
+import java.util.function.BiConsumer;
+import org.apache.lucene.store.DataOutput;
+import org.apache.lucene.store.IndexOutput;
+import org.apache.lucene.store.RandomAccessInput;
+import org.apache.lucene.util.BytesRef;
+import org.apache.lucene.util.BytesRefBuilder;
+
+/** TODO: make this a more memory-efficient structure */
+class Trie {
+
+  static final int SIGN_NO_CHILDREN = 0x00;
+  static final int SIGN_SINGLE_CHILDREN_WITH_OUTPUT = 0x01;
+  static final int SIGN_SINGLE_CHILDREN_WITHOUT_OUTPUT = 0x02;
+  static final int SIGN_MULTI_CHILDREN = 0x03;
+
+  record Output(long fp, boolean hasTerms, BytesRef floorData) {}
+
+  private enum Status {
+    UNSAVED,
+    SAVED,
+    DESTROYED
+  }
+
+  private static class Node {
+    private final int label;
+    private final LinkedList<Node> children;
+    private Output output;
+    private long fp = -1;
+
+    Node(int label, Output output, LinkedList<Node> children) {
+      this.label = label;
+      this.output = output;
+      this.children = children;
+    }
+  }
+
+  private Status status = Status.UNSAVED;
+  final Node root = new Node(0, null, new LinkedList<>());
+
+  Trie(BytesRef k, Output v) {
+    if (k.length == 0) {
+      root.output = v;
+      return;
+    }
+    Node parent = root;
+    for (int i = 0; i < k.length; i++) {
+      int b = k.bytes[i + k.offset] & 0xFF;
+      Output output = i == k.length - 1 ? v : null;
+      Node node = new Node(b, output, new LinkedList<>());
+      parent.children.add(node);
+      parent = node;
+    }
+  }
+
+  void putAll(Trie trie) {
+    if (status != Status.UNSAVED || trie.status != Status.UNSAVED) {
+      throw new IllegalStateException("tries should be unsaved");
+    }
+    trie.status = Status.DESTROYED;
+    putAll(this.root, trie.root);
+  }
+
+  private static void putAll(Node n, Node add) {
+    assert n.label == add.label;
+    if (add.output != null) {
+      n.output = add.output;
+    }
+    ListIterator<Node> iter = n.children.listIterator();
+    // TODO: we can be more efficient if there is no intersection; block tree always does that
+    outer:
+    for (Node addChild : add.children) {
+      while (iter.hasNext()) {
+        Node nChild = iter.next();
+        if (nChild.label == addChild.label) {
+          putAll(nChild, addChild);
+          continue outer;
+        }
+        if (nChild.label > addChild.label) {
+          iter.previous(); // move back
+          iter.add(addChild);
+          continue outer;
+        }
+      }
+      iter.add(addChild);
+    }
+  }
+
+  Output getEmptyOutput() {
+    return root.output;
+  }
+
+  void forEach(BiConsumer<BytesRef, Output> consumer) {
+    if (root.output != null) {
+      consumer.accept(new BytesRef(), root.output);
+    }
+    intersect(root.children, new BytesRefBuilder(), consumer);
+  }
+
+  private void intersect(
+      List<Node> nodes, BytesRefBuilder key, BiConsumer<BytesRef, Output> consumer) {
+    for (Node node : nodes) {
+      key.append((byte) node.label);
+      if (node.output != null) consumer.accept(key.toBytesRef(), node.output);
+      intersect(node.children, key, consumer);
+      key.setLength(key.length() - 1);
+    }
+  }
+
+  void save(DataOutput meta, IndexOutput index) throws IOException {
+    if (status != Status.UNSAVED) {
+      throw new IllegalStateException("only unsaved trie can be saved");
+    }
+    status = Status.SAVED;
+    meta.writeVLong(index.getFilePointer());
+    saveNodes(index);
+    meta.writeVLong(root.fp);
+    index.writeLong(0L); // additional 8 bytes for over-reading
+    meta.writeVLong(index.getFilePointer());
+  }
+
+  void saveNodes(IndexOutput index) throws IOException {
+    final long startFP = index.getFilePointer();
+    Deque<Node> stack = new ArrayDeque<>();
+    stack.p

Re: [PR] Create vectorized versions of ScalarQuantizer.quantize and recalculateCorrectiveOffset [lucene]

2025-03-14 Thread via GitHub


thecoop commented on code in PR #14304:
URL: https://github.com/apache/lucene/pull/14304#discussion_r1987194449


##
lucene/core/src/java21/org/apache/lucene/internal/vectorization/PanamaVectorUtilSupport.java:
##
@@ -907,4 +907,87 @@ public static long int4BitDotProduct128(byte[] q, byte[] d) {
 }
 return subRet0 + (subRet1 << 1) + (subRet2 << 2) + (subRet3 << 3);
   }
+
+  @Override
+  public float quantize(
+      float[] vector, byte[] dest, float scale, float alpha, float minQuantile, float maxQuantile) {
+    float correction = 0;
+    int i = 0;
+    // only vectorize if we have a viable BYTE_SPECIES we can use for output
+    if (VECTOR_BITSIZE >= 256) {
+      for (; i < FLOAT_SPECIES.loopBound(vector.length); i += FLOAT_SPECIES.length()) {
+        FloatVector v = FloatVector.fromArray(FLOAT_SPECIES, vector, i);
+
+        // Make sure the value is within the quantile range, cutting off the tails
+        // see first parenthesis in equation: byte = (float - minQuantile) * 127/(maxQuantile - minQuantile)
+        FloatVector dxc = v.min(maxQuantile).max(minQuantile).sub(minQuantile);
+        // Scale the value to the range [0, 127], this is our quantized value
+        // scale = 127/(maxQuantile - minQuantile)
+        // Math.round rounds to positive infinity, so do the same by +0.5 then truncating to int
+        Vector<Integer> roundedDxs = dxc.mul(scale).add(0.5f).convert(VectorOperators.F2I, 0);
+        // output this to the array
+        ((ByteVector) roundedDxs.castShape(BYTE_SPECIES, 0)).intoArray(dest, i);
+        // We multiply by `alpha` here to get the quantized value back into the original range
+        // to aid in calculating the corrective offset
+        Vector<Float> dxq = ((FloatVector) roundedDxs.castShape(FLOAT_SPECIES, 0)).mul(alpha);
+        // Calculate the corrective offset that needs to be applied to the score
+        // in addition to the `byte * minQuantile * alpha` term in the equation
+        // we add the `(dx - dxq) * dxq` term to account for the fact that the quantized value
+        // will be rounded to the nearest whole number and lose some accuracy
+        // Additionally, we account for the global correction of `minQuantile^2` in the equation
+        correction +=
+            v.sub(minQuantile / 2f)
+                .mul(minQuantile)
+                .add(v.sub(minQuantile).sub(dxq).mul(dxq))
+                .reduceLanes(VectorOperators.ADD);

Review Comment:
   And even more with FMA operations
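   
   One way that might look, as a sketch (assumed context: the locals `v`, `dxq` and `minQuantile` from the loop above; `FloatVector#fma(b, c)` computes `this * b + c` lanewise with a single rounding):
   
   ```java
   import jdk.incubator.vector.FloatVector;
   import jdk.incubator.vector.VectorOperators;
   
   // Illustrative only, not the PR's code: fuse the correction terms, computing
   // (dx - dxq) * dxq + (v - minQuantile/2) * minQuantile per lane, then summing.
   final class FmaSketch {
     static float correctionTerm(FloatVector v, FloatVector dxq, float minQuantile) {
       FloatVector dx = v.sub(minQuantile);
       return dx.sub(dxq)
           .fma(dxq, v.sub(minQuantile / 2f).mul(minQuantile))
           .reduceLanes(VectorOperators.ADD);
     }
   }
   ```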






Re: [PR] Speed up scoring conjunctions a bit. [lucene]

2025-03-14 Thread via GitHub


jpountz commented on PR #14345:
URL: https://github.com/apache/lucene/pull/14345#issuecomment-2724895262

   Nightly benchmarks confirmed the speedup: 
https://benchmarks.mikemccandless.com/FilteredAndHighHigh.html. I'll push an 
annotation.





Re: [PR] Optimize ConcurrentMergeScheduler for Multi-Tenant Indexing [lucene]

2025-03-14 Thread via GitHub


jpountz commented on PR #14335:
URL: https://github.com/apache/lucene/pull/14335#issuecomment-2724923272

   Apologies, I had missed your reply.
   
   > should this be a shared global pool across all IndexWriters, or should each writer have its own pool?
   
   It should be shared: we don't want the total number of threads to scale with the number of index writers. The reasoning behind the numProcessors/2 number is that merging generally should not be more expensive than indexing, so by reserving only half the CPU capacity for merging, it should still be possible to max out the hardware while indexing, while keeping the peak number of threads running merges under numProcessors/2 (see the sketch below).
   
   > If it's a shared pool, how should we handle cases where a few writers are highly active while others are idle? Should we allow active writers to take more resources dynamically, or keep a strict fixed allocation?
   
   Idle writers would naturally submit fewer tasks than highly active writers. IMO the fixed allocation is key here.
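   
   A minimal sketch of that shape, under stated assumptions (the class and method names are illustrative, not Lucene API; a real scheduler would also need per-writer fairness and backpressure):
   
   ```java
   import java.util.concurrent.ExecutorService;
   import java.util.concurrent.Executors;
   
   // Illustrative only: one fixed pool, sized to half the cores, shared by all
   // IndexWriters so the total merge-thread count does not scale with writers.
   public final class SharedMergePool {
     private static final int MAX_MERGE_THREADS =
         Math.max(1, Runtime.getRuntime().availableProcessors() / 2);
   
     private static final ExecutorService POOL =
         Executors.newFixedThreadPool(MAX_MERGE_THREADS);
   
     private SharedMergePool() {}
   
     public static ExecutorService get() {
       return POOL;
     }
   }
   ```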





Re: [PR] Improve DenseConjunctionBulkScorer's sparse fallback. [lucene]

2025-03-14 Thread via GitHub


jpountz merged PR #14354:
URL: https://github.com/apache/lucene/pull/14354





Re: [PR] [DRAFT] Case-insensitive matching over union of strings [lucene]

2025-03-14 Thread via GitHub


dweiss commented on PR #14350:
URL: https://github.com/apache/lucene/pull/14350#issuecomment-2725496380

   Ok, fair enough.





[PR] removing constructor with deprecated attribute 'onlyLongestMatch [lucene]

2025-03-14 Thread via GitHub


renatoh opened a new pull request, #14356:
URL: https://github.com/apache/lucene/pull/14356

   ### Description
   
   
   





Re: [PR] [DRAFT] Case-insensitive matching over union of strings [lucene]

2025-03-14 Thread via GitHub


rmuir commented on PR #14350:
URL: https://github.com/apache/lucene/pull/14350#issuecomment-2724585337

   > Or we can just embrace the fact that it can be a non-minimal NFA and just let it run like that (with NFARunAutomaton).
   
   I don't think this is currently a good option either: users won't just do that. They will determinize, minimize, and tableize, and then be confused when things are slow or use too much memory.





Re: [PR] [DRAFT] Case-insensitive matching over union of strings [lucene]

2025-03-14 Thread via GitHub


rmuir commented on PR #14350:
URL: https://github.com/apache/lucene/pull/14350#issuecomment-2725736846

   It isn't a good idea. If the user wants to "erase case differences" then they should apply `foldcase(ch)`; that's what case-folding means. That CaseFolding class does everything except that. Again, it's why I recommend not messing with it for now and starting simpler.
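   
   For illustration only, a rough approximation of what a fold might look like (an assumption, not rmuir's code: real simple case folding is defined by Unicode's CaseFolding.txt, and this upper-then-lower round trip is a common shortcut that is not exact for every code point):
   
   ```java
   // Hypothetical sketch: map both cases of a character to one representative,
   // e.g. both 'I' (U+0049) and 'ı' (U+0131) end up as 'i'. Illustrative only.
   static int foldCase(int codePoint) {
     return Character.toLowerCase(Character.toUpperCase(codePoint));
   }
   ```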





Re: [PR] [DRAFT] Case-insensitive matching over union of strings [lucene]

2025-03-14 Thread via GitHub


rmuir commented on PR #14350:
URL: https://github.com/apache/lucene/pull/14350#issuecomment-2724580292

   This is why I recommended not using the Unicode function and starting simple. Then you have a potential way to get it working efficiently.
   
   





Re: [PR] [DRAFT] Case-insensitive matching over union of strings [lucene]

2025-03-14 Thread via GitHub


msfroh commented on PR #14350:
URL: https://github.com/apache/lucene/pull/14350#issuecomment-2725709282

   This is kind of what I had in mind:
   
   ```java
   private static int canonicalize(int codePoint) {
     int[] alternatives = CaseFolding.lookupAlternates(codePoint);
     if (alternatives != null) {
       for (int cp : alternatives) {
         codePoint = Math.min(codePoint, cp);
       }
     } else {
       int altCase = Character.isLowerCase(codePoint)
           ? Character.toUpperCase(codePoint)
           : Character.toLowerCase(codePoint);
       codePoint = Math.min(codePoint, altCase);
     }
     return codePoint;
   }
   
   public void testCornerCase() throws Exception {
     List<BytesRef> terms = Stream.of("aIb", "aıc")
         .map(s -> {
           int[] lowercased = s.codePoints().map(TestStringsToAutomaton::canonicalize).toArray();
           return new String(lowercased, 0, lowercased.length);
         })
         .map(LuceneTestCase::newBytesRef)
         .sorted()
         .collect(Collectors.toCollection(ArrayList::new));
     Automaton a = build(terms, false, true);
     System.out.println(a.toDot());
     assertTrue(a.isDeterministic());
   }
   ```
   
   That produces this automaton, which is minimal and deterministic:
   
![automaton](https://github.com/user-attachments/assets/ad4d6779-dafa-470d-aaf2-09292ab179b6)
   
   I don't know if that `canonicalize` method is a good idea, though.





Re: [PR] PointInSetQuery use reverse collection to improve performance [lucene]

2025-03-14 Thread via GitHub


hanbj commented on PR #14352:
URL: https://github.com/apache/lucene/pull/14352#issuecomment-2724230306

   Thank you for providing ideas. In scenarios with multiple dimensions, the internal nodes of the BKD tree can only be sorted by one dimension, and different internal nodes may be split on different dimensions, so this is indeed difficult to implement.





Re: [PR] Use Vector API to decode BKD docIds [lucene]

2025-03-14 Thread via GitHub


jpountz commented on PR #14203:
URL: https://github.com/apache/lucene/pull/14203#issuecomment-2726015514

   Thanks for running benchmarks. So it looks like the JVM doesn't think these shorter loops (with step 128) are worth unrolling? This makes me wonder how something like the following performs on your AVX-512 CPU. I think you had something similar in one of your previous iterations. On my machine it's on par with the current version.
   
   ```java
   private void readInts24(IndexInput in, int count, int[] docIDs) throws IOException {
     if (count == BKDConfig.DEFAULT_MAX_POINTS_IN_LEAF_NODE) {
       // Same format, but enabling the JVM to specialize the decoding logic for the default number
       // of points per node proved to help on benchmarks
       doReadInts24(in, 512, docIDs);
     } else {
       doReadInts24(in, count, docIDs);
     }
   }
   
   private void doReadInts24(IndexInput in, int count, int[] docIDs) throws IOException {
     // Read the first (count - count % 4) values
     int quarter = count >> 2;
     int numBytes = quarter * 3;
     in.readInts(scratch, 0, numBytes);
     for (int i = 0; i < numBytes; ++i) {
       docIDs[i] = scratch[i] >>> 8;
       scratch[i] &= 0xFF;
     }
     for (int i = 0; i < quarter; ++i) {
       docIDs[numBytes + i] =
           scratch[i] | (scratch[quarter + i] << 8) | (scratch[2 * quarter + i] << 16);
     }
     // Now read the remaining 0, 1, 2 or 3 values
     for (int i = quarter << 2; i < count; ++i) {
       docIDs[i] = (in.readShort() & 0xFFFF) | (in.readByte() & 0xFF) << 16;
     }
   }
   ```





Re: [PR] Use Vector API to decode BKD docIds [lucene]

2025-03-14 Thread via GitHub


gf2121 commented on PR #14203:
URL: https://github.com/apache/lucene/pull/14203#issuecomment-2725390772

   > There must be something that happens with this 512 step that doesn't happen otherwise such as using different instructions, loop unrolling, better CPU pipelining or something else.
   
   Thanks for pointing this out. I studied the asm profile again and I can see that at least loop unrolling differs. According to the asm printed by JMH, for bpv24 decoding:
   
   * The Vector API version unrolls the shift loop 8x (advancing 0x40 per iteration) and the remainder loop 4x (advancing 0x20 per iteration).
   * The InnerLoop 512-step version unrolls the shift loop 4x (advancing 0x20 per iteration) and the remainder loop 2x (advancing 0x10 per iteration).
   * The InnerLoop 128-step version gets no unrolling for either the shift loop or the remainder loop (both advancing 0x8 per iteration).
   
   This matches the JMH results: Vector API > InnerLoop step-512 > InnerLoop step-128.
   
   Things might differ in luceneutil, where we found InnerLoop step-512 faster than the Vector API. I confirmed the luceneutil result for step-512 (baseline) vs step-128 (candidate):
   
   ```
   Task                  QPS baseline  StdDev   QPS my_modified_version  StdDev   Pct diff               p-value
   FilteredIntNRQ        80.02         (4.0%)   71.31                    (3.0%)   -10.9% ( -17% -   -4%) 0.000
   IntNRQ                80.94         (2.5%)   72.60                    (3.6%)   -10.3% ( -16% -   -4%) 0.000
   CountFilteredIntNRQ   42.93         (2.9%)   40.22                    (2.3%)    -6.3% ( -11% -   -1%) 0.001
   IntSet                93.36         (2.1%)   93.85                    (0.7%)     0.5% (  -2% -    3%) 0.633
   ```
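   
   For anyone wanting to reproduce the unrolling observations, a minimal JMH harness along these lines could work (a sketch under assumptions: the names, sizes and the simplified shift-loop body are illustrative, not the PR's benchmark; run with `-prof perfasm` to inspect the generated loops):
   
   ```java
   import java.util.concurrent.TimeUnit;
   import org.openjdk.jmh.annotations.*;
   
   @BenchmarkMode(Mode.Throughput)
   @OutputTimeUnit(TimeUnit.MICROSECONDS)
   @State(Scope.Benchmark)
   public class StepDecodeBench {
     private final int[] scratch = new int[512];
     private final int[] docIDs = new int[512];
   
     @Setup
     public void setup() {
       for (int i = 0; i < scratch.length; i++) {
         scratch[i] = i * 31;
       }
     }
   
     @Benchmark
     public int[] step512() {
       decodeRange(0, 512); // one long trip count: a shape the JIT is happy to unroll
       return docIDs;
     }
   
     @Benchmark
     public int[] step128() {
       for (int off = 0; off < 512; off += 128) {
         decodeRange(off, off + 128); // shorter trip counts, per the asm notes above
       }
       return docIDs;
     }
   
     private void decodeRange(int from, int to) {
       for (int i = from; i < to; ++i) {
         docIDs[i] = scratch[i] >>> 8; // simplified stand-in for the bpv24 shift loop
       }
     }
   }
   ```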
   
   
   





Re: [PR] [DRAFT] Case-insensitive matching over union of strings [lucene]

2025-03-14 Thread via GitHub


msfroh commented on PR #14350:
URL: https://github.com/apache/lucene/pull/14350#issuecomment-2726097192

   Hmm... I'm thinking of just requiring that the input is lowercase (per `Character.toLowerCase(c)`), then checking for collisions on uppercase versions when adding transitions, and throwing an exception when one occurs (since the result won't be a DFA).
   
   Unfortunately, that would mess with Turkish: if someone searches for sınıf (class) and sinirli (nervous), then without locale info we'd get two transitions out of the "s" state that collide on I.
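   
   A sketch of that collision check (a hypothetical helper, not PR code; without locale info `Character.toUpperCase` maps both 'i' and 'ı' to 'I', which is exactly the Turkish collision above):
   
   ```java
   // Illustrative only: reject a pair of distinct lowercase code points whose
   // uppercase forms collide, since matching both case-insensitively would
   // require two transitions on the same label and break determinism.
   static void checkNoCaseCollision(int cp1, int cp2) {
     if (cp1 != cp2 && Character.toUpperCase(cp1) == Character.toUpperCase(cp2)) {
       throw new IllegalArgumentException(String.format(
           "case-insensitive collision between U+%04X and U+%04X", cp1, cp2));
     }
   }
   ```
   
   For example, `checkNoCaseCollision('i', 'ı')` would throw, since both code points uppercase to 'I'.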

