Re: [PR] Remove patching for doc blocks. [lucene]

2023-11-01 Thread via GitHub


gf2121 commented on code in PR #12741:
URL: https://github.com/apache/lucene/pull/12741#discussion_r1378509674


##
lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99PostingsFormat.java:
##
@@ -0,0 +1,518 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.codecs.lucene99;
+
+import java.io.IOException;
+import org.apache.lucene.codecs.BlockTermState;
+import org.apache.lucene.codecs.CodecUtil;
+import org.apache.lucene.codecs.FieldsConsumer;
+import org.apache.lucene.codecs.FieldsProducer;
+import org.apache.lucene.codecs.MultiLevelSkipListWriter;
+import org.apache.lucene.codecs.PostingsFormat;
+import org.apache.lucene.codecs.PostingsReaderBase;
+import org.apache.lucene.codecs.PostingsWriterBase;
+import org.apache.lucene.codecs.lucene90.blocktree.Lucene90BlockTreeTermsReader;
+import org.apache.lucene.codecs.lucene90.blocktree.Lucene90BlockTreeTermsWriter;
+import org.apache.lucene.index.IndexOptions;
+import org.apache.lucene.index.SegmentReadState;
+import org.apache.lucene.index.SegmentWriteState;
+import org.apache.lucene.index.TermState;
+import org.apache.lucene.store.DataOutput;
+import org.apache.lucene.util.IOUtils;
+import org.apache.lucene.util.packed.PackedInts;
+
+/**
+ * Lucene 5.0 postings format, which encodes postings in packed integer blocks for fast decode.

Review Comment:
   ```suggestion
 * Lucene 9.9 postings format, which encodes postings in packed integer blocks for fast decode.
   ```



##
lucene/backward-codecs/src/java/org/apache/lucene/backward_codecs/lucene90/Lucene90PostingsFormat.java:
##
@@ -38,6 +37,8 @@
 /**
 * Lucene 5.0 postings format, which encodes postings in packed integer blocks for fast decode.

Review Comment:
   ```suggestion
 * Lucene 9.0 postings format, which encodes postings in packed integer blocks for fast decode.
   ```



##
lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99SkipReader.java:
##
@@ -0,0 +1,206 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.codecs.lucene99;
+
+import java.io.IOException;
+import java.util.Arrays;
+import org.apache.lucene.codecs.MultiLevelSkipListReader;
+import org.apache.lucene.store.IndexInput;
+
+/**
+ * Implements the skip list reader for the block postings format that stores positions and
+ * payloads.
+ *
+ * Although this skipper uses MultiLevelSkipListReader as an interface, its definition of skip
+ * position will be a little different.
+ *
+ * For example, when skipInterval = blockSize = 3, df = 2*skipInterval = 6,
+ *
+ * 0 1 2 3 4 5
+ * d d d d d d    (posting list)
+ *     ^     ^    (skip point in MultiLevelSkipListWriter)
+ *           ^    (skip point in Lucene90SkipWriter)
+ *
+ * In this case, MultiLevelSkipListReader will use the last document as a skip point, while
+ * Lucene90SkipReader should assume no skip point will come.
+ *
+ * If we use the interface directly in Lucene90SkipReader, it may naively try to read more skip
+ * data after the only skip point is loaded.
+ *
+ * To illustrate this, we can call skipTo(d[5]): since skip point d[3] has a smaller docId, and
+ * numSkipped + blockSize == df, the MultiLevelSkipListReader will assume the skip list isn't
+ * exhausted yet, and try to load a non-existent skip point.
+ *
+ * Therefore, we trim df before passing it to the interface; see trim(int).
+ */
+public class Lucene99SkipReader extends MultiLevelSkipListReader {
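
   The trim described in that javadoc can be sketched as follows — a minimal, self-contained illustration under an assumed block size of 3 (matching the example above), not the class's actual code:
   ```java
   public class TrimSketch {
     static final int BLOCK_SIZE = 3; // assumed block size from the javadoc example

     // When df is a multiple of the block size, the final block ends exactly at a
     // block boundary and yields no usable skip point, so shrink df by one before
     // MultiLevelSkipListReader sees it.
     static int trim(int df) {
       return df % BLOCK_SIZE == 0 ? df - 1 : df;
     }

     public static void main(String[] args) {
       System.out.println(trim(6)); // 5: don't expect a skip point at d[5]
       System.out.println(trim(5)); // 5: unchanged
     }
   }
   ```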

Re: [PR] Use value-based LRU cache in NodeHash [lucene]

2023-11-01 Thread via GitHub


gf2121 commented on code in PR #12738:
URL: https://github.com/apache/lucene/pull/12738#discussion_r1378571480


##
lucene/core/src/java/org/apache/lucene/util/fst/NodeHash.java:
##
@@ -110,25 +110,34 @@ public long add(FSTCompiler.UnCompiledNode nodeIn) throws IOException {
 node = getFallback(nodeIn, hash);
 if (node != 0) {
   // it was already in fallback -- promote to primary
-  primaryTable.set(pos, node);
+  primaryTable.set(pos, node, fallbackTable.getBytes(node));
 } else {
   // not in fallback either -- freeze & add the incoming node
 
   // freeze & add
-  node = fstCompiler.addNode(nodeIn);
+  FSTCompiler.NodeAndBuffer nodeAndBuffer = fstCompiler.addNode(nodeIn, true);

Review Comment:
   Can we use logic like below to avoid changing `FSTCompiler#addNode`?
   ```
   long startAddress = fstCompiler.bytes.getPosition();
   node = fstCompiler.addNode(nodeIn);
   byte[] bytes = null;
   if (node != FST.FINAL_END_NODE && node != FST.NON_FINAL_END_NODE) {
     bytes = new byte[Math.toIntExact(node - startAddress + 1)];
     fstCompiler.bytes.copyBytes(startAddress, bytes, 0, bytes.length);
   }
   ```






Re: [PR] Use value-based LRU cache in NodeHash [lucene]

2023-11-01 Thread via GitHub


gf2121 commented on code in PR #12738:
URL: https://github.com/apache/lucene/pull/12738#discussion_r1378573421


##
lucene/core/src/java/org/apache/lucene/util/fst/NodeHash.java:
##
@@ -186,119 +193,85 @@ private long hash(FSTCompiler.UnCompiledNode node) {
 return h;
   }
 
-  // hash code for a frozen node.  this must precisely match the hash 
computation of an unfrozen
-  // node!
-  private long hash(long node) throws IOException {
-final int PRIME = 31;
-
-long h = 0;
-fstCompiler.fst.readFirstRealTargetArc(node, scratchArc, in);
-while (true) {
-  h = PRIME * h + scratchArc.label();
-  h = PRIME * h + (int) (scratchArc.target() ^ (scratchArc.target() >> 
32));
-  h = PRIME * h + scratchArc.output().hashCode();
-  h = PRIME * h + scratchArc.nextFinalOutput().hashCode();
-  if (scratchArc.isFinal()) {
-h += 17;
-  }
-  if (scratchArc.isLast()) {
-break;
-  }
-  fstCompiler.fst.readNextRealArc(scratchArc, in);
-}
-
-return h;
-  }
-
-  /**
-   * Compares an unfrozen node (UnCompiledNode) with a frozen node at byte 
location address (long),
-   * returning true if they are equal.
-   */
-  private boolean nodesEqual(FSTCompiler.UnCompiledNode node, long address) 
throws IOException {
-fstCompiler.fst.readFirstRealTargetArc(address, scratchArc, in);
-
-// fail fast for a node with fixed length arcs
-if (scratchArc.bytesPerArc() != 0) {
-  assert node.numArcs > 0;
-  // the frozen node uses fixed-with arc encoding (same number of bytes 
per arc), but may be
-  // sparse or dense
-  switch (scratchArc.nodeFlags()) {
-case FST.ARCS_FOR_BINARY_SEARCH:
-  // sparse
-  if (node.numArcs != scratchArc.numArcs()) {
-return false;
-  }
-  break;
-case FST.ARCS_FOR_DIRECT_ADDRESSING:
-  // dense -- compare both the number of labels allocated in the array 
(some of which may
-  // not actually be arcs), and the number of arcs
-  if ((node.arcs[node.numArcs - 1].label - node.arcs[0].label + 1) != 
scratchArc.numArcs()
-  || node.numArcs != FST.Arc.BitTable.countBits(scratchArc, in)) {
-return false;
-  }
-  break;
-default:
-  throw new AssertionError("unhandled scratchArc.nodeFlag() " + 
scratchArc.nodeFlags());
-  }
-}
-
-// compare arc by arc to see if there is a difference
-for (int arcUpto = 0; arcUpto < node.numArcs; arcUpto++) {
-  final FSTCompiler.Arc arc = node.arcs[arcUpto];
-  if (arc.label != scratchArc.label()
-  || arc.output.equals(scratchArc.output()) == false
-  || ((FSTCompiler.CompiledNode) arc.target).node != 
scratchArc.target()
-  || arc.nextFinalOutput.equals(scratchArc.nextFinalOutput()) == false
-  || arc.isFinal != scratchArc.isFinal()) {
-return false;
-  }
-
-  if (scratchArc.isLast()) {
-if (arcUpto == node.numArcs - 1) {
-  return true;
-} else {
-  return false;
-}
-  }
-
-  fstCompiler.fst.readNextRealArc(scratchArc, in);
-}
-
-// unfrozen node has fewer arcs than frozen node
-
-return false;
-  }
-
   /** Inner class because it needs access to hash function and FST bytes. */
   private class PagedGrowableHash {
+public long copiedBytes;
 private PagedGrowableWriter entries;
+// mapping from FST real address to copiedNodes offsets
+private PagedGrowableWriter copiedOffsets;
 private long count;
+// mask for entries
 private long mask;
+// mask for copiedOffsets
+private long offsetMask;
+private final List<byte[]> copiedNodes;

Review Comment:
   Can we use a data structure like `BytesRefArray` to be more memory efficient?






Re: [PR] Use value-based LRU cache in NodeHash [lucene]

2023-11-01 Thread via GitHub


dungba88 commented on code in PR #12738:
URL: https://github.com/apache/lucene/pull/12738#discussion_r1378577965


##
lucene/core/src/java/org/apache/lucene/util/fst/NodeHash.java:
##
@@ -110,25 +110,34 @@ public long add(FSTCompiler.UnCompiledNode nodeIn) throws IOException {
 node = getFallback(nodeIn, hash);
 if (node != 0) {
   // it was already in fallback -- promote to primary
-  primaryTable.set(pos, node);
+  primaryTable.set(pos, node, fallbackTable.getBytes(node));
 } else {
   // not in fallback either -- freeze & add the incoming node
 
   // freeze & add
-  node = fstCompiler.addNode(nodeIn);
+  FSTCompiler.NodeAndBuffer nodeAndBuffer = fstCompiler.addNode(nodeIn, true);

Review Comment:
   I was thinking about this too. I can go with this for now; it just means we assume the BytesStore still holds those bytes after adding nodes. That kind of conflicts with the idea of flushing the BytesStore after adding nodes that I had in https://github.com/apache/lucene/pull/12624
   
   But I guess we could flush the BytesStore at the beginning of the method instead.






Re: [PR] Use value-based LRU cache in NodeHash [lucene]

2023-11-01 Thread via GitHub


mikemccand commented on code in PR #12738:
URL: https://github.com/apache/lucene/pull/12738#discussion_r1378599798


##
lucene/core/src/java/org/apache/lucene/util/fst/NodeHash.java:
##
@@ -214,7 +222,13 @@ private long hash(long node) throws IOException {
* Compares an unfrozen node (UnCompiledNode) with a frozen node at byte location address (long),
* returning true if they are equal.
*/
-  private boolean nodesEqual(FSTCompiler.UnCompiledNode node, long address) throws IOException {
+  private boolean nodesEqual(
+  PagedGrowableHash table, FSTCompiler.UnCompiledNode node, long address)
+  throws IOException {
+// nocommit: this is non-optimal, we should have a BytesReader that wraps and reads the
+// ByteBlockPool directly
+byte[] bytes = table.getBytes(address);

Review Comment:
   Ahhh that's right, the hash map must retain the true FST offset since that's what future added nodes must link to!






Re: [PR] Use value-based LRU cache in NodeHash [lucene]

2023-11-01 Thread via GitHub


dungba88 commented on code in PR #12738:
URL: https://github.com/apache/lucene/pull/12738#discussion_r1378601593


##
lucene/core/src/java/org/apache/lucene/util/fst/NodeHash.java:
##
@@ -186,119 +193,85 @@ private long hash(FSTCompiler.UnCompiledNode node) {
 return h;
   }
 
-  // hash code for a frozen node.  this must precisely match the hash 
computation of an unfrozen
-  // node!
-  private long hash(long node) throws IOException {
-final int PRIME = 31;
-
-long h = 0;
-fstCompiler.fst.readFirstRealTargetArc(node, scratchArc, in);
-while (true) {
-  h = PRIME * h + scratchArc.label();
-  h = PRIME * h + (int) (scratchArc.target() ^ (scratchArc.target() >> 
32));
-  h = PRIME * h + scratchArc.output().hashCode();
-  h = PRIME * h + scratchArc.nextFinalOutput().hashCode();
-  if (scratchArc.isFinal()) {
-h += 17;
-  }
-  if (scratchArc.isLast()) {
-break;
-  }
-  fstCompiler.fst.readNextRealArc(scratchArc, in);
-}
-
-return h;
-  }
-
-  /**
-   * Compares an unfrozen node (UnCompiledNode) with a frozen node at byte 
location address (long),
-   * returning true if they are equal.
-   */
-  private boolean nodesEqual(FSTCompiler.UnCompiledNode node, long address) 
throws IOException {
-fstCompiler.fst.readFirstRealTargetArc(address, scratchArc, in);
-
-// fail fast for a node with fixed length arcs
-if (scratchArc.bytesPerArc() != 0) {
-  assert node.numArcs > 0;
-  // the frozen node uses fixed-with arc encoding (same number of bytes 
per arc), but may be
-  // sparse or dense
-  switch (scratchArc.nodeFlags()) {
-case FST.ARCS_FOR_BINARY_SEARCH:
-  // sparse
-  if (node.numArcs != scratchArc.numArcs()) {
-return false;
-  }
-  break;
-case FST.ARCS_FOR_DIRECT_ADDRESSING:
-  // dense -- compare both the number of labels allocated in the array 
(some of which may
-  // not actually be arcs), and the number of arcs
-  if ((node.arcs[node.numArcs - 1].label - node.arcs[0].label + 1) != 
scratchArc.numArcs()
-  || node.numArcs != FST.Arc.BitTable.countBits(scratchArc, in)) {
-return false;
-  }
-  break;
-default:
-  throw new AssertionError("unhandled scratchArc.nodeFlag() " + 
scratchArc.nodeFlags());
-  }
-}
-
-// compare arc by arc to see if there is a difference
-for (int arcUpto = 0; arcUpto < node.numArcs; arcUpto++) {
-  final FSTCompiler.Arc arc = node.arcs[arcUpto];
-  if (arc.label != scratchArc.label()
-  || arc.output.equals(scratchArc.output()) == false
-  || ((FSTCompiler.CompiledNode) arc.target).node != 
scratchArc.target()
-  || arc.nextFinalOutput.equals(scratchArc.nextFinalOutput()) == false
-  || arc.isFinal != scratchArc.isFinal()) {
-return false;
-  }
-
-  if (scratchArc.isLast()) {
-if (arcUpto == node.numArcs - 1) {
-  return true;
-} else {
-  return false;
-}
-  }
-
-  fstCompiler.fst.readNextRealArc(scratchArc, in);
-}
-
-// unfrozen node has fewer arcs than frozen node
-
-return false;
-  }
-
   /** Inner class because it needs access to hash function and FST bytes. */
   private class PagedGrowableHash {
+public long copiedBytes;
 private PagedGrowableWriter entries;
+// mapping from FST real address to copiedNodes offsets
+private PagedGrowableWriter copiedOffsets;
 private long count;
+// mask for entries
 private long mask;
+// mask for copiedOffsets
+private long offsetMask;
+private final List<byte[]> copiedNodes;

Review Comment:
   I was trying to avoid a double byte copy, which is why I used the list. I looked at `BytesRefArray` and it seems it would still have the double-copy problem, like `ByteBlockPool` (see the sketch after this list):
   - When we call `append(BytesRef)`, the byte[] is copied into the `ByteBlockPool`. We already pay one copy when (1) reading from the BytesStore on adding a node and (2) reading from the fallback table on promotion.
   - Similarly, calling `get(BytesRef, int)` requires a copy from the `ByteBlockPool` into a temporary `byte[]`.
   - With `List<byte[]>`, on promotion we can simply share the node's `byte[]`: no copy there, and the memory is shared, so I think it would be more efficient in both CPU and memory footprint.
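
   For reference, a minimal sketch of the copy behavior being discussed, using `BytesRefArray`'s `append`/`get` (the node bytes here are hypothetical placeholders for frozen-node bytes):
   ```java
   import org.apache.lucene.util.BytesRef;
   import org.apache.lucene.util.BytesRefArray;
   import org.apache.lucene.util.BytesRefBuilder;
   import org.apache.lucene.util.Counter;

   public class BytesRefArrayCopies {
     public static void main(String[] args) {
       BytesRefArray arr = new BytesRefArray(Counter.newCounter());

       // copy #1 (hypothetical): frozen-node bytes read out of the BytesStore into a temp byte[]
       byte[] frozenNode = new byte[] {1, 2, 3};

       // copy #2: append() copies the temp byte[] into the array's internal ByteBlockPool
       int index = arr.append(new BytesRef(frozenNode));

       // copy #3: get() copies the bytes back out of the pool via the spare builder
       BytesRefBuilder spare = new BytesRefBuilder();
       BytesRef view = arr.get(spare, index);
       System.out.println(view.length); // 3
     }
   }
   ```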






Re: [PR] Use value-based LRU cache in NodeHash [lucene]

2023-11-01 Thread via GitHub


mikemccand commented on PR #12738:
URL: https://github.com/apache/lucene/pull/12738#issuecomment-1788744551

   > But I ended up using a `List<byte[]>` where each item is a node instead of ByteBlockPool due to the following reasons:
   
   Hmm -- this is sizable added RAM overhead per entry.  Added array header (16 or 20 bytes), added pointers to these arrays (4 or 8 bytes), tiny objects for GC to crawl, though maybe they mostly die young if the RAM budget is low enough.  Yes, we can share some of these `byte[]` with both hashes but I suspect that's not a net/net win.
   
   > We automatically get the length of the node without any traversing
   
   We are paying 4 bytes per entry for this :)  And it seems a shame since FST's node encoding is already self-delimiting, and, as a side effect of looking up that node we will already have computed that length.
   
   > With ByteBlockPool we have 2 unavoidable double byte-copies: (1) when writing from the BytesStore to the primary table and (2) when promoting an entry from the fallback table to the primary table. In both situations we need to first write into a temporary byte[].
   
   I don't think this added cost is so much.  We freeze the node into the true FST appending byte store, then we copy those "last N bytes" over to the primary hash.  Eventually it's moved to fallback, and maybe it never gets promoted back (single copy), or maybe it does (+1 copy), but that's "worth it" since we achieve some minimization, i.e. the cost correlates nicely with the win (the whole point of this double LRU hash).
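
   A rough, hedged estimate of the per-entry overhead described above (the header and reference sizes are assumed HotSpot-with-compressed-oops numbers, and the node size is hypothetical; actual values vary by JVM):
   ```java
   // Rough per-entry cost of storing each frozen node as its own byte[].
   public class PerEntryOverhead {
     public static void main(String[] args) {
       int arrayHeader = 16;  // assumed object header + length word, padded
       int reference = 4;     // assumed compressed reference held by the list
       int nodeBytes = 12;    // hypothetical average frozen-node size
       int perEntry = arrayHeader + reference + nodeBytes;
       System.out.println("bytes per entry: " + perEntry
           + " (overhead " + (arrayHeader + reference) + " of " + perEntry + ")");
     }
   }
   ```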





Re: [PR] Use value-based LRU cache in NodeHash [lucene]

2023-11-01 Thread via GitHub


dungba88 commented on PR #12738:
URL: https://github.com/apache/lucene/pull/12738#issuecomment-1788758864

   > Eventually it's moved to fallback, and, maybe it never gets promoted back (single copy), or, maybe it does (+1 copy)
   
   There is actually already one copy before this, which is where we read from the BytesStore into the temporary byte[]. So if it never gets promoted it's 2 copies, and when it does, it's 4 copies (one to read from the fallback table into a temporary byte[] and one to write to the primary table).
   
   When it gets promoted, we also have the side effect that the byte[] can be shared between the primary and fallback tables.
   
   I see there is a tradeoff here. If we don't care too much about CPU then we can use BytesRefArray as @gf2121 suggested.





Re: [PR] Use value-based LRU cache in NodeHash [lucene]

2023-11-01 Thread via GitHub


mikemccand commented on code in PR #12738:
URL: https://github.com/apache/lucene/pull/12738#discussion_r1378650643


##
lucene/core/src/java/org/apache/lucene/util/fst/NodeHash.java:
##
@@ -186,119 +197,85 @@ private long hash(FSTCompiler.UnCompiledNode node) {
 return h;
   }
 
-  // hash code for a frozen node.  this must precisely match the hash 
computation of an unfrozen
-  // node!
-  private long hash(long node) throws IOException {
-final int PRIME = 31;
-
-long h = 0;
-fstCompiler.fst.readFirstRealTargetArc(node, scratchArc, in);
-while (true) {
-  h = PRIME * h + scratchArc.label();
-  h = PRIME * h + (int) (scratchArc.target() ^ (scratchArc.target() >> 
32));
-  h = PRIME * h + scratchArc.output().hashCode();
-  h = PRIME * h + scratchArc.nextFinalOutput().hashCode();
-  if (scratchArc.isFinal()) {
-h += 17;
-  }
-  if (scratchArc.isLast()) {
-break;
-  }
-  fstCompiler.fst.readNextRealArc(scratchArc, in);
-}
-
-return h;
-  }
-
-  /**
-   * Compares an unfrozen node (UnCompiledNode) with a frozen node at byte 
location address (long),
-   * returning true if they are equal.
-   */
-  private boolean nodesEqual(FSTCompiler.UnCompiledNode node, long address) 
throws IOException {
-fstCompiler.fst.readFirstRealTargetArc(address, scratchArc, in);
-
-// fail fast for a node with fixed length arcs
-if (scratchArc.bytesPerArc() != 0) {
-  assert node.numArcs > 0;
-  // the frozen node uses fixed-with arc encoding (same number of bytes 
per arc), but may be
-  // sparse or dense
-  switch (scratchArc.nodeFlags()) {
-case FST.ARCS_FOR_BINARY_SEARCH:
-  // sparse
-  if (node.numArcs != scratchArc.numArcs()) {
-return false;
-  }
-  break;
-case FST.ARCS_FOR_DIRECT_ADDRESSING:
-  // dense -- compare both the number of labels allocated in the array 
(some of which may
-  // not actually be arcs), and the number of arcs
-  if ((node.arcs[node.numArcs - 1].label - node.arcs[0].label + 1) != 
scratchArc.numArcs()
-  || node.numArcs != FST.Arc.BitTable.countBits(scratchArc, in)) {
-return false;
-  }
-  break;
-default:
-  throw new AssertionError("unhandled scratchArc.nodeFlag() " + 
scratchArc.nodeFlags());
-  }
-}
-
-// compare arc by arc to see if there is a difference
-for (int arcUpto = 0; arcUpto < node.numArcs; arcUpto++) {
-  final FSTCompiler.Arc arc = node.arcs[arcUpto];
-  if (arc.label != scratchArc.label()
-  || arc.output.equals(scratchArc.output()) == false
-  || ((FSTCompiler.CompiledNode) arc.target).node != 
scratchArc.target()
-  || arc.nextFinalOutput.equals(scratchArc.nextFinalOutput()) == false
-  || arc.isFinal != scratchArc.isFinal()) {
-return false;
-  }
-
-  if (scratchArc.isLast()) {
-if (arcUpto == node.numArcs - 1) {
-  return true;
-} else {
-  return false;
-}
-  }
-
-  fstCompiler.fst.readNextRealArc(scratchArc, in);
-}
-
-// unfrozen node has fewer arcs than frozen node
-
-return false;
-  }
-
   /** Inner class because it needs access to hash function and FST bytes. */
   private class PagedGrowableHash {
+public long copiedBytes;
 private PagedGrowableWriter entries;
+// mapping from FST real address to copiedNodes offsets
+private PagedGrowableWriter copiedOffsets;
 private long count;
+// mask for entries
 private long mask;
+// mask for copiedOffsets
+private long offsetMask;
+private final List<byte[]> copiedNodes;
 
  // 256K blocks, but note that the final block is sized only as needed so it won't use the full
 // block size when just a few elements were written to it
 private static final int BLOCK_SIZE_BYTES = 1 << 18;
 
 public PagedGrowableHash() {
    entries = new PagedGrowableWriter(16, BLOCK_SIZE_BYTES, 8, PackedInts.COMPACT);
+  copiedOffsets = new PagedGrowableWriter(32, BLOCK_SIZE_BYTES, 8, PackedInts.COMPACT);
   mask = 15;
+  offsetMask = 31;
+  copiedNodes = new ArrayList<>();
 }
 
 public PagedGrowableHash(long lastNodeAddress, long size) {
   entries =
   new PagedGrowableWriter(
    size, BLOCK_SIZE_BYTES, PackedInts.bitsRequired(lastNodeAddress), PackedInts.COMPACT);
+  copiedOffsets =
+  new PagedGrowableWriter(
+  size * 2,
+  BLOCK_SIZE_BYTES,
+  PackedInts.bitsRequired(lastNodeAddress),
+  PackedInts.COMPACT);
   mask = size - 1;
+  offsetMask = size * 2 - 1;

Review Comment:
   Hmm -- why are `entries` and `copiedOffsets` not simply side-by-side values arrays for the hash map?  Why is `copiedOffsets` 2X the size?  It seems like we could have them precisely match (side by side value array
Re: [PR] Remove patching for doc blocks. [lucene]

2023-11-01 Thread via GitHub


slow-J commented on PR #12741:
URL: https://github.com/apache/lucene/pull/12741#issuecomment-1788786005

   > Thanks @slow-J ! I left some minor comments about additional `90` -> `99` refactoring.
   
   Thanks @gf2121 , committed all the suggestions.





Re: [PR] Use value-based LRU cache in NodeHash [lucene]

2023-11-01 Thread via GitHub


gf2121 commented on code in PR #12738:
URL: https://github.com/apache/lucene/pull/12738#discussion_r1378670256


##
lucene/core/src/java/org/apache/lucene/util/fst/NodeHash.java:
##
@@ -186,119 +193,85 @@ private long hash(FSTCompiler.UnCompiledNode node) {
 return h;
   }
 
-  // hash code for a frozen node.  this must precisely match the hash 
computation of an unfrozen
-  // node!
-  private long hash(long node) throws IOException {
-final int PRIME = 31;
-
-long h = 0;
-fstCompiler.fst.readFirstRealTargetArc(node, scratchArc, in);
-while (true) {
-  h = PRIME * h + scratchArc.label();
-  h = PRIME * h + (int) (scratchArc.target() ^ (scratchArc.target() >> 
32));
-  h = PRIME * h + scratchArc.output().hashCode();
-  h = PRIME * h + scratchArc.nextFinalOutput().hashCode();
-  if (scratchArc.isFinal()) {
-h += 17;
-  }
-  if (scratchArc.isLast()) {
-break;
-  }
-  fstCompiler.fst.readNextRealArc(scratchArc, in);
-}
-
-return h;
-  }
-
-  /**
-   * Compares an unfrozen node (UnCompiledNode) with a frozen node at byte 
location address (long),
-   * returning true if they are equal.
-   */
-  private boolean nodesEqual(FSTCompiler.UnCompiledNode node, long address) 
throws IOException {
-fstCompiler.fst.readFirstRealTargetArc(address, scratchArc, in);
-
-// fail fast for a node with fixed length arcs
-if (scratchArc.bytesPerArc() != 0) {
-  assert node.numArcs > 0;
-  // the frozen node uses fixed-with arc encoding (same number of bytes 
per arc), but may be
-  // sparse or dense
-  switch (scratchArc.nodeFlags()) {
-case FST.ARCS_FOR_BINARY_SEARCH:
-  // sparse
-  if (node.numArcs != scratchArc.numArcs()) {
-return false;
-  }
-  break;
-case FST.ARCS_FOR_DIRECT_ADDRESSING:
-  // dense -- compare both the number of labels allocated in the array 
(some of which may
-  // not actually be arcs), and the number of arcs
-  if ((node.arcs[node.numArcs - 1].label - node.arcs[0].label + 1) != 
scratchArc.numArcs()
-  || node.numArcs != FST.Arc.BitTable.countBits(scratchArc, in)) {
-return false;
-  }
-  break;
-default:
-  throw new AssertionError("unhandled scratchArc.nodeFlag() " + 
scratchArc.nodeFlags());
-  }
-}
-
-// compare arc by arc to see if there is a difference
-for (int arcUpto = 0; arcUpto < node.numArcs; arcUpto++) {
-  final FSTCompiler.Arc arc = node.arcs[arcUpto];
-  if (arc.label != scratchArc.label()
-  || arc.output.equals(scratchArc.output()) == false
-  || ((FSTCompiler.CompiledNode) arc.target).node != 
scratchArc.target()
-  || arc.nextFinalOutput.equals(scratchArc.nextFinalOutput()) == false
-  || arc.isFinal != scratchArc.isFinal()) {
-return false;
-  }
-
-  if (scratchArc.isLast()) {
-if (arcUpto == node.numArcs - 1) {
-  return true;
-} else {
-  return false;
-}
-  }
-
-  fstCompiler.fst.readNextRealArc(scratchArc, in);
-}
-
-// unfrozen node has fewer arcs than frozen node
-
-return false;
-  }
-
   /** Inner class because it needs access to hash function and FST bytes. */
   private class PagedGrowableHash {
+public long copiedBytes;
 private PagedGrowableWriter entries;
+// mapping from FST real address to copiedNodes offsets
+private PagedGrowableWriter copiedOffsets;
 private long count;
+// mask for entries
 private long mask;
+// mask for copiedOffsets
+private long offsetMask;
+private final List<byte[]> copiedNodes;

Review Comment:
   For reference, you may find a great chart showing the GC difference between `many tiny byte[]` and `big byte[] with offsets` [here](https://github.com/elastic/elasticsearch/pull/21776#issuecomment-262743775) :)






Re: [I] Should we explore DiskANN for aKNN vector search? [lucene]

2023-11-01 Thread via GitHub


benwtrent commented on issue #12615:
URL: https://github.com/apache/lucene/issues/12615#issuecomment-1788822945

   So, I replicated the jvector benchmark (the lucene part) using the new int8 quantization.
   
   Note, this is with `0` fan out or extra top-k gathered. Since the benchmark on JVector didn't specify any recall, etc., I just did the absolute baseline of `top-100`.
   
   I reserved 12GB to heap, thus reducing off-heap memory to at most 30GB.
   
   ```
   1 thread over 37 segments:
   completed 1000 searches in 18411 ms: 54 QPS CPU time=18231ms
   checking results
   0.777  18.23   1   0   16  100 100 0   1.00  post-filter
   ```
   
   Since kNN allows the segments to be searched in parallel, I used 8 threads for the query rewrite
   ```
   8 threads over 37 segments:
   completed 1000 searches in 2996 ms: 333 QPS CPU time=218ms
   checking results
   0.777   0.22   1   0   16  100 100 0   1.00  post-filter
   ```
   
   I am currently force-merging to a single segment to see what a single graph gives us.
   
   FYI: the data set would require > 40GB of RAM to be held in memory. With int8 quantization, it's down to around 10GB.
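
   As a sanity check on those numbers, assuming the raw vectors are float32 and int8 quantization stores one byte per dimension (ignoring the small per-vector correction terms):
   ```java
   // Back-of-the-envelope: int8 keeps 1 byte/dimension vs 4 bytes for float32,
   // so the off-heap footprint drops by roughly 4x.
   public class QuantizationFootprint {
     public static void main(String[] args) {
       double rawGb = 40.0;              // reported raw (float32) requirement
       double quantizedGb = rawGb / 4;   // 1 byte vs 4 bytes per dimension
       System.out.println("approx quantized footprint: " + quantizedGb + " GB");
     }
   }
   ```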





Re: [PR] Use value-based LRU cache in NodeHash [lucene]

2023-11-01 Thread via GitHub


dungba88 commented on code in PR #12738:
URL: https://github.com/apache/lucene/pull/12738#discussion_r1378703915


##
lucene/core/src/java/org/apache/lucene/util/fst/NodeHash.java:
##
@@ -186,119 +197,85 @@ private long hash(FSTCompiler.UnCompiledNode node) {
 return h;
   }
 
-  // hash code for a frozen node.  this must precisely match the hash 
computation of an unfrozen
-  // node!
-  private long hash(long node) throws IOException {
-final int PRIME = 31;
-
-long h = 0;
-fstCompiler.fst.readFirstRealTargetArc(node, scratchArc, in);
-while (true) {
-  h = PRIME * h + scratchArc.label();
-  h = PRIME * h + (int) (scratchArc.target() ^ (scratchArc.target() >> 
32));
-  h = PRIME * h + scratchArc.output().hashCode();
-  h = PRIME * h + scratchArc.nextFinalOutput().hashCode();
-  if (scratchArc.isFinal()) {
-h += 17;
-  }
-  if (scratchArc.isLast()) {
-break;
-  }
-  fstCompiler.fst.readNextRealArc(scratchArc, in);
-}
-
-return h;
-  }
-
-  /**
-   * Compares an unfrozen node (UnCompiledNode) with a frozen node at byte 
location address (long),
-   * returning true if they are equal.
-   */
-  private boolean nodesEqual(FSTCompiler.UnCompiledNode node, long address) 
throws IOException {
-fstCompiler.fst.readFirstRealTargetArc(address, scratchArc, in);
-
-// fail fast for a node with fixed length arcs
-if (scratchArc.bytesPerArc() != 0) {
-  assert node.numArcs > 0;
-  // the frozen node uses fixed-with arc encoding (same number of bytes 
per arc), but may be
-  // sparse or dense
-  switch (scratchArc.nodeFlags()) {
-case FST.ARCS_FOR_BINARY_SEARCH:
-  // sparse
-  if (node.numArcs != scratchArc.numArcs()) {
-return false;
-  }
-  break;
-case FST.ARCS_FOR_DIRECT_ADDRESSING:
-  // dense -- compare both the number of labels allocated in the array 
(some of which may
-  // not actually be arcs), and the number of arcs
-  if ((node.arcs[node.numArcs - 1].label - node.arcs[0].label + 1) != 
scratchArc.numArcs()
-  || node.numArcs != FST.Arc.BitTable.countBits(scratchArc, in)) {
-return false;
-  }
-  break;
-default:
-  throw new AssertionError("unhandled scratchArc.nodeFlag() " + 
scratchArc.nodeFlags());
-  }
-}
-
-// compare arc by arc to see if there is a difference
-for (int arcUpto = 0; arcUpto < node.numArcs; arcUpto++) {
-  final FSTCompiler.Arc arc = node.arcs[arcUpto];
-  if (arc.label != scratchArc.label()
-  || arc.output.equals(scratchArc.output()) == false
-  || ((FSTCompiler.CompiledNode) arc.target).node != 
scratchArc.target()
-  || arc.nextFinalOutput.equals(scratchArc.nextFinalOutput()) == false
-  || arc.isFinal != scratchArc.isFinal()) {
-return false;
-  }
-
-  if (scratchArc.isLast()) {
-if (arcUpto == node.numArcs - 1) {
-  return true;
-} else {
-  return false;
-}
-  }
-
-  fstCompiler.fst.readNextRealArc(scratchArc, in);
-}
-
-// unfrozen node has fewer arcs than frozen node
-
-return false;
-  }
-
   /** Inner class because it needs access to hash function and FST bytes. */
   private class PagedGrowableHash {
+public long copiedBytes;
 private PagedGrowableWriter entries;
+// mapping from FST real address to copiedNodes offsets
+private PagedGrowableWriter copiedOffsets;
 private long count;
+// mask for entries
 private long mask;
+// mask for copiedOffsets
+private long offsetMask;
+private final List<byte[]> copiedNodes;
 
 // 256K blocks, but note that the final block is sized only as needed so 
it won't use the full
 // block size when just a few elements were written to it
 private static final int BLOCK_SIZE_BYTES = 1 << 18;
 
 public PagedGrowableHash() {
   entries = new PagedGrowableWriter(16, BLOCK_SIZE_BYTES, 8, 
PackedInts.COMPACT);
+  copiedOffsets = new PagedGrowableWriter(32, BLOCK_SIZE_BYTES, 8, 
PackedInts.COMPACT);
   mask = 15;
+  offsetMask = 31;
+  copiedNodes = new ArrayList<>();
 }
 
 public PagedGrowableHash(long lastNodeAddress, long size) {
   entries =
   new PagedGrowableWriter(
   size, BLOCK_SIZE_BYTES, 
PackedInts.bitsRequired(lastNodeAddress), PackedInts.COMPACT);
+  copiedOffsets =
+  new PagedGrowableWriter(
+  size * 2,
+  BLOCK_SIZE_BYTES,
+  PackedInts.bitsRequired(lastNodeAddress),
+  PackedInts.COMPACT);
   mask = size - 1;
+  offsetMask = size * 2 - 1;

Review Comment:
   The index of `entries` is the hash of the node arcs, while the index of `copiedOffsets` is the hash of the address, hence their positions do not match.
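
   A minimal illustration of that point (the field names, sizes, and values here are illustrative, not the PR's actual code): each table is open-addressed under a different key, so the same node generally lands in different slots.
   ```java
   public class TwoHashTables {
     public static void main(String[] args) {
       long[] entries = new long[16];        // keyed by hash of the node's arcs
       long[] copiedOffsets = new long[32];  // keyed by hash of the node's FST address

       long arcsHash = 0x9E3779B97F4A7C15L;  // hypothetical arcs hash of the unfrozen node
       long address = 12345L;                // hypothetical frozen-node address in the FST

       int entriesPos = (int) (arcsHash & (entries.length - 1));
       int offsetsPos = Long.hashCode(address) & (copiedOffsets.length - 1);
       // The two positions come from different keys over different table sizes,
       // so they generally differ: the tables cannot be side-by-side arrays.
       System.out.println(entriesPos + " vs " + offsetsPos);
     }
   }
   ```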




Re: [PR] Use value-based LRU cache in NodeHash [lucene]

2023-11-01 Thread via GitHub


dungba88 commented on code in PR #12738:
URL: https://github.com/apache/lucene/pull/12738#discussion_r1378704913


##
lucene/core/src/java/org/apache/lucene/util/fst/NodeHash.java:
##
@@ -186,119 +197,85 @@ private long hash(FSTCompiler.UnCompiledNode node) {
 return h;
   }
 
-  // hash code for a frozen node.  this must precisely match the hash 
computation of an unfrozen
-  // node!
-  private long hash(long node) throws IOException {
-final int PRIME = 31;
-
-long h = 0;
-fstCompiler.fst.readFirstRealTargetArc(node, scratchArc, in);
-while (true) {
-  h = PRIME * h + scratchArc.label();
-  h = PRIME * h + (int) (scratchArc.target() ^ (scratchArc.target() >> 
32));
-  h = PRIME * h + scratchArc.output().hashCode();
-  h = PRIME * h + scratchArc.nextFinalOutput().hashCode();
-  if (scratchArc.isFinal()) {
-h += 17;
-  }
-  if (scratchArc.isLast()) {
-break;
-  }
-  fstCompiler.fst.readNextRealArc(scratchArc, in);
-}
-
-return h;
-  }
-
-  /**
-   * Compares an unfrozen node (UnCompiledNode) with a frozen node at byte 
location address (long),
-   * returning true if they are equal.
-   */
-  private boolean nodesEqual(FSTCompiler.UnCompiledNode node, long address) 
throws IOException {
-fstCompiler.fst.readFirstRealTargetArc(address, scratchArc, in);
-
-// fail fast for a node with fixed length arcs
-if (scratchArc.bytesPerArc() != 0) {
-  assert node.numArcs > 0;
-  // the frozen node uses fixed-with arc encoding (same number of bytes 
per arc), but may be
-  // sparse or dense
-  switch (scratchArc.nodeFlags()) {
-case FST.ARCS_FOR_BINARY_SEARCH:
-  // sparse
-  if (node.numArcs != scratchArc.numArcs()) {
-return false;
-  }
-  break;
-case FST.ARCS_FOR_DIRECT_ADDRESSING:
-  // dense -- compare both the number of labels allocated in the array 
(some of which may
-  // not actually be arcs), and the number of arcs
-  if ((node.arcs[node.numArcs - 1].label - node.arcs[0].label + 1) != 
scratchArc.numArcs()
-  || node.numArcs != FST.Arc.BitTable.countBits(scratchArc, in)) {
-return false;
-  }
-  break;
-default:
-  throw new AssertionError("unhandled scratchArc.nodeFlag() " + 
scratchArc.nodeFlags());
-  }
-}
-
-// compare arc by arc to see if there is a difference
-for (int arcUpto = 0; arcUpto < node.numArcs; arcUpto++) {
-  final FSTCompiler.Arc arc = node.arcs[arcUpto];
-  if (arc.label != scratchArc.label()
-  || arc.output.equals(scratchArc.output()) == false
-  || ((FSTCompiler.CompiledNode) arc.target).node != 
scratchArc.target()
-  || arc.nextFinalOutput.equals(scratchArc.nextFinalOutput()) == false
-  || arc.isFinal != scratchArc.isFinal()) {
-return false;
-  }
-
-  if (scratchArc.isLast()) {
-if (arcUpto == node.numArcs - 1) {
-  return true;
-} else {
-  return false;
-}
-  }
-
-  fstCompiler.fst.readNextRealArc(scratchArc, in);
-}
-
-// unfrozen node has fewer arcs than frozen node
-
-return false;
-  }
-
   /** Inner class because it needs access to hash function and FST bytes. */
   private class PagedGrowableHash {
+public long copiedBytes;
 private PagedGrowableWriter entries;
+// mapping from FST real address to copiedNodes offsets
+private PagedGrowableWriter copiedOffsets;
 private long count;
+// mask for entries
 private long mask;
+// mask for copiedOffsets
+private long offsetMask;
+private final List<byte[]> copiedNodes;
 
  // 256K blocks, but note that the final block is sized only as needed so it won't use the full
 // block size when just a few elements were written to it
 private static final int BLOCK_SIZE_BYTES = 1 << 18;
 
 public PagedGrowableHash() {
    entries = new PagedGrowableWriter(16, BLOCK_SIZE_BYTES, 8, PackedInts.COMPACT);
+  copiedOffsets = new PagedGrowableWriter(32, BLOCK_SIZE_BYTES, 8, PackedInts.COMPACT);
   mask = 15;
+  offsetMask = 31;
+  copiedNodes = new ArrayList<>();
 }
 
 public PagedGrowableHash(long lastNodeAddress, long size) {
   entries =
   new PagedGrowableWriter(
    size, BLOCK_SIZE_BYTES, PackedInts.bitsRequired(lastNodeAddress), PackedInts.COMPACT);
+  copiedOffsets =
+  new PagedGrowableWriter(
+  size * 2,
+  BLOCK_SIZE_BYTES,
+  PackedInts.bitsRequired(lastNodeAddress),
+  PackedInts.COMPACT);
   mask = size - 1;
+  offsetMask = size * 2 - 1;
   assert (mask & size) == 0 : "size must be a power-of-2; got size=" + size + " mask=" + mask;
+  copiedNodes = new ArrayList<>();
+}
+
+public byte[] getBytes(long node) {
+  long pos = offsetHash(node) & offsetMask;

Re: [I] Would SIMD powered sort (on top of Panama) be worth it? [lucene]

2023-11-01 Thread via GitHub


mikemccand commented on issue #12399:
URL: https://github.com/apache/lucene/issues/12399#issuecomment-1788845861

   Some exciting updates here ... Intel continues to improve this "super fast sorting using SIMD instructions" library: https://www.phoronix.com/news/Intel-x86-simd-sort-4.0
   
   And apparently NumPy (fast Python numeric extensions) adopted it, and now also OpenJDK: https://www.phoronix.com/news/Intel-x86-simd-sort-3.0
   
   I wonder which version of OpenJDK we will first see this released in?  So, yeah, we should try to use `Arrays.sort` more in Lucene instead of our own sort impls when possible!
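
   For illustration, the kind of substitution being suggested: prefer the JDK's `Arrays.sort` for primitive arrays (the path where the intrinsified/SIMD work would land) over a custom sorter when no comparator is needed. A trivial sketch:
   ```java
   import java.util.Arrays;

   public class SortExample {
     public static void main(String[] args) {
       int[] docIds = {42, 7, 19, 3, 28};
       // JDK dual-pivot quicksort for primitives; future JDKs may back this
       // with the Intel x86-simd-sort library, so callers benefit for free.
       Arrays.sort(docIds);
       System.out.println(Arrays.toString(docIds)); // [3, 7, 19, 28, 42]
     }
   }
   ```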





Re: [PR] speedup arm int functions? [lucene]

2023-11-01 Thread via GitHub


ChrisHegarty commented on PR #12743:
URL: https://github.com/apache/lucene/pull/12743#issuecomment-179191

   I see reasonable speedups on both x64 and ARM, but sadly see no vectorization in the disassembly.  The speed seems to come from the 2x instructions/pipelining (?) per strip-mined loop iteration, rather than any vectorization (which is not happening); see the sketch after the numbers below.  [ I included the Panama/Vector output just for comparison purposes ]
   
   Mac M2:
   
   main:
   ```
   VectorUtilBenchmark.binaryDotProductScalar1024  thrpt    5  3.151 ± 0.044  ops/us
   VectorUtilBenchmark.binaryDotProductVector1024  thrpt    5  7.112 ± 0.030  ops/us
   ```
   
   PR branch:
   ```
   VectorUtilBenchmark.binaryDotProductScalar1024  thrpt    5  4.722 ± 0.031  ops/us
   VectorUtilBenchmark.binaryDotProductVector1024  thrpt    5  7.127 ± 0.042  ops/us
   ```
   
   I see a reasonable speed up on Linux / x64. 
   
   main:
   ```
   VectorUtilBenchmark.binaryDotProductScalar1024  thrpt    5   2.121 ± 0.010  ops/us
   VectorUtilBenchmark.binaryDotProductVector1024  thrpt    5  21.341 ± 0.039  ops/us
   ```
   
   PR branch:
   ```
   VectorUtilBenchmark.binaryDotProductScalar1024  thrpt    5   2.799 ± 0.022  ops/us
   VectorUtilBenchmark.binaryDotProductVector1024  thrpt    5  21.337 ± 0.014  ops/us
   ```
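
   For illustration, a minimal sketch (not this PR's code) of the two-accumulator, strip-mined shape that can explain such a speedup without any vectorization:
   ```java
   public class DotUnroll {
     // Two independent accumulators break the single dependency chain, letting
     // the CPU overlap the multiply-adds per iteration even when it never
     // vectorizes the loop.
     static int dotProduct(byte[] a, byte[] b) {
       int sum0 = 0, sum1 = 0;
       int i = 0;
       for (; i + 1 < a.length; i += 2) {
         sum0 += a[i] * b[i];
         sum1 += a[i + 1] * b[i + 1];
       }
       if (i < a.length) {
         sum0 += a[i] * b[i]; // odd-length tail
       }
       return sum0 + sum1;
     }

     public static void main(String[] args) {
       byte[] a = {1, 2, 3}, b = {4, 5, 6};
       System.out.println(dotProduct(a, b)); // 32
     }
   }
   ```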
   





Re: [PR] speedup arm int functions? [lucene]

2023-11-01 Thread via GitHub


rmuir commented on PR #12743:
URL: https://github.com/apache/lucene/pull/12743#issuecomment-1788910222

   thank you for the explanation, so now it just adds another mystery for me.  Your x64 is still i5-11400?  I will look into this more.
   
   Yes on the ARM, the assembly didn't look like what I wanted, but inputs are `byte` and not `short`, which might be relevant. But performance improved, so that means we should investigate wtf is happening and try to take it to the limit :)





Re: [PR] speedup arm int functions? [lucene]

2023-11-01 Thread via GitHub


rmuir commented on PR #12743:
URL: https://github.com/apache/lucene/pull/12743#issuecomment-1788919960

   and what is the compiler's excuse this time? :) no floating point here. I should just be able to write a simple loop and get good performance!





Re: [PR] Use value-based LRU cache in NodeHash [lucene]

2023-11-01 Thread via GitHub


mikemccand commented on code in PR #12738:
URL: https://github.com/apache/lucene/pull/12738#discussion_r1378809307


##
lucene/core/src/java/org/apache/lucene/util/fst/NodeHash.java:
##
@@ -107,28 +121,43 @@ public long add(FSTCompiler.UnCompiledNode nodeIn) throws IOException {
   long node = primaryTable.get(pos);
   if (node == 0) {
 // node is not in primary table; is it in fallback table?
-node = getFallback(nodeIn, hash);
-if (node != 0) {
+NodeAddressAndLength addressAndLength = getFallback(nodeIn, hash);
+if (addressAndLength != null) {
+  node = addressAndLength.address;
   // it was already in fallback -- promote to primary
-  primaryTable.set(pos, node);
+  // TODO: Copy directly between 2 ByteBlockPool to avoid double-copy
+  primaryTable.set(pos, node, fallbackTable.getBytes(node, addressAndLength.length));
 } else {
   // not in fallback either -- freeze & add the incoming node
 
+  long startAddress = fstCompiler.bytes.getPosition();
   // freeze & add
   node = fstCompiler.addNode(nodeIn);
 
   // we use 0 as empty marker in hash table, so it better be impossible to get a frozen node
   // at 0:
-  assert node != 0;
+  assert node != FST.FINAL_END_NODE && node != FST.NON_FINAL_END_NODE;
+  byte[] buf = new byte[Math.toIntExact(node - startAddress + 1)];

Review Comment:
   Maybe add `// TODO` to change this to pool-to-pool copy to save the extra copy?  But we don't need to do this now ... low-priority opto.



##
lucene/core/src/java/org/apache/lucene/util/fst/NodeHash.java:
##
@@ -186,132 +214,99 @@ private long hash(FSTCompiler.UnCompiledNode node) {
 return h;
   }
 
-  // hash code for a frozen node.  this must precisely match the hash 
computation of an unfrozen
-  // node!
-  private long hash(long node) throws IOException {
-final int PRIME = 31;
-
-long h = 0;
-fstCompiler.fst.readFirstRealTargetArc(node, scratchArc, in);
-while (true) {
-  h = PRIME * h + scratchArc.label();
-  h = PRIME * h + (int) (scratchArc.target() ^ (scratchArc.target() >> 
32));
-  h = PRIME * h + scratchArc.output().hashCode();
-  h = PRIME * h + scratchArc.nextFinalOutput().hashCode();
-  if (scratchArc.isFinal()) {
-h += 17;
-  }
-  if (scratchArc.isLast()) {
-break;
-  }
-  fstCompiler.fst.readNextRealArc(scratchArc, in);
-}
-
-return h;
-  }
-
-  /**
-   * Compares an unfrozen node (UnCompiledNode) with a frozen node at byte 
location address (long),
-   * returning true if they are equal.
-   */
-  private boolean nodesEqual(FSTCompiler.UnCompiledNode node, long address) 
throws IOException {
-fstCompiler.fst.readFirstRealTargetArc(address, scratchArc, in);
-
-// fail fast for a node with fixed length arcs
-if (scratchArc.bytesPerArc() != 0) {
-  assert node.numArcs > 0;
-  // the frozen node uses fixed-with arc encoding (same number of bytes 
per arc), but may be
-  // sparse or dense
-  switch (scratchArc.nodeFlags()) {
-case FST.ARCS_FOR_BINARY_SEARCH:
-  // sparse
-  if (node.numArcs != scratchArc.numArcs()) {
-return false;
-  }
-  break;
-case FST.ARCS_FOR_DIRECT_ADDRESSING:
-  // dense -- compare both the number of labels allocated in the array 
(some of which may
-  // not actually be arcs), and the number of arcs
-  if ((node.arcs[node.numArcs - 1].label - node.arcs[0].label + 1) != 
scratchArc.numArcs()
-  || node.numArcs != FST.Arc.BitTable.countBits(scratchArc, in)) {
-return false;
-  }
-  break;
-default:
-  throw new AssertionError("unhandled scratchArc.nodeFlag() " + 
scratchArc.nodeFlags());
-  }
-}
-
-// compare arc by arc to see if there is a difference
-for (int arcUpto = 0; arcUpto < node.numArcs; arcUpto++) {
-  final FSTCompiler.Arc arc = node.arcs[arcUpto];
-  if (arc.label != scratchArc.label()
-  || arc.output.equals(scratchArc.output()) == false
-  || ((FSTCompiler.CompiledNode) arc.target).node != 
scratchArc.target()
-  || arc.nextFinalOutput.equals(scratchArc.nextFinalOutput()) == false
-  || arc.isFinal != scratchArc.isFinal()) {
-return false;
-  }
-
-  if (scratchArc.isLast()) {
-if (arcUpto == node.numArcs - 1) {
-  return true;
-} else {
-  return false;
-}
-  }
-
-  fstCompiler.fst.readNextRealArc(scratchArc, in);
-}
-
-// unfrozen node has fewer arcs than frozen node
-
-return false;
-  }
-
   /** Inner class because it needs access to hash function and FST bytes. */
   private class PagedGrowableHash {
-private PagedGrowableWriter entries;
+public long copiedBytes;
+// storing t

Re: [PR] speedup arm int functions? [lucene]

2023-11-01 Thread via GitHub


ChrisHegarty commented on PR #12743:
URL: https://github.com/apache/lucene/pull/12743#issuecomment-1789007292

   > Your x64 is still i5-11400 ? I will look into this more.
   
   Yes.   
   





Re: [PR] speedup arm int functions? [lucene]

2023-11-01 Thread via GitHub


ChrisHegarty commented on PR #12743:
URL: https://github.com/apache/lucene/pull/12743#issuecomment-1789014829

   jmh prof output from my Linux x64 Rocket Lake - https://gist.github.com/ChrisHegarty/508bb1857cb50df0d757f711c81fd740





Re: [PR] Use value-based LRU cache in NodeHash [lucene]

2023-11-01 Thread via GitHub


dungba88 commented on code in PR #12738:
URL: https://github.com/apache/lucene/pull/12738#discussion_r1378848941


##
lucene/core/src/java/org/apache/lucene/util/fst/NodeHash.java:
##
@@ -107,28 +121,43 @@ public long add(FSTCompiler.UnCompiledNode nodeIn) throws IOException {
   long node = primaryTable.get(pos);
   if (node == 0) {
 // node is not in primary table; is it in fallback table?
-node = getFallback(nodeIn, hash);
-if (node != 0) {
+NodeAddressAndLength addressAndLength = getFallback(nodeIn, hash);
+if (addressAndLength != null) {
+  node = addressAndLength.address;
   // it was already in fallback -- promote to primary
-  primaryTable.set(pos, node);
+  // TODO: Copy directly between 2 ByteBlockPool to avoid double-copy
+  primaryTable.set(pos, node, fallbackTable.getBytes(node, addressAndLength.length));
 } else {
   // not in fallback either -- freeze & add the incoming node
 
+  long startAddress = fstCompiler.bytes.getPosition();
   // freeze & add
   node = fstCompiler.addNode(nodeIn);
 
   // we use 0 as empty marker in hash table, so it better be impossible to get a frozen node
   // at 0:
-  assert node != 0;
+  assert node != FST.FINAL_END_NODE && node != FST.NON_FINAL_END_NODE;
+  byte[] buf = new byte[Math.toIntExact(node - startAddress + 1)];

Review Comment:
   I have a TODO in `set()` itself, which covers the case of writing from the BytesStore. But I could move it here instead.



##
lucene/core/src/java/org/apache/lucene/util/fst/NodeHash.java:
##
@@ -186,132 +214,99 @@ private long hash(FSTCompiler.UnCompiledNode node) {
 return h;
   }
 
-  // hash code for a frozen node.  this must precisely match the hash 
computation of an unfrozen
-  // node!
-  private long hash(long node) throws IOException {
-final int PRIME = 31;
-
-long h = 0;
-fstCompiler.fst.readFirstRealTargetArc(node, scratchArc, in);
-while (true) {
-  h = PRIME * h + scratchArc.label();
-  h = PRIME * h + (int) (scratchArc.target() ^ (scratchArc.target() >> 
32));
-  h = PRIME * h + scratchArc.output().hashCode();
-  h = PRIME * h + scratchArc.nextFinalOutput().hashCode();
-  if (scratchArc.isFinal()) {
-h += 17;
-  }
-  if (scratchArc.isLast()) {
-break;
-  }
-  fstCompiler.fst.readNextRealArc(scratchArc, in);
-}
-
-return h;
-  }
-
-  /**
-   * Compares an unfrozen node (UnCompiledNode) with a frozen node at byte 
location address (long),
-   * returning true if they are equal.
-   */
-  private boolean nodesEqual(FSTCompiler.UnCompiledNode node, long address) 
throws IOException {
-fstCompiler.fst.readFirstRealTargetArc(address, scratchArc, in);
-
-// fail fast for a node with fixed length arcs
-if (scratchArc.bytesPerArc() != 0) {
-  assert node.numArcs > 0;
-  // the frozen node uses fixed-with arc encoding (same number of bytes 
per arc), but may be
-  // sparse or dense
-  switch (scratchArc.nodeFlags()) {
-case FST.ARCS_FOR_BINARY_SEARCH:
-  // sparse
-  if (node.numArcs != scratchArc.numArcs()) {
-return false;
-  }
-  break;
-case FST.ARCS_FOR_DIRECT_ADDRESSING:
-  // dense -- compare both the number of labels allocated in the array 
(some of which may
-  // not actually be arcs), and the number of arcs
-  if ((node.arcs[node.numArcs - 1].label - node.arcs[0].label + 1) != 
scratchArc.numArcs()
-  || node.numArcs != FST.Arc.BitTable.countBits(scratchArc, in)) {
-return false;
-  }
-  break;
-default:
-  throw new AssertionError("unhandled scratchArc.nodeFlag() " + 
scratchArc.nodeFlags());
-  }
-}
-
-// compare arc by arc to see if there is a difference
-for (int arcUpto = 0; arcUpto < node.numArcs; arcUpto++) {
-  final FSTCompiler.Arc arc = node.arcs[arcUpto];
-  if (arc.label != scratchArc.label()
-  || arc.output.equals(scratchArc.output()) == false
-  || ((FSTCompiler.CompiledNode) arc.target).node != 
scratchArc.target()
-  || arc.nextFinalOutput.equals(scratchArc.nextFinalOutput()) == false
-  || arc.isFinal != scratchArc.isFinal()) {
-return false;
-  }
-
-  if (scratchArc.isLast()) {
-if (arcUpto == node.numArcs - 1) {
-  return true;
-} else {
-  return false;
-}
-  }
-
-  fstCompiler.fst.readNextRealArc(scratchArc, in);
-}
-
-// unfrozen node has fewer arcs than frozen node
-
-return false;
-  }
-
   /** Inner class because it needs access to hash function and FST bytes. */
   private class PagedGrowableHash {
-private PagedGrowableWriter entries;
+public long copiedBytes;
+// storing the FST node address where the position is the masked hash of the node arcs

Re: [PR] Use value-based LRU cache in NodeHash [lucene]

2023-11-01 Thread via GitHub


dungba88 commented on code in PR #12738:
URL: https://github.com/apache/lucene/pull/12738#discussion_r1378841828


##
lucene/core/src/java/org/apache/lucene/util/fst/NodeHash.java:
##
@@ -328,7 +323,128 @@ private void rehash(long lastNodeAddress) throws 
IOException {
   }
 
   mask = newMask;
-  entries = newEntries;
+  fstHashAddress = newEntries;
+
+  PagedGrowableWriter newCopiedOffsets =
+  new PagedGrowableWriter(
+  newSize, BLOCK_SIZE_BYTES, PackedInts.bitsRequired(copiedBytes), 
PackedInts.COMPACT);
+  PagedGrowableWriter newFSTOffsets =
+  new PagedGrowableWriter(
+  newSize,
+  BLOCK_SIZE_BYTES,
+  PackedInts.bitsRequired(lastNodeAddress),
+  PackedInts.COMPACT);
+  for (long idx = 0; idx < fstNodeAddress.size(); idx++) {
+long address = fstNodeAddress.get(idx);
+if (address != 0) {
+  long pos = Long.hashCode(address) & newMask;
+  while (true) {
+if (newFSTOffsets.get(pos) == 0) {
+  newFSTOffsets.set(pos, address);
+  newCopiedOffsets.set(pos, copiedNodeAddress.get(idx));
+  break;
+}
+
+pos = (pos + 1) & newMask;
+  }
+}
+  }
+
+  fstNodeAddress = newFSTOffsets;
+  copiedNodeAddress = newCopiedOffsets;
+}
+
+// hash code for a frozen node.  this must precisely match the hash 
computation of an unfrozen
+// node!
+private long hash(long node) throws IOException {
+  FST.BytesReader in = getBytesReader(node);
+
+  final int PRIME = 31;
+
+  long h = 0;
+  fstCompiler.fst.readFirstRealTargetArc(node, scratchArc, in);
+  while (true) {
+h = PRIME * h + scratchArc.label();
+h = PRIME * h + (int) (scratchArc.target() ^ (scratchArc.target() >> 
32));
+h = PRIME * h + scratchArc.output().hashCode();
+h = PRIME * h + scratchArc.nextFinalOutput().hashCode();
+if (scratchArc.isFinal()) {
+  h += 17;
+}
+if (scratchArc.isLast()) {
+  break;
+}
+fstCompiler.fst.readNextRealArc(scratchArc, in);
+  }
+
+  return h;
+}
+
+/**
+ * Compares an unfrozen node (UnCompiledNode) with a frozen node at byte 
location address
+ * (long), returning the local copiedNodes start address if the two nodes 
are matched, or -1
+ * otherwise
+ */
+private int getMatchedNodeLength(FSTCompiler.UnCompiledNode node, long 
address)

Review Comment:
   This is actually renamed from `nodesEqual` (it was removed), so there is no 
duplication. The old behavior is essentially `getMatchedNodeLength != -1`






Re: [PR] Use value-based LRU cache in NodeHash [lucene]

2023-11-01 Thread via GitHub


dungba88 commented on code in PR #12738:
URL: https://github.com/apache/lucene/pull/12738#discussion_r1378882326


##
lucene/core/src/java/org/apache/lucene/util/fst/NodeHash.java:
##
@@ -186,132 +214,99 @@ private long hash(FSTCompiler.UnCompiledNode node) {
 return h;
   }
 
-  // hash code for a frozen node.  this must precisely match the hash 
computation of an unfrozen
-  // node!
-  private long hash(long node) throws IOException {
-final int PRIME = 31;
-
-long h = 0;
-fstCompiler.fst.readFirstRealTargetArc(node, scratchArc, in);
-while (true) {
-  h = PRIME * h + scratchArc.label();
-  h = PRIME * h + (int) (scratchArc.target() ^ (scratchArc.target() >> 
32));
-  h = PRIME * h + scratchArc.output().hashCode();
-  h = PRIME * h + scratchArc.nextFinalOutput().hashCode();
-  if (scratchArc.isFinal()) {
-h += 17;
-  }
-  if (scratchArc.isLast()) {
-break;
-  }
-  fstCompiler.fst.readNextRealArc(scratchArc, in);
-}
-
-return h;
-  }
-
-  /**
-   * Compares an unfrozen node (UnCompiledNode) with a frozen node at byte 
location address (long),
-   * returning true if they are equal.
-   */
-  private boolean nodesEqual(FSTCompiler.UnCompiledNode node, long address) 
throws IOException {
-fstCompiler.fst.readFirstRealTargetArc(address, scratchArc, in);
-
-// fail fast for a node with fixed length arcs
-if (scratchArc.bytesPerArc() != 0) {
-  assert node.numArcs > 0;
-  // the frozen node uses fixed-width arc encoding (same number of bytes per arc), but may be
-  // sparse or dense
-  switch (scratchArc.nodeFlags()) {
-case FST.ARCS_FOR_BINARY_SEARCH:
-  // sparse
-  if (node.numArcs != scratchArc.numArcs()) {
-return false;
-  }
-  break;
-case FST.ARCS_FOR_DIRECT_ADDRESSING:
-  // dense -- compare both the number of labels allocated in the array 
(some of which may
-  // not actually be arcs), and the number of arcs
-  if ((node.arcs[node.numArcs - 1].label - node.arcs[0].label + 1) != 
scratchArc.numArcs()
-  || node.numArcs != FST.Arc.BitTable.countBits(scratchArc, in)) {
-return false;
-  }
-  break;
-default:
-  throw new AssertionError("unhandled scratchArc.nodeFlag() " + 
scratchArc.nodeFlags());
-  }
-}
-
-// compare arc by arc to see if there is a difference
-for (int arcUpto = 0; arcUpto < node.numArcs; arcUpto++) {
-  final FSTCompiler.Arc arc = node.arcs[arcUpto];
-  if (arc.label != scratchArc.label()
-  || arc.output.equals(scratchArc.output()) == false
-  || ((FSTCompiler.CompiledNode) arc.target).node != 
scratchArc.target()
-  || arc.nextFinalOutput.equals(scratchArc.nextFinalOutput()) == false
-  || arc.isFinal != scratchArc.isFinal()) {
-return false;
-  }
-
-  if (scratchArc.isLast()) {
-if (arcUpto == node.numArcs - 1) {
-  return true;
-} else {
-  return false;
-}
-  }
-
-  fstCompiler.fst.readNextRealArc(scratchArc, in);
-}
-
-// unfrozen node has fewer arcs than frozen node
-
-return false;
-  }
-
   /** Inner class because it needs access to hash function and FST bytes. */
   private class PagedGrowableHash {
-private PagedGrowableWriter entries;
+public long copiedBytes;
+// storing the FST node address where the position is the masked hash of 
the node arcs

Review Comment:
   > I was thinking we could do this instead with two PagedGrowableWriter, both 
of which being addressed by the hash(fstNodeAddress) % mask?
   
   It's a bit messy as that means we need to know the masked hash of the node 
when computing the hash. So I put that masked hash (computed using the unfrozen 
nodes) into the hash function.






[I] Should we not enlarge PagedGrowableWriter initial bitPerValue on NodeHash.rehash() [lucene]

2023-11-01 Thread via GitHub


dungba88 opened a new issue, #12744:
URL: https://github.com/apache/lucene/issues/12744

   ### Description
   
   Spawned from https://github.com/apache/lucene/pull/12738
   
   It seems that on [rehash](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/util/fst/NodeHash.java#L310C1-L310C56) we use an initial bitPerValue sized to cover the last node address.
   
   Does that mean every value, including the ones with low addresses, will use the same bpv as the high-address nodes? PagedGrowableWriter already enlarges the bpv [automatically](https://github.com/apache/lucene/blob/8fa0de2743e87dd264619632978c8dc68323947a/lucene/core/src/java/org/apache/lucene/util/packed/GrowableWriter.java#L86), so maybe we could always set the initial bpv to a small value (like 8?).
   
   Let's see if the total size is different when making the change.
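
   For illustration, a small self-contained sketch of the two strategies (the address and page size here are hypothetical, not taken from the patch):

   ```java
   import org.apache.lucene.util.packed.PackedInts;
   import org.apache.lucene.util.packed.PagedGrowableWriter;

   public class InitialBpvSketch {
     public static void main(String[] args) {
       long lastNodeAddress = 1L << 30; // hypothetical high FST node address

       // current approach: every slot is sized for the largest address up front
       int upfrontBpv = PackedInts.bitsRequired(lastNodeAddress); // 31 bits

       // proposed approach: start at a small bpv and let the writer grow it
       PagedGrowableWriter writer =
           new PagedGrowableWriter(1024, 1 << 27, 8, PackedInts.COMPACT);
       writer.set(0, lastNodeAddress); // bpv is enlarged automatically on write

       System.out.println("upfront bpv = " + upfrontBpv);
     }
   }
   ```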
   
   ### Version and environment details
   
   _No response_





Re: [I] Use max BPV encoding in postings if doc buffer size less than ForUtil.BLOCK_SIZE [lucene]

2023-11-01 Thread via GitHub


easyice commented on issue #12717:
URL: https://github.com/apache/lucene/issues/12717#issuecomment-1789106681

   Oh, group-varint is an interesting encoder; I'd love to try it later in the week.





Re: [PR] StringsToAutomaton#build to take List as parameter instead of Collection [lucene]

2023-11-01 Thread via GitHub


benwtrent commented on PR #12427:
URL: https://github.com/apache/lucene/pull/12427#issuecomment-1789218896

   Something weird is happening. This commit is causing the following failure:
   ```
   ./gradlew test --tests 
TestUnifiedHighlighterTermIntervals.testCustomFieldValueSource 
-Dtests.seed=9E4091CE5D400BA3
   ```
   
   I am starting to investigate, but folks involved with this change might find 
the issue quicker.
   
   ```
 > java.lang.NoSuchMethodError: 
'org.apache.lucene.util.automaton.Automaton 
org.apache.lucene.util.automaton.Automata.makeStringUnion(java.util.Collection)'
  > at 
org.apache.lucene.search.uhighlight.CharArrayMatcher.fromTerms(CharArrayMatcher.java:42)
  > at 
org.apache.lucene.search.uhighlight.MemoryIndexOffsetStrategy.buildCombinedAutomaton(MemoryIndexOffsetStrategy.java:71)
  > at 
org.apache.lucene.search.uhighlight.MemoryIndexOffsetStrategy.(MemoryIndexOffsetStrategy.java:53)
  > at 
org.apache.lucene.search.uhighlight.UnifiedHighlighter.getOffsetStrategy(UnifiedHighlighter.java:1238)
  > at 
org.apache.lucene.search.uhighlight.UnifiedHighlighter.getFieldHighlighter(UnifiedHighlighter.java:1072)
  > at 
org.apache.lucene.search.uhighlight.UnifiedHighlighter.highlightFieldsAsObjects(UnifiedHighlighter.java:880)
  > at 
org.apache.lucene.search.uhighlight.UnifiedHighlighter.highlightFields(UnifiedHighlighter.java:814)
  > at 
org.apache.lucene.search.uhighlight.UnifiedHighlighter.highlightFields(UnifiedHighlighter.java:792)
  > at 
org.apache.lucene.search.uhighlight.UnifiedHighlighter.highlight(UnifiedHighlighter.java:725)
  > at 
org.apache.lucene.search.uhighlight.TestUnifiedHighlighterTermIntervals.testCustomFieldValueSource(TestUnifiedHighlighterTermIntervals.java:607)
  > at 
java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
  > at java.base/java.lang.reflect.Method.invoke(Method.java:580)
  > at 
randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
  > at 
randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
  > at 
randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
  > at 
randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
  > at 
org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
  > at 
org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  > at 
org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
  > at 
org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
  > at 
org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
  > at 
junit@4.13.1/org.junit.rules.RunRules.evaluate(RunRules.java:20)
  > at 
randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  > at 
randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
  > at 
randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
  > at 
randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
  > at 
randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
  > at 
randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
  > at 
randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
  > at 
randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
  > at 
org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  > at 
randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.rules.StatementAdapter.e

Re: [PR] speedup arm int functions? [lucene]

2023-11-01 Thread via GitHub


rmuir commented on PR #12743:
URL: https://github.com/apache/lucene/pull/12743#issuecomment-1789229323

   Thanks @ChrisHegarty! I will try to dig into this more. Again, I don't see speedups on my x86, so something has changed (maybe architecture) that's allowing more to happen in parallel on your CPU or something. WikiChip has been down, so I don't know any answers yet, nor whether this can apply to the vector case.





Re: [PR] StringsToAutomaton#build to take List as parameter instead of Collection [lucene]

2023-11-01 Thread via GitHub


gsmiller commented on PR #12427:
URL: https://github.com/apache/lucene/pull/12427#issuecomment-1789264349

   @benwtrent is this an issue of the code only being partially rebuilt? The signature for `Automata#makeStringUnion` was changed in this PR to accept a more general `Iterable` instead of a `Collection`. Let me see if I can reproduce it locally as well.
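
   To make the failure mode concrete, here is a toy sketch (hypothetical classes, not the Lucene sources) of why stale bytecode breaks until a clean rebuild:

   ```java
   import java.util.Collection;
   import java.util.List;

   // A caller compiled against the Collection overload embeds that exact method
   // descriptor in its bytecode; swap in a callee that only declares an Iterable
   // overload and the call site throws NoSuchMethodError until it is recompiled.
   class OldAutomata { static String makeStringUnion(Collection<String> c) { return "old"; } }
   class NewAutomata { static String makeStringUnion(Iterable<String> c) { return "new"; } }

   public class LinkageSketch {
     public static void main(String[] args) {
       // source-compatible: a List satisfies both signatures when recompiled
       System.out.println(NewAutomata.makeStringUnion(List.of("a")));
     }
   }
   ```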





Re: [PR] StringsToAutomaton#build to take List as parameter instead of Collection [lucene]

2023-11-01 Thread via GitHub


benwtrent commented on PR #12427:
URL: https://github.com/apache/lucene/pull/12427#issuecomment-1789266991

   @gsmiller you are 100% correct 🤦 





Re: [PR] StringsToAutomaton#build to take List as parameter instead of Collection [lucene]

2023-11-01 Thread via GitHub


gsmiller commented on PR #12427:
URL: https://github.com/apache/lucene/pull/12427#issuecomment-1789348091

   @benwtrent glad it was a simple fix. Sorry it created churn!





Re: [I] byte to int in TruncateTokenFilterFactory to TruncateTokenFilter [lucene]

2023-11-01 Thread via GitHub


asubbu90 commented on issue #12449:
URL: https://github.com/apache/lucene/issues/12449#issuecomment-1789370514

   Hi @robro612, as you can see, I have already opened PR #12507 for this issue. Do you want more context on it?





Re: [PR] Speed up vectorutil float scalar methods, unroll properly, use fma where possible [lucene]

2023-11-01 Thread via GitHub


uschindler commented on PR #12737:
URL: https://github.com/apache/lucene/pull/12737#issuecomment-1789483634

   Hi @rmuir,
   I am fine with both approaches. Let me just create a new PR with some changes for the Constants.java class and a separate package-private utility class for looking up JVM args. This would allow us to handle and log a warning if, for some reason, the jdk.management module is not available. RAMUsageEstimator has that code and I will refactor it into a separate class.
   
   I will also move the current lookup of the Client VM via sysprop to Constants.java (a constant like `IS_CLIENT_VM`), so we can refer to it at runtime to figure out if the JVM has C2 available. The guards like security exceptions and all that stuff will be handled there, too. The checks in Panama and in this PR can then all be done centrally, so you do not need the crazy reflective code. I plan to add a `static Optional getVMOption(String name)` which returns an empty optional if the option does not exist or the module system / security manager hides it.
   
   After that, the logic in Constants.java can be used here and this PR gets easier. I agree to merge it afterwards, after adding the C2 check.
   
   All fine then? I don't want to argue here about the pros and cons of enabling FMA by default. My standpoint is just different. We have a lot of code in Lucene which may run slow if you choose the wrong garbage collector, a buggy Java version, or a bad Hotspot/CPU combination. What happens with our scalar code or PFOR if you run it on a Core i3 from 2012?!? We can't prevent all slowdowns. If people see slowdowns on AMD or ARM, they have the "easy option" to disable it by passing Hotspot arguments. This can easily be stated in the documentation of Solr or Elasticsearch. People that care about speed need to tune anyway.
   
   My problem here is that you add another layer of optimization settings that conflicts with the knobs the end user already has. If you disable FMA by default on ARM or AMD, there must be an option to turn it on (e.g., a sysprop).
   
   About the benchmarks: there is a slight slowdown and a slight speedup on my Opteron. The stddev is not too large, but yes, it could go in both directions. Unfortunately I cannot easily test it, because this PR blocks the option for me to forcefully enable FMA. And that's my big problem: that's not future proof. Why do you want to block people using Lucene 9.9 from still using it in 2 years, when maybe all modern-generation AMD CPUs are fast?
   
   It is not Lucene's responsibility to question defaults set by the JDK. If FMA is so bad on some CPUs (as you argue), why does Hotspot not disable it by default? Please leave the option open to the user. If Elasticsearch is afraid of slowing down their users, they can add `-XX:-UseFMA` to their standard jvm-options.properties file shipped with their distribution. This allows users to turn it on/off.
   
   I agree that we should not use Panama or FMA if C2 is completely disabled. But for everything else we have the correct knobs. This is why I am strongly -1 on disabling FMA on AMD (and also ARM) by default without the possibility to turn it back on.





Re: [PR] StringsToAutomaton#build to take List as parameter instead of Collection [lucene]

2023-11-01 Thread via GitHub


dweiss commented on PR #12427:
URL: https://github.com/apache/lucene/pull/12427#issuecomment-1789519097

   See #12742 - it's a problem with our javac task settings. Running a clean build is a workaround. I will provide a fix for this when I get a chance; it's not a critical issue (but it is a bug in how we currently set up task dependencies).





[PR] Fix javac task inputs so that they include modular dependencies #12742 [lucene]

2023-11-01 Thread via GitHub


dweiss opened a new pull request, #12745:
URL: https://github.com/apache/lucene/pull/12745

   Fixes #12742.





Re: [I] JavaCompile tasks may be in up-to-date state when modular dependencies have changed leading to odd runtime errors [lucene]

2023-11-01 Thread via GitHub


dweiss commented on issue #12742:
URL: https://github.com/apache/lucene/issues/12742#issuecomment-1789587275

   I've provided a PR that fixes this. It is a corner case of us providing custom javac parameters (a modular classpath). I am surprised this took so long to be discovered, and I apologize - it's an oversight - we did have a dependsOn clause, but that's different from task inputs (dependsOn is more of a task-graph scheduling hint; inputs is a task _status_ hint).
   
   I've checked the PR by running Hoss's repro:
   ```
   git checkout 11436a848cbcc8302b31ca01e63e57dd35a93b1e
   ./gradlew clean classes testClasses
   git checkout [head of PR branch]
   ./gradlew test --tests TestFlattenGraphFilter
   ```
   Everything works and the code is recompiled as it should be now.





Re: [PR] Adding new flat vector format and refactoring HNSW [lucene]

2023-11-01 Thread via GitHub


jpountz commented on code in PR #12729:
URL: https://github.com/apache/lucene/pull/12729#discussion_r1379282330


##
lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99HnswVectorsReader.java:
##
@@ -399,41 +281,30 @@ private HnswGraph getGraph(FieldEntry entry) throws 
IOException {
 
   @Override
   public void close() throws IOException {
-IOUtils.close(vectorData, vectorIndex, quantizedVectorData);
+IOUtils.close(flatVectorsReader, vectorIndex);
   }
 
   @Override
-  public OffHeapQuantizedByteVectorValues getQuantizedVectorValues(String 
field)
-  throws IOException {
-FieldEntry fieldEntry = fields.get(field);
-if (fieldEntry == null || fieldEntry.hasQuantizedVectors() == false) {
-  return null;
+  public QuantizedByteVectorValues getQuantizedVectorValues(String field) 
throws IOException {
+if (flatVectorsReader instanceof QuantizedVectorsReader) {
+  return ((QuantizedVectorsReader) 
flatVectorsReader).getQuantizedVectorValues(field);

Review Comment:
   Bump: this sounds like a good addition to the flat vectors reader?






[I] TestDeletionPolicy.testOpenPriorSnapshot failing [lucene]

2023-11-01 Thread via GitHub


benwtrent opened a new issue, #12746:
URL: https://github.com/apache/lucene/issues/12746

   ### Description
   
   TestDeletionPolicy.testOpenPriorSnapshot fails. The assertion fails because it assumes we are on a previous commit with more than one leaf.
   
   Fails on main and 9.x.
   
   
   
stack trace:
   
   ```
   org.apache.lucene.index.TestDeletionPolicy > testOpenPriorSnapshot FAILED
   java.lang.AssertionError
   at 
__randomizedtesting.SeedInfo.seed([5B24813C1DE35AA2:32FD48B202239F7A]:0)
   at org.junit.Assert.fail(Assert.java:87)
   at org.junit.Assert.assertTrue(Assert.java:42)
   at org.junit.Assert.assertTrue(Assert.java:53)
   at 
org.apache.lucene.index.TestDeletionPolicy.testOpenPriorSnapshot(TestDeletionPolicy.java:524)
   at 
java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
   at java.base/java.lang.reflect.Method.invoke(Method.java:580)
   at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
   at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
   at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
   at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
   at 
org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
   at 
org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
   at 
org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
   at 
org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
   at 
org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
   at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
   at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
   at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
   at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
   at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
   at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
   at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
   at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
   at 
org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
   at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
   at 
org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
   at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
   at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
   at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
   at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
   at 
org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
   at 
org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
   at 
org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
   at 
org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
   at 
org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
   at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
   at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
   at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:850)
   at java.base/java.lang.Thread.run(Thread.java:1583)
   ```
   
   
   
   ### Gradle command to reproduce
   
   ```
   ./gradlew :lucene:core:test --tests 
"org.apache.lucene.index.TestDeletionPolicy.testOpenPriorSnapshot" 
-Ptests.jvms=12 -P

Re: [I] TestDeletionPolicy.testOpenPriorSnapshot failing [lucene]

2023-11-01 Thread via GitHub


benwtrent commented on issue #12746:
URL: https://github.com/apache/lucene/issues/12746#issuecomment-1789639897

   I verified this doesn't happen in 9.8.





Re: [I] TestDeletionPolicy.testOpenPriorSnapshot failing [lucene]

2023-11-01 Thread via GitHub


jpountz closed issue #12746: TestDeletionPolicy.testOpenPriorSnapshot failing
URL: https://github.com/apache/lucene/issues/12746





Re: [I] TestDeletionPolicy.testOpenPriorSnapshot failing [lucene]

2023-11-01 Thread via GitHub


jpountz commented on issue #12746:
URL: https://github.com/apache/lucene/issues/12746#issuecomment-1789651692

   Thanks for reporting, I pushed a fix at 
https://github.com/apache/lucene/commit/66324f763fc7fb0d8e7cd6f334e5438f0171c84e.





Re: [PR] Speed up disjunctions by computing estimations of the score of the k-th top hit up-front. [lucene]

2023-11-01 Thread via GitHub


jpountz commented on PR #12526:
URL: https://github.com/apache/lucene/pull/12526#issuecomment-1789662266

   @mikemccand FYI, I took a shot at adding some interesting boolean queries to the nightly benchmarks at https://github.com/mikemccand/luceneutil/pull/240.





Re: [PR] Clean up UnCompiledNode.inputCount [lucene]

2023-11-01 Thread via GitHub


dungba88 closed pull request #12735: Clean up UnCompiledNode.inputCount
URL: https://github.com/apache/lucene/pull/12735





Re: [PR] Use value-based LRU cache in NodeHash [lucene]

2023-11-01 Thread via GitHub


dungba88 commented on code in PR #12738:
URL: https://github.com/apache/lucene/pull/12738#discussion_r1378882326


##
lucene/core/src/java/org/apache/lucene/util/fst/NodeHash.java:
##
@@ -186,132 +214,99 @@ private long hash(FSTCompiler.UnCompiledNode node) {
 return h;
   }
 
-  // hash code for a frozen node.  this must precisely match the hash 
computation of an unfrozen
-  // node!
-  private long hash(long node) throws IOException {
-final int PRIME = 31;
-
-long h = 0;
-fstCompiler.fst.readFirstRealTargetArc(node, scratchArc, in);
-while (true) {
-  h = PRIME * h + scratchArc.label();
-  h = PRIME * h + (int) (scratchArc.target() ^ (scratchArc.target() >> 
32));
-  h = PRIME * h + scratchArc.output().hashCode();
-  h = PRIME * h + scratchArc.nextFinalOutput().hashCode();
-  if (scratchArc.isFinal()) {
-h += 17;
-  }
-  if (scratchArc.isLast()) {
-break;
-  }
-  fstCompiler.fst.readNextRealArc(scratchArc, in);
-}
-
-return h;
-  }
-
-  /**
-   * Compares an unfrozen node (UnCompiledNode) with a frozen node at byte 
location address (long),
-   * returning true if they are equal.
-   */
-  private boolean nodesEqual(FSTCompiler.UnCompiledNode node, long address) 
throws IOException {
-fstCompiler.fst.readFirstRealTargetArc(address, scratchArc, in);
-
-// fail fast for a node with fixed length arcs
-if (scratchArc.bytesPerArc() != 0) {
-  assert node.numArcs > 0;
-  // the frozen node uses fixed-width arc encoding (same number of bytes per arc), but may be
-  // sparse or dense
-  switch (scratchArc.nodeFlags()) {
-case FST.ARCS_FOR_BINARY_SEARCH:
-  // sparse
-  if (node.numArcs != scratchArc.numArcs()) {
-return false;
-  }
-  break;
-case FST.ARCS_FOR_DIRECT_ADDRESSING:
-  // dense -- compare both the number of labels allocated in the array 
(some of which may
-  // not actually be arcs), and the number of arcs
-  if ((node.arcs[node.numArcs - 1].label - node.arcs[0].label + 1) != 
scratchArc.numArcs()
-  || node.numArcs != FST.Arc.BitTable.countBits(scratchArc, in)) {
-return false;
-  }
-  break;
-default:
-  throw new AssertionError("unhandled scratchArc.nodeFlag() " + 
scratchArc.nodeFlags());
-  }
-}
-
-// compare arc by arc to see if there is a difference
-for (int arcUpto = 0; arcUpto < node.numArcs; arcUpto++) {
-  final FSTCompiler.Arc arc = node.arcs[arcUpto];
-  if (arc.label != scratchArc.label()
-  || arc.output.equals(scratchArc.output()) == false
-  || ((FSTCompiler.CompiledNode) arc.target).node != 
scratchArc.target()
-  || arc.nextFinalOutput.equals(scratchArc.nextFinalOutput()) == false
-  || arc.isFinal != scratchArc.isFinal()) {
-return false;
-  }
-
-  if (scratchArc.isLast()) {
-if (arcUpto == node.numArcs - 1) {
-  return true;
-} else {
-  return false;
-}
-  }
-
-  fstCompiler.fst.readNextRealArc(scratchArc, in);
-}
-
-// unfrozen node has fewer arcs than frozen node
-
-return false;
-  }
-
   /** Inner class because it needs access to hash function and FST bytes. */
   private class PagedGrowableHash {
-private PagedGrowableWriter entries;
+public long copiedBytes;
+// storing the FST node address where the position is the masked hash of 
the node arcs

Review Comment:
   > I was thinking we could do this instead with two PagedGrowableWriter, both 
of which being addressed by the hash(fstNodeAddress) % mask?
   
   It's a bit tricky as that means we need to know the masked hash of the node when computing the hash. So I put that masked hash (computed using the unfrozen nodes) into the hash function. A bit messy, but it removes one hash table.
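
   Schematically, mirroring the rehash loop quoted above (field names from the patch; the lookup is paraphrased, not copied):

   ```java
   // Both writers share one open-addressing slot, so a single probe sequence
   // yields the FST node address and its copied-bytes address together.
   long pos = Long.hashCode(address) & mask;   // same slot for both tables
   while (fstNodeAddress.get(pos) != 0 && fstNodeAddress.get(pos) != address) {
     pos = (pos + 1) & mask;                   // linear probing on collision
   }
   long copied = copiedNodeAddress.get(pos);   // parallel value at the same slot
   ```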






Re: [PR] Speed up vectorutil float scalar methods, unroll properly, use fma where possible [lucene]

2023-11-01 Thread via GitHub


rmuir commented on PR #12737:
URL: https://github.com/apache/lucene/pull/12737#issuecomment-1789905123

   OK, sounds good to me @uschindler. I agree the user should be in control. And I'd like to avoid us adding tons of flags for this stuff unless we need it for testing; that's just my opinion. So I really like respecting JVM flags for stuff like UseFMA too.
   
   But the existing UseFMA has bad defaults in the JDK; that is my problem. Sure, you get a little extra precision "for free" if you are doing trivial stuff in a serial loop, but if you are trying to max out performance, it only makes sense for Intel CPUs from what I've seen so far. When trying to max out throughput on Intel, you need FMA on. When it's AMD or ARM NEON, you need to use mul+add to keep everything busier.
   
   And I don't recommend any user TURN OFF UseFMA for any reason, ever. Chances are they are doing more than just using the Lucene library in their application, and this could have disastrous consequences and slowdowns. So I agree with a specific sysprop just for this case. It is a unique situation.
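
   For concreteness, the two accumulation styles being weighed look roughly like this (a toy scalar dot product, not the actual VectorUtil code):

   ```java
   // Math.fma is a single fused multiply-add: one rounding and a bit more
   // precision, but a longer-latency dependency chain per loop iteration.
   static float dotFma(float[] a, float[] b) {
     float acc = 0f;
     for (int i = 0; i < a.length; i++) {
       acc = Math.fma(a[i], b[i], acc);
     }
     return acc;
   }

   // Separate mul + add: two roundings, but the multiply can overlap the add,
   // which is what reportedly keeps AMD and ARM NEON execution units busier.
   static float dotMulAdd(float[] a, float[] b) {
     float acc = 0f;
     for (int i = 0; i < a.length; i++) {
       acc += a[i] * b[i];
     }
     return acc;
   }
   ```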





[PR] stabilize vectorutil benchmark [lucene]

2023-11-01 Thread via GitHub


rmuir opened a new pull request, #12747:
URL: https://github.com/apache/lucene/pull/12747

   This benchmark is too noisy across forks, which makes comparisons impossible and misleading, especially for the vectorized float methods on some machines: there it is almost useless.
   
   I spent some time reducing the noise; unfortunately it results in the benchmark being 2x slower. We could use tricky JVM args, but what is best is to just do many forks.
   
   I now see reasonably stable results on Graviton3E with 256-bit ARM SVE (!) and look forward to actually investigating this thing beyond just stabilizing the benchmark :)
   ```
   Benchmark   (size)   Mode  Cnt   Score
Error   Units
   VectorUtilBenchmark.binaryCosineScalar1024  thrpt   15   0.842 ±  
0.001  ops/us
   VectorUtilBenchmark.binaryCosineVector1024  thrpt   15   4.811 ±  
0.006  ops/us
   VectorUtilBenchmark.binaryDotProductScalar1024  thrpt   15   2.370 ±  
0.001  ops/us
   VectorUtilBenchmark.binaryDotProductVector1024  thrpt   15   8.037 ±  
0.001  ops/us
   VectorUtilBenchmark.binarySquareScalar1024  thrpt   15   2.450 ±  
0.034  ops/us
   VectorUtilBenchmark.binarySquareVector1024  thrpt   15   6.703 ±  
0.010  ops/us
   VectorUtilBenchmark.floatCosineScalar 1024  thrpt   15   0.682 ±  
0.001  ops/us
   VectorUtilBenchmark.floatCosineVector 1024  thrpt   75   5.498 ±  
0.004  ops/us
   VectorUtilBenchmark.floatDotProductScalar 1024  thrpt   15   2.409 ±  
0.013  ops/us
   VectorUtilBenchmark.floatDotProductVector 1024  thrpt   75  12.446 ±  
0.343  ops/us
   VectorUtilBenchmark.floatSquareScalar 1024  thrpt   15   2.172 ±  
0.007  ops/us
   VectorUtilBenchmark.floatSquareVector 1024  thrpt   75  11.655 ±  
0.152  ops/us
   ```





Re: [PR] stabilize vectorutil benchmark [lucene]

2023-11-01 Thread via GitHub


rmuir commented on PR #12747:
URL: https://github.com/apache/lucene/pull/12747#issuecomment-1790041666

   And I know it is annoying that it runs slower; I really tried not to be overkill, but some methods here are super simple and super stable while some are unstable. So I gave all methods a minimum of 3 forks for safety: this way, if there is noise between forks, we actually see it (!). Those vectorized float methods get 15 forks for now.
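
   Schematically, in JMH terms (a sketch of the fork settings, not the actual benchmark class):

   ```java
   import org.openjdk.jmh.annotations.Benchmark;
   import org.openjdk.jmh.annotations.Fork;

   public class ForkSettingsSketch {
     @Benchmark
     @Fork(3) // stable methods: 3 forks is enough to expose fork-to-fork noise
     public float stableScalarMethod() {
       return 42f; // placeholder body
     }

     @Benchmark
     @Fork(15) // noisy vectorized float methods: average over many more forks
     public float noisyVectorFloatMethod() {
       return 42f; // placeholder body
     }
   }
   ```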





Re: [PR] Speed up vectorutil float scalar methods, unroll properly, use fma where possible [lucene]

2023-11-01 Thread via GitHub


rmuir commented on PR #12737:
URL: https://github.com/apache/lucene/pull/12737#issuecomment-1790097686

   ARM Graviton3E (256-bit SVE)
   
   Master (well #12747 for sanity):
   ```
   Benchmark   (size)   Mode  Cnt   Score   
Error   Units
   VectorUtilBenchmark.floatCosineScalar 1024  thrpt   15   0.682 ±  
0.001  ops/us
   VectorUtilBenchmark.floatCosineVector 1024  thrpt   75   5.498 ±  
0.004  ops/us
   VectorUtilBenchmark.floatDotProductScalar 1024  thrpt   15   2.409 ±  
0.013  ops/us
   VectorUtilBenchmark.floatDotProductVector 1024  thrpt   75  12.446 ±  
0.343  ops/us
   VectorUtilBenchmark.floatSquareScalar 1024  thrpt   15   2.172 ±  
0.007  ops/us
   VectorUtilBenchmark.floatSquareVector 1024  thrpt   75  11.655 ±  
0.152  ops/us
   ```
   
   Patch
   ```
   Benchmark   (size)   Mode  Cnt   Score   
Error   Units
   VectorUtilBenchmark.floatCosineScalar 1024  thrpt   15   1.423 ± 
0.001  ops/us
   VectorUtilBenchmark.floatCosineVector 1024  thrpt   75   8.882 ± 
0.052  ops/us
   VectorUtilBenchmark.floatDotProductScalar 1024  thrpt   15   3.751 ± 
0.006  ops/us
   VectorUtilBenchmark.floatDotProductVector 1024  thrpt   75  13.033 ± 
0.056  ops/us
   VectorUtilBenchmark.floatSquareScalar 1024  thrpt   15   3.201 ± 
0.014  ops/us
   VectorUtilBenchmark.floatSquareVector 1024  thrpt   75  12.943 ± 
0.250  ops/us
   ```





Re: [PR] StringsToAutomaton#build to take List as parameter instead of Collection [lucene]

2023-11-01 Thread via GitHub


shubhamvishu commented on PR #12427:
URL: https://github.com/apache/lucene/pull/12427#issuecomment-1790129043

   Nice! I like how we discovered a build bug (which had been hidden for some time now, I think) and fixed it. Thanks for solving it, @dweiss!

