[GitHub] [lucene] abhirkz commented on issue #9883: Add IndexSearcher Level Query Max Clause Count Limits [LUCENE-8839]

2023-06-12 Thread via GitHub


abhirkz commented on issue #9883:
URL: https://github.com/apache/lucene/issues/9883#issuecomment-1586719249

   Any update on this ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] uschindler commented on pull request #12281: Add checks in KNNVectorField / KNNVectorQuery to only allow non-null, non-empty and finite vectors

2023-06-12 Thread via GitHub


uschindler commented on PR #12281:
URL: https://github.com/apache/lucene/pull/12281#issuecomment-1586750531

   Hi,
   so this looks like 2 different issues:
   - incorrect validation of query and indexed field contents.
   - some vectors cause NaN when the cosine is calculated. I am not sure how 
this can happen, because cosine should only be NaN when the argument of the 
square root is negative or NaN. To me this indeed looks like a rounding error. 
Maybe we should guard the cosine to not allow negative numbers. This can only 
happen for special cases, so maybe `Math.abs()` is fine to work around. 
Actually the cosine should never be outside `[-1 .. 1]`.
   
   I think instead of throwing an exception (which would be crazy to the end 
user we should make sure that due to rounding errors the argument of the cosine 
cannot get negative.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] uschindler commented on pull request #12281: Add checks in KNNVectorField / KNNVectorQuery to only allow non-null, non-empty and finite vectors

2023-06-12 Thread via GitHub


uschindler commented on PR #12281:
URL: https://github.com/apache/lucene/pull/12281#issuecomment-1586774660

   I think I know why it happens, in both providers we calculate the cosine 
like that and when applying the sqrt it gets negative for one of those reasons:
   
https://github.com/apache/lucene/blob/ef35e6edf4d38d88603b6ca2540f53913c5d9c2d/lucene/core/src/java/org/apache/lucene/util/VectorUtilDefaultProvider.java#L105
   
   - norm1 or norm2 is negative
   - we don't cast both factors to double before multiplication, so it 
overflows the exponent of the float
   
   The other similarity functions are not affected, but we can possibly add 
some guard her by casting both floats to double before multiplying them.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] uschindler commented on pull request #12281: Add checks in KNNVectorField / KNNVectorQuery to only allow non-null, non-empty and finite vectors

2023-06-12 Thread via GitHub


uschindler commented on PR #12281:
URL: https://github.com/apache/lucene/pull/12281#issuecomment-1586803881

   I debugged through it: The vector causes `norm1` and `norm2`, as well as 
`sum` to get `Infinity`. `Infinity/Infinity` results in `NaN`.
   
   So it is not caused by sqrt. In general you are right this might happen with 
any vector if the exponent overflows while summing up the component squares.
   
   I am not sure how to work around that. Maybe it should return 1, if the sqrt 
gets infinity?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] uschindler commented on pull request #12281: Add checks in KNNVectorField / KNNVectorQuery to only allow non-null, non-empty and finite vectors

2023-06-12 Thread via GitHub


uschindler commented on PR #12281:
URL: https://github.com/apache/lucene/pull/12281#issuecomment-1587138822

   I think the related issue found with some vectors creating an NaN cosine 
(happens when the floats are too large by exponent and the result gets infinity 
after multiplication) is a separate one. I think we should open an issue.
   
   This vector causes the assert to trigger:
   ```
 public void testCosineNaN() {
   final float[] v = new float[] { 1.E31f };
   assertEquals(1f, VectorUtil.cosine(v, v), DELTA);
 }
   ```
   
   I figured out that I did not check explicitely for "empty" vectors. I may 
add a test for that.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] uschindler commented on pull request #12281: Add checks in KNNVectorField / KNNVectorQuery to only allow non-null, non-empty and finite vectors

2023-06-12 Thread via GitHub


uschindler commented on PR #12281:
URL: https://github.com/apache/lucene/pull/12281#issuecomment-1587147803

   I think we could decide to also disallow such vectors. If the square of one 
its components gets infinity it should maybe also be rejected. What do you 
think?
   
   ```java
   float y= 1e31f * 1e31f;
   System.err.println(y); // prints "Infinity"
   ```
   
   Maybe we change the infinite check to use square?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] uschindler commented on pull request #12281: Add checks in KNNVectorField / KNNVectorQuery to only allow non-null, non-empty and finite vectors

2023-06-12 Thread via GitHub


uschindler commented on PR #12281:
URL: https://github.com/apache/lucene/pull/12281#issuecomment-1587159643

   Actually with the current similarity methods we should make sure that for 
each vector component `v`, the following is true: `Math.isFinite(v * v * 
vector.length)`.
   
   This is quite easy to implement in the isFinite check,


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] uschindler commented on pull request #12281: Add checks in KNNVectorField / KNNVectorQuery to only allow non-null, non-empty and finite vectors

2023-06-12 Thread via GitHub


uschindler commented on PR #12281:
URL: https://github.com/apache/lucene/pull/12281#issuecomment-1587183519

   I modified the function like that, but have not yet committed:
   
   ```java
 /**
  * Checks if a float vector only has finite components and the square of 
its components multiplied by vector dimensions is finite.
  *
  * @param v bytes containing a vector
  * @return the vector for call-chaining
  * @throws IllegalArgumentException if any component of vector violates 
the contract
  */
 public static float[] checkFinite(float[] v) {
   for (int i = 0; i < v.length; i++) {
 float f = v[i] * v[i] * v.length;
 if (!Float.isFinite(f)) {
   throw new IllegalArgumentException("non-finite or too large (with 
respect to dimensions) value at vector[" + i + "]=" + v[i]);
 }
   }
   return v;
 }
   ```
   
   What do you think?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] uschindler commented on pull request #12281: Add checks in KNNVectorField / KNNVectorQuery to only allow non-null, non-empty and finite vectors

2023-06-12 Thread via GitHub


uschindler commented on PR #12281:
URL: https://github.com/apache/lucene/pull/12281#issuecomment-158716

   The above check should auto-vectorize in Hotspot, so the check during 
indexing/searching should be cheap.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] benwtrent commented on a diff in pull request #12253: GITHUB-12252: Add function queries for computing similarity scores between knn vectors

2023-06-12 Thread via GitHub


benwtrent commented on code in PR #12253:
URL: https://github.com/apache/lucene/pull/12253#discussion_r1226584452


##
lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/KnnVectorFieldSource.java:
##
@@ -0,0 +1,112 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.queries.function.valuesource;
+
+import java.io.IOException;
+import java.util.Map;
+import org.apache.lucene.index.ByteVectorValues;
+import org.apache.lucene.index.FloatVectorValues;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.queries.function.FunctionValues;
+import org.apache.lucene.queries.function.ValueSource;
+import org.apache.lucene.search.DocIdSetIterator;
+
+/** An implementation for retrieving {@link FunctionValues} instances for knn 
vectors fields. */
+public class KnnVectorFieldSource extends ValueSource {
+  private final String fieldName;
+
+  public KnnVectorFieldSource(String fieldName) {
+this.fieldName = fieldName;
+  }
+
+  @Override
+  public FunctionValues getValues(Map context, 
LeafReaderContext readerContext)
+  throws IOException {
+
+final FloatVectorValues floatValues = 
readerContext.reader().getFloatVectorValues(fieldName);
+final ByteVectorValues byteValues = 
readerContext.reader().getByteVectorValues(fieldName);
+
+if (floatValues == null && byteValues == null) {
+  throw new IllegalArgumentException(
+  "no vector value is indexed for field '" + fieldName + "'");
+}
+return new FunctionValues() {
+  int lastDocID;
+
+  @Override
+  public float[] floatVectorVal(int doc) throws IOException {
+float[] floatVector = null;
+if (floatValues != null && exists(floatValues, doc)) {
+  floatVector = floatValues.vectorValue();
+} else {
+  byte[] byteVector = byteVectorVal(doc);
+  if (byteVector != null) {
+floatVector = new float[byteVector.length];
+for (int i = 0; i < byteVector.length; i++) {
+  floatVector[i] = byteVector[i];
+}
+  }
+}
+return floatVector;
+  }
+
+  @Override
+  public byte[] byteVectorVal(int doc) throws IOException {
+byte[] bytesVector = null;
+if (byteValues != null && exists(byteValues, doc)) {
+  bytesVector = byteValues.vectorValue();
+}
+return bytesVector;

Review Comment:
   This further shows that we should not do under the hood casting. Casting a 
`byte` encoded vector to float should mean we can then cast a `float` encoded 
as `byte` for the user. 



##
lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/KnnVectorValueSource.java:
##
@@ -0,0 +1,99 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.queries.function.valuesource;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.queries.function.FunctionValues;
+import org.apache.lucene.queries.function.ValueSource;
+
+/** function that returns a constant vector value for every document. */
+public class KnnVectorValueSource extends ValueSource {

Review Comment:
   I am not sure the need of this class at all. However, we should indicate its 
constant in the name.
   
   Additionally, we should separate out the `float[]

[GitHub] [lucene] benwtrent commented on pull request #12253: GITHUB-12252: Add function queries for computing similarity scores between knn vectors

2023-06-12 Thread via GitHub


benwtrent commented on PR #12253:
URL: https://github.com/apache/lucene/pull/12253#issuecomment-1587256870

   @alessandrobenedetti , I see many `to discuss` messages, but no discussion 
clearly labeled in Lucene or the dev lists. My searching skills may be failing 
me. Could you link these discussions or at least highlight the reasoning for 
the decisions made and which decisions were made?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] uschindler commented on a diff in pull request #12253: GITHUB-12252: Add function queries for computing similarity scores between knn vectors

2023-06-12 Thread via GitHub


uschindler commented on code in PR #12253:
URL: https://github.com/apache/lucene/pull/12253#discussion_r1226632668


##
lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/KnnVectorFieldSource.java:
##
@@ -0,0 +1,112 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.queries.function.valuesource;
+
+import java.io.IOException;
+import java.util.Map;
+import org.apache.lucene.index.ByteVectorValues;
+import org.apache.lucene.index.FloatVectorValues;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.queries.function.FunctionValues;
+import org.apache.lucene.queries.function.ValueSource;
+import org.apache.lucene.search.DocIdSetIterator;
+
+/** An implementation for retrieving {@link FunctionValues} instances for knn 
vectors fields. */
+public class KnnVectorFieldSource extends ValueSource {
+  private final String fieldName;
+
+  public KnnVectorFieldSource(String fieldName) {
+this.fieldName = fieldName;
+  }
+
+  @Override
+  public FunctionValues getValues(Map context, 
LeafReaderContext readerContext)
+  throws IOException {
+
+final FloatVectorValues floatValues = 
readerContext.reader().getFloatVectorValues(fieldName);
+final ByteVectorValues byteValues = 
readerContext.reader().getByteVectorValues(fieldName);
+
+if (floatValues == null && byteValues == null) {
+  throw new IllegalArgumentException(
+  "no vector value is indexed for field '" + fieldName + "'");
+}
+return new FunctionValues() {
+  int lastDocID;
+
+  @Override
+  public float[] floatVectorVal(int doc) throws IOException {
+float[] floatVector = null;
+if (floatValues != null && exists(floatValues, doc)) {
+  floatVector = floatValues.vectorValue();
+} else {
+  byte[] byteVector = byteVectorVal(doc);
+  if (byteVector != null) {
+floatVector = new float[byteVector.length];
+for (int i = 0; i < byteVector.length; i++) {
+  floatVector[i] = byteVector[i];
+}
+  }
+}
+return floatVector;
+  }

Review Comment:
   Yes, please do not do this! You cannot simply cast a byte[] to float without 
transforming its value in a valid way. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] uschindler commented on a diff in pull request #12253: GITHUB-12252: Add function queries for computing similarity scores between knn vectors

2023-06-12 Thread via GitHub


uschindler commented on code in PR #12253:
URL: https://github.com/apache/lucene/pull/12253#discussion_r1226634599


##
lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/KnnVectorFieldSource.java:
##
@@ -0,0 +1,112 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.queries.function.valuesource;
+
+import java.io.IOException;
+import java.util.Map;
+import org.apache.lucene.index.ByteVectorValues;
+import org.apache.lucene.index.FloatVectorValues;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.queries.function.FunctionValues;
+import org.apache.lucene.queries.function.ValueSource;
+import org.apache.lucene.search.DocIdSetIterator;
+
+/** An implementation for retrieving {@link FunctionValues} instances for knn 
vectors fields. */
+public class KnnVectorFieldSource extends ValueSource {
+  private final String fieldName;
+
+  public KnnVectorFieldSource(String fieldName) {
+this.fieldName = fieldName;
+  }
+
+  @Override
+  public FunctionValues getValues(Map context, 
LeafReaderContext readerContext)
+  throws IOException {
+
+final FloatVectorValues floatValues = 
readerContext.reader().getFloatVectorValues(fieldName);
+final ByteVectorValues byteValues = 
readerContext.reader().getByteVectorValues(fieldName);
+
+if (floatValues == null && byteValues == null) {
+  throw new IllegalArgumentException(
+  "no vector value is indexed for field '" + fieldName + "'");
+}
+return new FunctionValues() {
+  int lastDocID;
+
+  @Override
+  public float[] floatVectorVal(int doc) throws IOException {
+float[] floatVector = null;
+if (floatValues != null && exists(floatValues, doc)) {
+  floatVector = floatValues.vectorValue();
+} else {
+  byte[] byteVector = byteVectorVal(doc);
+  if (byteVector != null) {
+floatVector = new float[byteVector.length];
+for (int i = 0; i < byteVector.length; i++) {
+  floatVector[i] = byteVector[i];
+}
+  }
+}
+return floatVector;
+  }

Review Comment:
   It should throw exception that the field type does not match the query 
vector.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] eliaporciani commented on a diff in pull request #12253: GITHUB-12252: Add function queries for computing similarity scores between knn vectors

2023-06-12 Thread via GitHub


eliaporciani commented on code in PR #12253:
URL: https://github.com/apache/lucene/pull/12253#discussion_r1226642488


##
lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/KnnVectorFieldSource.java:
##
@@ -0,0 +1,112 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.queries.function.valuesource;
+
+import java.io.IOException;
+import java.util.Map;
+import org.apache.lucene.index.ByteVectorValues;
+import org.apache.lucene.index.FloatVectorValues;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.queries.function.FunctionValues;
+import org.apache.lucene.queries.function.ValueSource;
+import org.apache.lucene.search.DocIdSetIterator;
+
+/** An implementation for retrieving {@link FunctionValues} instances for knn 
vectors fields. */
+public class KnnVectorFieldSource extends ValueSource {
+  private final String fieldName;
+
+  public KnnVectorFieldSource(String fieldName) {
+this.fieldName = fieldName;
+  }
+
+  @Override
+  public FunctionValues getValues(Map context, 
LeafReaderContext readerContext)
+  throws IOException {
+
+final FloatVectorValues floatValues = 
readerContext.reader().getFloatVectorValues(fieldName);
+final ByteVectorValues byteValues = 
readerContext.reader().getByteVectorValues(fieldName);
+
+if (floatValues == null && byteValues == null) {
+  throw new IllegalArgumentException(
+  "no vector value is indexed for field '" + fieldName + "'");
+}
+return new FunctionValues() {
+  int lastDocID;
+
+  @Override
+  public float[] floatVectorVal(int doc) throws IOException {
+float[] floatVector = null;
+if (floatValues != null && exists(floatValues, doc)) {
+  floatVector = floatValues.vectorValue();
+} else {
+  byte[] byteVector = byteVectorVal(doc);
+  if (byteVector != null) {
+floatVector = new float[byteVector.length];
+for (int i = 0; i < byteVector.length; i++) {
+  floatVector[i] = byteVector[i];
+}
+  }
+}
+return floatVector;
+  }

Review Comment:
   Initially, we had the two implementations separated for byte and float 
vectors. This is the commit in which we changed the approach 
[cb4b77c9f72551cb014d155586e399365fbcba5d](https://github.com/apache/lucene/pull/12253/commits/cb4b77c9f72551cb014d155586e399365fbcba5d).
 We decided to make an extra step and check if it was possible to abstract over 
the vector encoding in order to provide a simpler API to the caller.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] benwtrent commented on pull request #12281: Add checks in KNNVectorField / KNNVectorQuery to only allow non-null, non-empty and finite vectors

2023-06-12 Thread via GitHub


benwtrent commented on PR #12281:
URL: https://github.com/apache/lucene/pull/12281#issuecomment-1587319629

   @uschindler I think this change (`v[i] * v[i] * v.length`) is getting 
complicated. It really makes me think about if we should do any complex 
infinite checking other than `isFinite(float)`. For example, this check would 
prevent vectors that may be valid for `squareDistance`.
   
   If a specific similarity has `NaN` issues, than a more complicated 
`isFinite` check should account for those. But, this then opens the door to how 
`KnnFloatVectorQuery` would even know about the similarity used, or how that 
could be checked at all.
   
   I would prefer us doing the simple float value validation for now and we 
open an issue to see if we can get a better API for checking similarity 
infinity checks (e.g. `squareDistance` vs `cosine`) and if we should have those 
at all.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] eliaporciani commented on a diff in pull request #12253: GITHUB-12252: Add function queries for computing similarity scores between knn vectors

2023-06-12 Thread via GitHub


eliaporciani commented on code in PR #12253:
URL: https://github.com/apache/lucene/pull/12253#discussion_r1226655236


##
lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/KnnVectorFieldSource.java:
##
@@ -0,0 +1,112 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.queries.function.valuesource;
+
+import java.io.IOException;
+import java.util.Map;
+import org.apache.lucene.index.ByteVectorValues;
+import org.apache.lucene.index.FloatVectorValues;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.queries.function.FunctionValues;
+import org.apache.lucene.queries.function.ValueSource;
+import org.apache.lucene.search.DocIdSetIterator;
+
+/** An implementation for retrieving {@link FunctionValues} instances for knn 
vectors fields. */
+public class KnnVectorFieldSource extends ValueSource {
+  private final String fieldName;
+
+  public KnnVectorFieldSource(String fieldName) {
+this.fieldName = fieldName;
+  }
+
+  @Override
+  public FunctionValues getValues(Map context, 
LeafReaderContext readerContext)
+  throws IOException {
+
+final FloatVectorValues floatValues = 
readerContext.reader().getFloatVectorValues(fieldName);
+final ByteVectorValues byteValues = 
readerContext.reader().getByteVectorValues(fieldName);
+
+if (floatValues == null && byteValues == null) {
+  throw new IllegalArgumentException(
+  "no vector value is indexed for field '" + fieldName + "'");
+}
+return new FunctionValues() {
+  int lastDocID;
+
+  @Override
+  public float[] floatVectorVal(int doc) throws IOException {
+float[] floatVector = null;
+if (floatValues != null && exists(floatValues, doc)) {
+  floatVector = floatValues.vectorValue();
+} else {
+  byte[] byteVector = byteVectorVal(doc);
+  if (byteVector != null) {
+floatVector = new float[byteVector.length];
+for (int i = 0; i < byteVector.length; i++) {
+  floatVector[i] = byteVector[i];
+}
+  }
+}
+return floatVector;
+  }

Review Comment:
   Initially, we had separate implementations for byte and float vectors. The 
change in approach can be seen in the commit 
[cb4b77c9f72551cb014d155586e399365fbcba5d](https://github.com/apache/lucene/pull/12253/commits/cb4b77c9f72551cb014d155586e399365fbcba5d).
 We made the decision to explore the possibility of abstracting over the vector 
encoding to create a simpler API for the caller.
   
   There were two main reasons for pursuing this change:
   
   - Code duplication: Having separate implementations resulted in duplicate 
code for byte and float vectors.
   
   - Difficulty in using float[] and byte[] vectors interchangeably: The knn 
vectors, whether byte or float, essentially represented the same field. The 
difference lay in the underlying encoding and the precision used to store the 
vectors. The decision to use the byte vector format over the float vector 
format was primarily driven by non-functional aspects such as space occupancy 
and performance.
   
   The proposed approach strikes a good balance in terms of abstraction level 
for both types of vectors.
   
   If there are compelling reasons for not proceeding with this approach, we 
could consider reverting to the point where the two implementations were 
separate.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] benwtrent commented on a diff in pull request #12253: GITHUB-12252: Add function queries for computing similarity scores between knn vectors

2023-06-12 Thread via GitHub


benwtrent commented on code in PR #12253:
URL: https://github.com/apache/lucene/pull/12253#discussion_r1226688194


##
lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/KnnVectorFieldSource.java:
##
@@ -0,0 +1,112 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.queries.function.valuesource;
+
+import java.io.IOException;
+import java.util.Map;
+import org.apache.lucene.index.ByteVectorValues;
+import org.apache.lucene.index.FloatVectorValues;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.queries.function.FunctionValues;
+import org.apache.lucene.queries.function.ValueSource;
+import org.apache.lucene.search.DocIdSetIterator;
+
+/** An implementation for retrieving {@link FunctionValues} instances for knn 
vectors fields. */
+public class KnnVectorFieldSource extends ValueSource {
+  private final String fieldName;
+
+  public KnnVectorFieldSource(String fieldName) {
+this.fieldName = fieldName;
+  }
+
+  @Override
+  public FunctionValues getValues(Map context, 
LeafReaderContext readerContext)
+  throws IOException {
+
+final FloatVectorValues floatValues = 
readerContext.reader().getFloatVectorValues(fieldName);
+final ByteVectorValues byteValues = 
readerContext.reader().getByteVectorValues(fieldName);
+
+if (floatValues == null && byteValues == null) {
+  throw new IllegalArgumentException(
+  "no vector value is indexed for field '" + fieldName + "'");
+}
+return new FunctionValues() {
+  int lastDocID;
+
+  @Override
+  public float[] floatVectorVal(int doc) throws IOException {
+float[] floatVector = null;
+if (floatValues != null && exists(floatValues, doc)) {
+  floatVector = floatValues.vectorValue();
+} else {
+  byte[] byteVector = byteVectorVal(doc);
+  if (byteVector != null) {
+floatVector = new float[byteVector.length];
+for (int i = 0; i < byteVector.length; i++) {
+  floatVector[i] = byteVector[i];
+}
+  }
+}
+return floatVector;
+  }

Review Comment:
   > Code duplication: Having separate implementations resulted in duplicate 
code for byte and float vectors.
   
   This is a non-issue. A consistent API is way better than implementing "DRY". 
   
   > Difficulty in using float[] and byte[] vectors interchangeably:
   
   They should not be used interchangeably. `float[]` should return `float[]`, 
`byte[]` should return `byte[]`. Users stored them that way, that is the way 
they should be retrieved.
   
   > The knn vectors, whether byte or float, essentially represented the same 
field
   
   This is making a big assumption on the user. I don't understand this jump in 
logic. They are not the same field nor the same field type.
   
   > The difference lay in the underlying encoding and the precision used to 
store the vectors
   
   This is not true. `byte[]` means you stored them as `int8` values. They 
COULD be quantized `float[]` values, or the user just has `byte[]` values.
   
   Linear quantization could be construed as a "precision" thing, but that is 
simplifying what quantization is, and how its implemented.
   
   As an argument against "just precision" users should NEVER attempt to store 
something like `float[]{1.22348f, 0.2384f}` as a `byte` encoded vector without 
some quantization process in between creating the vector and storing them. 
Simply truncating is bad, will cause errors, and is a surprising result.
   
   > The decision to use the byte vector format over the float vector format 
was primarily driven by non-functional aspects such as space occupancy and 
performance.
   
   This is partially true. Space and performance are important, but casting 
between `float` and `byte` breaks an otherwise strong contract with the API 
calls.
   
   Additionally, I would say that generally, there are other valid reasons for 
`int8` storage other than "Hey I quantized the output of some floats". What if 
the user just wanted to store `int8` because it worked best for their data? 
   
   > If there are compelling reasons for not proceeding with this app

[GitHub] [lucene] uschindler commented on pull request #12281: Add checks in KNNVectorField / KNNVectorQuery to only allow non-null, non-empty and finite vectors

2023-06-12 Thread via GitHub


uschindler commented on PR #12281:
URL: https://github.com/apache/lucene/pull/12281#issuecomment-1587489458

   That would be my plan! Let's open a new issue and discuss that there.
   
   So we should merge this PR for now. Before doing that I will only add the 
"vector size>0" check in the API, because that may be missing at some places. A 
zero length vector also causes havoc.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] eliaporciani commented on a diff in pull request #12253: GITHUB-12252: Add function queries for computing similarity scores between knn vectors

2023-06-12 Thread via GitHub


eliaporciani commented on code in PR #12253:
URL: https://github.com/apache/lucene/pull/12253#discussion_r1226796096


##
lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/KnnVectorFieldSource.java:
##
@@ -0,0 +1,112 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.queries.function.valuesource;
+
+import java.io.IOException;
+import java.util.Map;
+import org.apache.lucene.index.ByteVectorValues;
+import org.apache.lucene.index.FloatVectorValues;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.queries.function.FunctionValues;
+import org.apache.lucene.queries.function.ValueSource;
+import org.apache.lucene.search.DocIdSetIterator;
+
+/** An implementation for retrieving {@link FunctionValues} instances for knn 
vectors fields. */
+public class KnnVectorFieldSource extends ValueSource {
+  private final String fieldName;
+
+  public KnnVectorFieldSource(String fieldName) {
+this.fieldName = fieldName;
+  }
+
+  @Override
+  public FunctionValues getValues(Map context, 
LeafReaderContext readerContext)
+  throws IOException {
+
+final FloatVectorValues floatValues = 
readerContext.reader().getFloatVectorValues(fieldName);
+final ByteVectorValues byteValues = 
readerContext.reader().getByteVectorValues(fieldName);
+
+if (floatValues == null && byteValues == null) {
+  throw new IllegalArgumentException(
+  "no vector value is indexed for field '" + fieldName + "'");
+}
+return new FunctionValues() {
+  int lastDocID;
+
+  @Override
+  public float[] floatVectorVal(int doc) throws IOException {
+float[] floatVector = null;
+if (floatValues != null && exists(floatValues, doc)) {
+  floatVector = floatValues.vectorValue();
+} else {
+  byte[] byteVector = byteVectorVal(doc);
+  if (byteVector != null) {
+floatVector = new float[byteVector.length];
+for (int i = 0; i < byteVector.length; i++) {
+  floatVector[i] = byteVector[i];
+}
+  }
+}
+return floatVector;
+  }

Review Comment:
   As mentioned earlier, I appreciate your input, and I would like to further 
explain the rationale behind our attempt to abstract over the two encodings. I 
understand your concerns regarding quantization, and I agree that in the 
majority of cases, casting from byte to float may not be meaningful. However, 
it's important to note that the single API we proposed does not restrict users 
from utilizing the API appropriately in the case of quantization.
   
   Regarding code duplication, I think that it can be problematic. Currently, 
we have two encodings, but there may be a need to introduce additional 
encodings in the future. Duplicating the same code for each encoding increases 
the likelihood of errors in future contributions.
   
   I want to emphasize that I do not hold a strong opinion about which approach 
is the best. I acknowledge that there are advantages and disadvantages to both 
options. I am happy to revert the last commits and provide separate 
implementations for byte and float vectors.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] eliaporciani commented on a diff in pull request #12253: GITHUB-12252: Add function queries for computing similarity scores between knn vectors

2023-06-12 Thread via GitHub


eliaporciani commented on code in PR #12253:
URL: https://github.com/apache/lucene/pull/12253#discussion_r1226796096


##
lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/KnnVectorFieldSource.java:
##
@@ -0,0 +1,112 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.queries.function.valuesource;
+
+import java.io.IOException;
+import java.util.Map;
+import org.apache.lucene.index.ByteVectorValues;
+import org.apache.lucene.index.FloatVectorValues;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.queries.function.FunctionValues;
+import org.apache.lucene.queries.function.ValueSource;
+import org.apache.lucene.search.DocIdSetIterator;
+
+/** An implementation for retrieving {@link FunctionValues} instances for knn 
vectors fields. */
+public class KnnVectorFieldSource extends ValueSource {
+  private final String fieldName;
+
+  public KnnVectorFieldSource(String fieldName) {
+this.fieldName = fieldName;
+  }
+
+  @Override
+  public FunctionValues getValues(Map context, 
LeafReaderContext readerContext)
+  throws IOException {
+
+final FloatVectorValues floatValues = 
readerContext.reader().getFloatVectorValues(fieldName);
+final ByteVectorValues byteValues = 
readerContext.reader().getByteVectorValues(fieldName);
+
+if (floatValues == null && byteValues == null) {
+  throw new IllegalArgumentException(
+  "no vector value is indexed for field '" + fieldName + "'");
+}
+return new FunctionValues() {
+  int lastDocID;
+
+  @Override
+  public float[] floatVectorVal(int doc) throws IOException {
+float[] floatVector = null;
+if (floatValues != null && exists(floatValues, doc)) {
+  floatVector = floatValues.vectorValue();
+} else {
+  byte[] byteVector = byteVectorVal(doc);
+  if (byteVector != null) {
+floatVector = new float[byteVector.length];
+for (int i = 0; i < byteVector.length; i++) {
+  floatVector[i] = byteVector[i];
+}
+  }
+}
+return floatVector;
+  }

Review Comment:
   As mentioned earlier, I appreciate your input, and I would like to further 
explain the rationale behind our attempt to abstract over the two encodings. I 
understand your concerns regarding quantization, and I agree that in the 
majority of cases, casting from byte to float may not be meaningful. However, 
it's important to note that the single API we proposed does not restrict users 
from utilizing the API appropriately in the case of quantization.
   
   Regarding code duplication, I think that it can be problematic. Currently, 
we have two encodings, but there may be a need to introduce additional 
encodings in the future. Duplicating the same code for each encoding increases 
the likelihood of errors in future contributions.
   
   I want to emphasize that I have not a strong opinion about which approach is 
the best. I acknowledge that there are advantages and disadvantages to both 
options. I am happy to revert the last commits and provide separate 
implementations for byte and float vectors.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] cfournie commented on a diff in pull request #12249: Restrict GraphTokenStreamFiniteStrings#articulationPointsRecurse recursion depth

2023-06-12 Thread via GitHub


cfournie commented on code in PR #12249:
URL: https://github.com/apache/lucene/pull/12249#discussion_r1226823890


##
lucene/core/src/test/org/apache/lucene/util/graph/TestGraphTokenStreamFiniteStrings.java:
##
@@ -16,6 +16,9 @@
  */
 package org.apache.lucene.util.graph;
 
+import static org.apache.lucene.util.automaton.Operations.MAX_RECURSION_LEVEL;

Review Comment:
   Removed, thanks!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] cfournie commented on a diff in pull request #12249: Restrict GraphTokenStreamFiniteStrings#articulationPointsRecurse recursion depth

2023-06-12 Thread via GitHub


cfournie commented on code in PR #12249:
URL: https://github.com/apache/lucene/pull/12249#discussion_r1226824178


##
lucene/CHANGES.txt:
##
@@ -80,6 +80,8 @@ Bug Fixes
 
 * GITHUB#12220: Hunspell: disallow hidden title-case entries from compound 
middle/end
 
+* LUCENE-10181: Restrict 
GraphTokenStreamFiniteStrings#articulationPointsRecurse recursion depth. (Chris 
Fournier)

Review Comment:
   Done; also rebased to pick up the latest changes in this file



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] benwtrent commented on a diff in pull request #12253: GITHUB-12252: Add function queries for computing similarity scores between knn vectors

2023-06-12 Thread via GitHub


benwtrent commented on code in PR #12253:
URL: https://github.com/apache/lucene/pull/12253#discussion_r1226828445


##
lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/KnnVectorFieldSource.java:
##
@@ -0,0 +1,112 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.queries.function.valuesource;
+
+import java.io.IOException;
+import java.util.Map;
+import org.apache.lucene.index.ByteVectorValues;
+import org.apache.lucene.index.FloatVectorValues;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.queries.function.FunctionValues;
+import org.apache.lucene.queries.function.ValueSource;
+import org.apache.lucene.search.DocIdSetIterator;
+
+/** An implementation for retrieving {@link FunctionValues} instances for knn 
vectors fields. */
+public class KnnVectorFieldSource extends ValueSource {
+  private final String fieldName;
+
+  public KnnVectorFieldSource(String fieldName) {
+this.fieldName = fieldName;
+  }
+
+  @Override
+  public FunctionValues getValues(Map context, 
LeafReaderContext readerContext)
+  throws IOException {
+
+final FloatVectorValues floatValues = 
readerContext.reader().getFloatVectorValues(fieldName);
+final ByteVectorValues byteValues = 
readerContext.reader().getByteVectorValues(fieldName);
+
+if (floatValues == null && byteValues == null) {
+  throw new IllegalArgumentException(
+  "no vector value is indexed for field '" + fieldName + "'");
+}
+return new FunctionValues() {
+  int lastDocID;
+
+  @Override
+  public float[] floatVectorVal(int doc) throws IOException {
+float[] floatVector = null;
+if (floatValues != null && exists(floatValues, doc)) {
+  floatVector = floatValues.vectorValue();
+} else {
+  byte[] byteVector = byteVectorVal(doc);
+  if (byteVector != null) {
+floatVector = new float[byteVector.length];
+for (int i = 0; i < byteVector.length; i++) {
+  floatVector[i] = byteVector[i];
+}
+  }
+}
+return floatVector;
+  }

Review Comment:
   > I understand your concerns regarding quantization, and I agree that in the 
majority of cases, casting from byte to float may not be meaningful. However, 
it's important to note that the single API we proposed does not restrict users 
from utilizing the API appropriately in the case of quantization.
   
   API's by definition restrict users to the correct use case. I think 
auto-casting is a bad idea. It gives users the ability to shoot themselves in 
the foot at runtime.
   
   >  Currently, we have two encodings, but there may be a need to introduce 
additional encodings in the future. Duplicating the same code for each encoding 
increases the likelihood of errors in future contributions.
   
   I say we cross that bridge when we get to it. There are ways to deduplicate 
the code while still giving the users a nice, safe, and correct API.
   
   > I want to emphasize that I have not a strong opinion about which approach 
is the best. I acknowledge that there are advantages and disadvantages to both 
options.
   
   We need to do a Float version and a Byte version. They can rely on similar 
code to cover your duplication concerns (as long as this doesn't break the 
design). But, we should have consistency with the existing API design and 
protect users from making easy mistakes.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] uschindler commented on pull request #12281: Add checks in KNNVectorField / KNNVectorQuery to only allow non-null, non-empty and finite vectors

2023-06-12 Thread via GitHub


uschindler commented on PR #12281:
URL: https://github.com/apache/lucene/pull/12281#issuecomment-1587551398

   Hi, I added the dimension check to the constructor which uses a predefined 
field type. For the query it can't be done in the constructor, as we do not 
know the field type. The query will fail later, so an explicit check is not 
needed.
   
   I think this is ready to be merged. @jbellis we should open a separate issue 
to make sure that we find a solution for vectors where the float overflows 
(infinity) leading to infinite scores. I strongly disagree to use double math, 
we should maybe have some documentation checks like proposed above. We could 
also enforce it, but that's harder to decide!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] uschindler commented on a diff in pull request #12253: GITHUB-12252: Add function queries for computing similarity scores between knn vectors

2023-06-12 Thread via GitHub


uschindler commented on code in PR #12253:
URL: https://github.com/apache/lucene/pull/12253#discussion_r1226849047


##
lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/KnnVectorFieldSource.java:
##
@@ -0,0 +1,112 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.queries.function.valuesource;
+
+import java.io.IOException;
+import java.util.Map;
+import org.apache.lucene.index.ByteVectorValues;
+import org.apache.lucene.index.FloatVectorValues;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.queries.function.FunctionValues;
+import org.apache.lucene.queries.function.ValueSource;
+import org.apache.lucene.search.DocIdSetIterator;
+
+/** An implementation for retrieving {@link FunctionValues} instances for knn 
vectors fields. */
+public class KnnVectorFieldSource extends ValueSource {
+  private final String fieldName;
+
+  public KnnVectorFieldSource(String fieldName) {
+this.fieldName = fieldName;
+  }
+
+  @Override
+  public FunctionValues getValues(Map context, 
LeafReaderContext readerContext)
+  throws IOException {
+
+final FloatVectorValues floatValues = 
readerContext.reader().getFloatVectorValues(fieldName);
+final ByteVectorValues byteValues = 
readerContext.reader().getByteVectorValues(fieldName);
+
+if (floatValues == null && byteValues == null) {
+  throw new IllegalArgumentException(
+  "no vector value is indexed for field '" + fieldName + "'");
+}
+return new FunctionValues() {
+  int lastDocID;
+
+  @Override
+  public float[] floatVectorVal(int doc) throws IOException {
+float[] floatVector = null;
+if (floatValues != null && exists(floatValues, doc)) {
+  floatVector = floatValues.vectorValue();
+} else {
+  byte[] byteVector = byteVectorVal(doc);
+  if (byteVector != null) {
+floatVector = new float[byteVector.length];
+for (int i = 0; i < byteVector.length; i++) {
+  floatVector[i] = byteVector[i];
+}
+  }
+}
+return floatVector;
+  }

Review Comment:
   I fully agree with @benwtrent. We also have separate queries for byte and 
float vectors, so we should not mix the types together in function queries.
   
   It wil really hurt users. I know Solr people tend to sometimes "try to 
change their schema", but this is a no-go for byte and float vectors as other 
queries will fail but this one suddenly works magically. No, we should not do 
this. Field types in Lucene were relaxed at beginning but we spent more and 
more time into making them more strict. We only have some autoconverts in 
docvalues (e.g., SortedDocValues can be queries and sorted using 
SortedSetDocValues which simplifies code - bt there is no surprises and loss or 
convert of information).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] jpountz commented on a diff in pull request #12249: Restrict GraphTokenStreamFiniteStrings#articulationPointsRecurse recursion depth

2023-06-12 Thread via GitHub


jpountz commented on code in PR #12249:
URL: https://github.com/apache/lucene/pull/12249#discussion_r1226846973


##
lucene/core/src/java/org/apache/lucene/util/graph/GraphTokenStreamFiniteStrings.java:
##
@@ -45,6 +45,8 @@
  * different paths of the {@link Automaton}.
  */
 public final class GraphTokenStreamFiniteStrings {
+  /** Maximum level of recursion allowed in recursive operations. */
+  public static final int MAX_RECURSION_LEVEL = 1000;

Review Comment:
   let's make it private?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] uschindler commented on pull request #12294: Implement MMapDirectory with Java 21 Project Panama Preview API

2023-06-12 Thread via GitHub


uschindler commented on PR #12294:
URL: https://github.com/apache/lucene/pull/12294#issuecomment-1587577046

   I talked with @mcimadamore in the meantime. At least for the Panama Foreign 
API (FFI) there are no changes planned anymore, so this is safe to merge into 
main branch.
   
   Keep in mind: It could always happen that there is a major API bug that 
requires a change, but this sounds very unlikely to me. There is still a PR to 
update FFI Javadocs, but the API is final.
   
   For the vector API (which will be a separate PR), I see no problem. The 
vector part is uncritical, as users have to "opt in". If the API changes till 
JDK 21 release, it won't hurt anybody. We can update the impl in a followup 
release so it works.
   
   @ChrisHegarty, are you fine with merging this now, so we can port the vector 
part?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] mikemccand commented on pull request #12357: Better paging when random reads go backwards

2023-06-12 Thread via GitHub


mikemccand commented on PR #12357:
URL: https://github.com/apache/lucene/pull/12357#issuecomment-1587578171

   `wikimedium10k` is a tiny corpus, really just for quick testing that your 
benchy is setup correctly.  If you look at the QPS of each task they are 
ridiculously high :)
   
   In the future it's better to run on at least `wikimedium10M` or 
`wikimediumall`.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] jbellis commented on pull request #12281: Add checks in KNNVectorField / KNNVectorQuery to only allow non-null, non-empty and finite vectors

2023-06-12 Thread via GitHub


jbellis commented on PR #12281:
URL: https://github.com/apache/lucene/pull/12281#issuecomment-1587588219

   SGTM, thank you for taking the lead on investigating the root cause of the 
NaNs!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] msokolov commented on pull request #12360: Update comment about IndexOptions ordinals

2023-06-12 Thread via GitHub


msokolov commented on PR #12360:
URL: https://github.com/apache/lucene/pull/12360#issuecomment-1587648171

   It sounds to me as if order is no longer important. We wouldn't change the 
order because these are stored in the index, but if say someone was adding a 
new option for some reason, they shouldn't be encouraged to try to maintain a 
priority order? Just at it at the end right? I think you could entirely delete 
the comment?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] alessandrobenedetti commented on a diff in pull request #12253: GITHUB-12252: Add function queries for computing similarity scores between knn vectors

2023-06-12 Thread via GitHub


alessandrobenedetti commented on code in PR #12253:
URL: https://github.com/apache/lucene/pull/12253#discussion_r1226909040


##
lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/KnnVectorFieldSource.java:
##
@@ -0,0 +1,112 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.queries.function.valuesource;
+
+import java.io.IOException;
+import java.util.Map;
+import org.apache.lucene.index.ByteVectorValues;
+import org.apache.lucene.index.FloatVectorValues;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.queries.function.FunctionValues;
+import org.apache.lucene.queries.function.ValueSource;
+import org.apache.lucene.search.DocIdSetIterator;
+
+/** An implementation for retrieving {@link FunctionValues} instances for knn 
vectors fields. */
+public class KnnVectorFieldSource extends ValueSource {
+  private final String fieldName;
+
+  public KnnVectorFieldSource(String fieldName) {
+this.fieldName = fieldName;
+  }
+
+  @Override
+  public FunctionValues getValues(Map context, 
LeafReaderContext readerContext)
+  throws IOException {
+
+final FloatVectorValues floatValues = 
readerContext.reader().getFloatVectorValues(fieldName);
+final ByteVectorValues byteValues = 
readerContext.reader().getByteVectorValues(fieldName);
+
+if (floatValues == null && byteValues == null) {
+  throw new IllegalArgumentException(
+  "no vector value is indexed for field '" + fieldName + "'");
+}
+return new FunctionValues() {
+  int lastDocID;
+
+  @Override
+  public float[] floatVectorVal(int doc) throws IOException {
+float[] floatVector = null;
+if (floatValues != null && exists(floatValues, doc)) {
+  floatVector = floatValues.vectorValue();
+} else {
+  byte[] byteVector = byteVectorVal(doc);
+  if (byteVector != null) {
+floatVector = new float[byteVector.length];
+for (int i = 0; i < byteVector.length; i++) {
+  floatVector[i] = byteVector[i];
+}
+  }
+}
+return floatVector;
+  }

Review Comment:
   > Yes, please do not do this! You cannot simply cast a byte[] to float 
without transforming its value in a valid way.
   
   It was not converting a byte[] to float, it was converting a byte 
(containing a number between -128 and +127) to a float:
   floatVector[i] = byteVector[i];
   What would be the valid way of doing it?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] alessandrobenedetti commented on a diff in pull request #12253: GITHUB-12252: Add function queries for computing similarity scores between knn vectors

2023-06-12 Thread via GitHub


alessandrobenedetti commented on code in PR #12253:
URL: https://github.com/apache/lucene/pull/12253#discussion_r1226910532


##
lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/KnnVectorFieldSource.java:
##
@@ -0,0 +1,112 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.queries.function.valuesource;
+
+import java.io.IOException;
+import java.util.Map;
+import org.apache.lucene.index.ByteVectorValues;
+import org.apache.lucene.index.FloatVectorValues;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.queries.function.FunctionValues;
+import org.apache.lucene.queries.function.ValueSource;
+import org.apache.lucene.search.DocIdSetIterator;
+
+/** An implementation for retrieving {@link FunctionValues} instances for knn 
vectors fields. */
+public class KnnVectorFieldSource extends ValueSource {
+  private final String fieldName;
+
+  public KnnVectorFieldSource(String fieldName) {
+this.fieldName = fieldName;
+  }
+
+  @Override
+  public FunctionValues getValues(Map context, 
LeafReaderContext readerContext)
+  throws IOException {
+
+final FloatVectorValues floatValues = 
readerContext.reader().getFloatVectorValues(fieldName);
+final ByteVectorValues byteValues = 
readerContext.reader().getByteVectorValues(fieldName);
+
+if (floatValues == null && byteValues == null) {
+  throw new IllegalArgumentException(
+  "no vector value is indexed for field '" + fieldName + "'");
+}
+return new FunctionValues() {
+  int lastDocID;
+
+  @Override
+  public float[] floatVectorVal(int doc) throws IOException {
+float[] floatVector = null;
+if (floatValues != null && exists(floatValues, doc)) {
+  floatVector = floatValues.vectorValue();
+} else {
+  byte[] byteVector = byteVectorVal(doc);
+  if (byteVector != null) {
+floatVector = new float[byteVector.length];
+for (int i = 0; i < byteVector.length; i++) {
+  floatVector[i] = byteVector[i];
+}
+  }
+}
+return floatVector;
+  }

Review Comment:
   thanks, everyone for the valuable feedback, @eliaporciani will revert the 
pull request to the commit where we had the working version with duplicated 
byte/float code in different classes.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] alessandrobenedetti commented on a diff in pull request #12253: GITHUB-12252: Add function queries for computing similarity scores between knn vectors

2023-06-12 Thread via GitHub


alessandrobenedetti commented on code in PR #12253:
URL: https://github.com/apache/lucene/pull/12253#discussion_r1226910532


##
lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/KnnVectorFieldSource.java:
##
@@ -0,0 +1,112 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.queries.function.valuesource;
+
+import java.io.IOException;
+import java.util.Map;
+import org.apache.lucene.index.ByteVectorValues;
+import org.apache.lucene.index.FloatVectorValues;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.queries.function.FunctionValues;
+import org.apache.lucene.queries.function.ValueSource;
+import org.apache.lucene.search.DocIdSetIterator;
+
+/** An implementation for retrieving {@link FunctionValues} instances for knn 
vectors fields. */
+public class KnnVectorFieldSource extends ValueSource {
+  private final String fieldName;
+
+  public KnnVectorFieldSource(String fieldName) {
+this.fieldName = fieldName;
+  }
+
+  @Override
+  public FunctionValues getValues(Map context, 
LeafReaderContext readerContext)
+  throws IOException {
+
+final FloatVectorValues floatValues = 
readerContext.reader().getFloatVectorValues(fieldName);
+final ByteVectorValues byteValues = 
readerContext.reader().getByteVectorValues(fieldName);
+
+if (floatValues == null && byteValues == null) {
+  throw new IllegalArgumentException(
+  "no vector value is indexed for field '" + fieldName + "'");
+}
+return new FunctionValues() {
+  int lastDocID;
+
+  @Override
+  public float[] floatVectorVal(int doc) throws IOException {
+float[] floatVector = null;
+if (floatValues != null && exists(floatValues, doc)) {
+  floatVector = floatValues.vectorValue();
+} else {
+  byte[] byteVector = byteVectorVal(doc);
+  if (byteVector != null) {
+floatVector = new float[byteVector.length];
+for (int i = 0; i < byteVector.length; i++) {
+  floatVector[i] = byteVector[i];
+}
+  }
+}
+return floatVector;
+  }

Review Comment:
   thanks @benwtrent  and @uschindler  for the valuable feedback, we discussed 
with @eliaporciani and he will revert the pull request to the commit where we 
had the working version with duplicated byte/float code in different classes, 
I'll do a review tomorrow when ready and I would love your opinion before 
merging.
   
   That's also the reason we kept this PR open so long before merging, 
additional feedback is always welcome!
   Thanks!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] jpountz merged pull request #12249: Restrict GraphTokenStreamFiniteStrings#articulationPointsRecurse recursion depth

2023-06-12 Thread via GitHub


jpountz merged PR #12249:
URL: https://github.com/apache/lucene/pull/12249


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] jpountz closed issue #11218: GraphTokenStreamFiniteStrings#articulationPointsRecurse can run into stack overflows [LUCENE-10181]

2023-06-12 Thread via GitHub


jpountz closed issue #11218: 
GraphTokenStreamFiniteStrings#articulationPointsRecurse can run into stack 
overflows [LUCENE-10181]
URL: https://github.com/apache/lucene/issues/11218


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] msokolov commented on issue #12358: Optimize `count()` for BooleanQuery disjunction

2023-06-12 Thread via GitHub


msokolov commented on issue #12358:
URL: https://github.com/apache/lucene/issues/12358#issuecomment-1587661349

   > No no. There is no specific capitalization :). I was correcting tantivity 
-> tantivy.
   
   Just to keep everyone accurate. the original misspelling was "tantitvy" not 
"tantivity"! :)
   
   > I checked the repository. When Tantitvy says it is 2 times faster ...


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] benwtrent commented on a diff in pull request #12253: GITHUB-12252: Add function queries for computing similarity scores between knn vectors

2023-06-12 Thread via GitHub


benwtrent commented on code in PR #12253:
URL: https://github.com/apache/lucene/pull/12253#discussion_r1226928216


##
lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/KnnVectorFieldSource.java:
##
@@ -0,0 +1,112 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.queries.function.valuesource;
+
+import java.io.IOException;
+import java.util.Map;
+import org.apache.lucene.index.ByteVectorValues;
+import org.apache.lucene.index.FloatVectorValues;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.queries.function.FunctionValues;
+import org.apache.lucene.queries.function.ValueSource;
+import org.apache.lucene.search.DocIdSetIterator;
+
+/** An implementation for retrieving {@link FunctionValues} instances for knn 
vectors fields. */
+public class KnnVectorFieldSource extends ValueSource {
+  private final String fieldName;
+
+  public KnnVectorFieldSource(String fieldName) {
+this.fieldName = fieldName;
+  }
+
+  @Override
+  public FunctionValues getValues(Map context, 
LeafReaderContext readerContext)
+  throws IOException {
+
+final FloatVectorValues floatValues = 
readerContext.reader().getFloatVectorValues(fieldName);
+final ByteVectorValues byteValues = 
readerContext.reader().getByteVectorValues(fieldName);
+
+if (floatValues == null && byteValues == null) {
+  throw new IllegalArgumentException(
+  "no vector value is indexed for field '" + fieldName + "'");
+}
+return new FunctionValues() {
+  int lastDocID;
+
+  @Override
+  public float[] floatVectorVal(int doc) throws IOException {
+float[] floatVector = null;
+if (floatValues != null && exists(floatValues, doc)) {
+  floatVector = floatValues.vectorValue();
+} else {
+  byte[] byteVector = byteVectorVal(doc);
+  if (byteVector != null) {
+floatVector = new float[byteVector.length];
+for (int i = 0; i < byteVector.length; i++) {
+  floatVector[i] = byteVector[i];
+}
+  }
+}
+return floatVector;
+  }

Review Comment:
   @alessandrobenedetti I am not saying that `floatVector[i] = byteVector[i];` 
would return invalid data. I am saying there are API assumptions:
   
- That if `float` can do this, then so can `byte`?? (which it cannot, seems 
weird from an API standpoint)
- Users accidentally abusing `byte[]` with a float query (they forgot to 
quantize their float query?!?!?).
   
   Keeping these separate clarifies their usage and makes it easier for users 
(but I agree its more code for us all to write :) )



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] benwtrent commented on a diff in pull request #12281: Add checks in KNNVectorField / KNNVectorQuery to only allow non-null, non-empty and finite vectors

2023-06-12 Thread via GitHub


benwtrent commented on code in PR #12281:
URL: https://github.com/apache/lucene/pull/12281#discussion_r1226928953


##
lucene/core/src/java/org/apache/lucene/document/KnnByteVectorField.java:
##
@@ -137,7 +137,12 @@ public KnnByteVectorField(String name, byte[] vector, 
FieldType fieldType) {
   + " using byte[] but the field encoding is "
   + fieldType.vectorEncoding());
 }
-fieldsData = Objects.requireNonNull(vector, "vector value must not be 
null");
+Objects.requireNonNull(vector, "vector value must not be null");
+if (vector.length != fieldType.vectorDimension()) {

Review Comment:
   Good catch!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] sohami commented on issue #12347: Allow extensions of IndexSearcher to provide custom SliceExecutor and slices computation

2023-06-12 Thread via GitHub


sohami commented on issue #12347:
URL: https://github.com/apache/lucene/issues/12347#issuecomment-1587721480

   @jpountz and @javanna Seems like there is plan of feature freeze by this Fri 
(06/16) for next release. I was really hoping if we can get it in the 9.x 
release and would appreciate your help on this issue and how to progress on it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] jmazanec15 commented on issue #12342: Prevent VectorSimilarity.DOT_PRODUCT from returning negative scores

2023-06-12 Thread via GitHub


jmazanec15 commented on issue #12342:
URL: https://github.com/apache/lucene/issues/12342#issuecomment-1587866962

   @benwtrent  Thanks for taking a look! Interesting, I am not too familiar 
with MBW. Ill take a look. 
   
   The main reason I wanted to avoid returning the dot product was to avoid 
negative scores, as ref by https://github.com/apache/lucene/issues/9044.
   
   Also, what do you mean by scaling continuously? The above formula gives 
negative and positive scores the same number of possible scores, reducing 
overall precision by 2 (please correct me if I am wrong).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] uschindler commented on a diff in pull request #12253: GITHUB-12252: Add function queries for computing similarity scores between knn vectors

2023-06-12 Thread via GitHub


uschindler commented on code in PR #12253:
URL: https://github.com/apache/lucene/pull/12253#discussion_r1227084420


##
lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/KnnVectorFieldSource.java:
##
@@ -0,0 +1,112 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.queries.function.valuesource;
+
+import java.io.IOException;
+import java.util.Map;
+import org.apache.lucene.index.ByteVectorValues;
+import org.apache.lucene.index.FloatVectorValues;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.queries.function.FunctionValues;
+import org.apache.lucene.queries.function.ValueSource;
+import org.apache.lucene.search.DocIdSetIterator;
+
+/** An implementation for retrieving {@link FunctionValues} instances for knn 
vectors fields. */
+public class KnnVectorFieldSource extends ValueSource {
+  private final String fieldName;
+
+  public KnnVectorFieldSource(String fieldName) {
+this.fieldName = fieldName;
+  }
+
+  @Override
+  public FunctionValues getValues(Map context, 
LeafReaderContext readerContext)
+  throws IOException {
+
+final FloatVectorValues floatValues = 
readerContext.reader().getFloatVectorValues(fieldName);
+final ByteVectorValues byteValues = 
readerContext.reader().getByteVectorValues(fieldName);
+
+if (floatValues == null && byteValues == null) {
+  throw new IllegalArgumentException(
+  "no vector value is indexed for field '" + fieldName + "'");
+}
+return new FunctionValues() {
+  int lastDocID;
+
+  @Override
+  public float[] floatVectorVal(int doc) throws IOException {
+float[] floatVector = null;
+if (floatValues != null && exists(floatValues, doc)) {
+  floatVector = floatValues.vectorValue();
+} else {
+  byte[] byteVector = byteVectorVal(doc);
+  if (byteVector != null) {
+floatVector = new float[byteVector.length];
+for (int i = 0; i < byteVector.length; i++) {
+  floatVector[i] = byteVector[i];
+}
+  }
+}
+return floatVector;
+  }

Review Comment:
   I think it should be possible to do it in same way like for field types and 
query: They both have an abstract pkg-private base class. Maybe thats a way 
here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] benwtrent commented on issue #12342: Prevent VectorSimilarity.DOT_PRODUCT from returning negative scores

2023-06-12 Thread via GitHub


benwtrent commented on issue #12342:
URL: https://github.com/apache/lucene/issues/12342#issuecomment-1587878704

   > Also, what do you mean by scaling continuously? 
   
   Your algorithm is piecewise vs. continuous. But, I am not sure how we could 
do a continuous transformation (everything is on the same scale). 
   
   > I am not too familiar with MBW.
   
   Yeah, MBW and MAXSCORE get all mixed up in my brain. But, yes MAXSCORE is 
why we disallow negative scoring. Forgive my misdirection.
   
   > The above formula gives negative and positive scores the same number of 
possible scores, reducing overall precision by 2 
   
   My main concern overall is this: we are changing the scoring methodology 
period for positive scores (and thus considered "valid"). I think this means 
that it cannot go into a Lucene 9.x release (correct @jpountz ?).
   
   What do you think @msokolov ? Maybe we have a new `MAX_INNER_PRODUCT` 
scoring that uses @jmazanec15 suggestion?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] uschindler merged pull request #12294: Implement MMapDirectory with Java 21 Project Panama Preview API

2023-06-12 Thread via GitHub


uschindler merged PR #12294:
URL: https://github.com/apache/lucene/pull/12294


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] uschindler commented on pull request #12294: Implement MMapDirectory with Java 21 Project Panama Preview API

2023-06-12 Thread via GitHub


uschindler commented on PR #12294:
URL: https://github.com/apache/lucene/pull/12294#issuecomment-1587934813

   Hi @ChrisHegarty, i think we can start with a JDK 21 branch for Panama 
Vectors, the directories are there. Of course, you need to regenerate the 
apijars with vector classes (that's not included here).
   Backport to 9.x was done, so we can release this with Lucene 9.7.
   
   I will do a crosscheck shortly before release to verify that apijar is 
uptodate.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] uschindler commented on pull request #12294: Implement MMapDirectory with Java 21 Project Panama Preview API

2023-06-12 Thread via GitHub


uschindler commented on PR #12294:
URL: https://github.com/apache/lucene/pull/12294#issuecomment-1587947849

   I also removed a relic of the old filename of the apijar in the 9.x folder. 
It was not deleted while cherry-picking when we integrated vector API.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] ChrisHegarty commented on pull request #12294: Implement MMapDirectory with Java 21 Project Panama Preview API

2023-06-12 Thread via GitHub


ChrisHegarty commented on PR #12294:
URL: https://github.com/apache/lucene/pull/12294#issuecomment-1587973337

   > Hi @ChrisHegarty, i think we can start with a JDK 21 branch for Panama 
Vectors, the directories are there. Of course, you need to regenerate the 
apijars with vector classes (that's not included here).
   
   I'll start on this, and hopefully put up a PR by tomorrow.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] ChrisHegarty opened a new pull request, #12363: Implement VectorUtilProvider with Java 21 Project Pamana Vector API

2023-06-12 Thread via GitHub


ChrisHegarty opened a new pull request, #12363:
URL: https://github.com/apache/lucene/pull/12363

   Port of the Java 20 version of this code to Java 21.
   
   * cut'n'paste VectorUtilPanamaProvider - there are opportunities to 
eventually remove some workarounds, but this is ok for now
   * Updated jdk21.apijar by `./gradlew :lucene:core:regenerate 
-Porg.gradle.java.installations.paths=/Users/chegar/binaries/jdk-21.jdk/Contents/Home/
 ` 
   
   
   ```
   $ /Users/chegar/binaries/jdk-21.jdk/Contents/Home/bin/java -version
   openjdk version "21-ea" 2023-09-19
   OpenJDK Runtime Environment (build 21-ea+26-2328)
   OpenJDK 64-Bit Server VM (build 21-ea+26-2328, mixed mode, sharing)
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] ChrisHegarty commented on pull request #12363: Implement VectorUtilProvider with Java 21 Project Pamana Vector API

2023-06-12 Thread via GitHub


ChrisHegarty commented on PR #12363:
URL: https://github.com/apache/lucene/pull/12363#issuecomment-1588061449

   I verified this locally by running the tests :
   
   ```
   JENKINS_XX=true ./gradlew :lucene:core:test --tests 
"org.apache.lucene.util.TestVectorUtil**" -Pvalidation.git.failOnModified=false 
--info
   ```
   
   The JDK 21 version is picked up.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] ChrisHegarty commented on a diff in pull request #12363: Implement VectorUtilProvider with Java 21 Project Pamana Vector API

2023-06-12 Thread via GitHub


ChrisHegarty commented on code in PR #12363:
URL: https://github.com/apache/lucene/pull/12363#discussion_r1227226190


##
lucene/CHANGES.txt:
##
@@ -139,8 +139,8 @@ New Features
 * GITHUB#12257: Create OnHeapHnswGraphSearcher to let OnHeapHnswGraph to be 
searched in a thread-safety manner. (Patrick Zhai)
 
 * GITHUB#12302, GITHUB#12311: Add vectorized implementations of 
VectorUtil.dotProduct(),
-  squareDistance(), cosine() with Java 20 jdk.incubator.vector APIs. 
Applications started
-  with command line parameter "java --add-modules jdk.incubator.vector" on 
exactly Java 20
+  squareDistance(), cosine() with Java 20 or 21 jdk.incubator.vector APIs. 
Applications started
+  with command line parameter "java --add-modules jdk.incubator.vector" on 
exactly Java 20 or 21
   will automatically use the new vectorized implementations if running on a 
supported platform

Review Comment:
   Since this was introduced and updated in the same minor release we can 
probably just update the wording here to cover 21 also (rather than a separate 
change log entry).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] uschindler commented on pull request #12363: Implement VectorUtilProvider with Java 21 Project Pamana Vector API

2023-06-12 Thread via GitHub


uschindler commented on PR #12363:
URL: https://github.com/apache/lucene/pull/12363#issuecomment-1588081418

   That was fast. πŸ˜πŸ‡
   
   I will check tomorrow morning but this looks great. @rmuir  wanted to run 
the benchmark with 21, too.
   
   Did you use latest openjdk 21-ea build to extract?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] uschindler commented on a diff in pull request #12363: Implement VectorUtilProvider with Java 21 Project Pamana Vector API

2023-06-12 Thread via GitHub


uschindler commented on code in PR #12363:
URL: https://github.com/apache/lucene/pull/12363#discussion_r1227231734


##
lucene/core/src/java21/org/apache/lucene/util/VectorUtilPanamaProvider.java:
##
@@ -0,0 +1,493 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.util;
+
+import java.security.AccessController;
+import java.security.PrivilegedAction;
+import java.util.logging.Logger;
+import jdk.incubator.vector.ByteVector;
+import jdk.incubator.vector.FloatVector;
+import jdk.incubator.vector.IntVector;
+import jdk.incubator.vector.ShortVector;
+import jdk.incubator.vector.Vector;
+import jdk.incubator.vector.VectorOperators;
+import jdk.incubator.vector.VectorShape;
+import jdk.incubator.vector.VectorSpecies;
+
+/** A VectorUtil provider implementation that leverages the Panama Vector API. 
*/
+final class VectorUtilPanamaProvider implements VectorUtilProvider {
+
+  private static final int INT_SPECIES_PREF_BIT_SIZE = 
IntVector.SPECIES_PREFERRED.vectorBitSize();
+
+  private static final VectorSpecies PREF_FLOAT_SPECIES = 
FloatVector.SPECIES_PREFERRED;
+  private static final VectorSpecies PREF_BYTE_SPECIES;
+  private static final VectorSpecies PREF_SHORT_SPECIES;
+
+  /**
+   * x86 and less than 256-bit vectors.
+   *
+   * it could be that it has only AVX1 and integer vectors are fast. it 
could also be that it has
+   * no AVX and integer vectors are extremely slow. don't use integer vectors 
to avoid landmines.
+   */
+  private static final boolean IS_AMD64_WITHOUT_AVX2 =
+  Constants.OS_ARCH.equals("amd64") && INT_SPECIES_PREF_BIT_SIZE < 256;
+
+  static {
+if (INT_SPECIES_PREF_BIT_SIZE >= 256) {
+  PREF_BYTE_SPECIES =
+  ByteVector.SPECIES_MAX.withShape(
+  
VectorShape.forBitSize(IntVector.SPECIES_PREFERRED.vectorBitSize() >> 2));
+  PREF_SHORT_SPECIES =
+  ShortVector.SPECIES_MAX.withShape(
+  
VectorShape.forBitSize(IntVector.SPECIES_PREFERRED.vectorBitSize() >> 1));
+} else {
+  PREF_BYTE_SPECIES = null;
+  PREF_SHORT_SPECIES = null;
+}
+  }
+
+  // Extracted to a method to be able to apply the SuppressForbidden annotation
+  @SuppressWarnings("removal")
+  @SuppressForbidden(reason = "security manager")
+  private static  T doPrivileged(PrivilegedAction action) {
+return AccessController.doPrivileged(action);
+  }
+
+  VectorUtilPanamaProvider() {
+if (INT_SPECIES_PREF_BIT_SIZE < 128) {
+  throw new UnsupportedOperationException(
+  "Vector bit size is less than 128: " + INT_SPECIES_PREF_BIT_SIZE);
+}
+
+// hack to work around for JDK-8309727:

Review Comment:
   We can remove this check if you are confident that your change makes it into 
the release.
   On the other hand this check does not hurt.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] uschindler commented on a diff in pull request #12363: Implement VectorUtilProvider with Java 21 Project Pamana Vector API

2023-06-12 Thread via GitHub


uschindler commented on code in PR #12363:
URL: https://github.com/apache/lucene/pull/12363#discussion_r1227232392


##
lucene/core/src/java21/org/apache/lucene/util/VectorUtilPanamaProvider.java:
##
@@ -0,0 +1,493 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.util;
+
+import java.security.AccessController;
+import java.security.PrivilegedAction;
+import java.util.logging.Logger;
+import jdk.incubator.vector.ByteVector;
+import jdk.incubator.vector.FloatVector;
+import jdk.incubator.vector.IntVector;
+import jdk.incubator.vector.ShortVector;
+import jdk.incubator.vector.Vector;
+import jdk.incubator.vector.VectorOperators;
+import jdk.incubator.vector.VectorShape;
+import jdk.incubator.vector.VectorSpecies;
+
+/** A VectorUtil provider implementation that leverages the Panama Vector API. 
*/
+final class VectorUtilPanamaProvider implements VectorUtilProvider {
+
+  private static final int INT_SPECIES_PREF_BIT_SIZE = 
IntVector.SPECIES_PREFERRED.vectorBitSize();
+
+  private static final VectorSpecies PREF_FLOAT_SPECIES = 
FloatVector.SPECIES_PREFERRED;
+  private static final VectorSpecies PREF_BYTE_SPECIES;
+  private static final VectorSpecies PREF_SHORT_SPECIES;
+
+  /**
+   * x86 and less than 256-bit vectors.
+   *
+   * it could be that it has only AVX1 and integer vectors are fast. it 
could also be that it has
+   * no AVX and integer vectors are extremely slow. don't use integer vectors 
to avoid landmines.
+   */
+  private static final boolean IS_AMD64_WITHOUT_AVX2 =
+  Constants.OS_ARCH.equals("amd64") && INT_SPECIES_PREF_BIT_SIZE < 256;
+
+  static {
+if (INT_SPECIES_PREF_BIT_SIZE >= 256) {
+  PREF_BYTE_SPECIES =
+  ByteVector.SPECIES_MAX.withShape(
+  
VectorShape.forBitSize(IntVector.SPECIES_PREFERRED.vectorBitSize() >> 2));
+  PREF_SHORT_SPECIES =
+  ShortVector.SPECIES_MAX.withShape(
+  
VectorShape.forBitSize(IntVector.SPECIES_PREFERRED.vectorBitSize() >> 1));
+} else {
+  PREF_BYTE_SPECIES = null;
+  PREF_SHORT_SPECIES = null;
+}
+  }
+
+  // Extracted to a method to be able to apply the SuppressForbidden annotation
+  @SuppressWarnings("removal")
+  @SuppressForbidden(reason = "security manager")
+  private static  T doPrivileged(PrivilegedAction action) {
+return AccessController.doPrivileged(action);
+  }
+
+  VectorUtilPanamaProvider() {
+if (INT_SPECIES_PREF_BIT_SIZE < 128) {
+  throw new UnsupportedOperationException(
+  "Vector bit size is less than 128: " + INT_SPECIES_PREF_BIT_SIZE);
+}
+
+// hack to work around for JDK-8309727:

Review Comment:
   If we remove, the doPrivileged forbidden wrapper can go away, too.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] uschindler commented on a diff in pull request #12363: Implement VectorUtilProvider with Java 21 Project Pamana Vector API

2023-06-12 Thread via GitHub


uschindler commented on code in PR #12363:
URL: https://github.com/apache/lucene/pull/12363#discussion_r1227235078


##
lucene/CHANGES.txt:
##
@@ -139,8 +139,8 @@ New Features
 * GITHUB#12257: Create OnHeapHnswGraphSearcher to let OnHeapHnswGraph to be 
searched in a thread-safety manner. (Patrick Zhai)
 
 * GITHUB#12302, GITHUB#12311: Add vectorized implementations of 
VectorUtil.dotProduct(),

Review Comment:
   Add this PR number here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] uschindler commented on pull request #12363: Implement VectorUtilProvider with Java 21 Project Pamana Vector API

2023-06-12 Thread via GitHub


uschindler commented on PR #12363:
URL: https://github.com/apache/lucene/pull/12363#issuecomment-1588094332

   > That was fast. πŸ˜πŸ‡
   > 
   > I will check tomorrow morning but this looks great. @rmuir wanted to run 
the benchmark with 21, too.
   > 
   > Did you use latest openjdk 21-ea build to extract?
   
   Oh you pasted it. Same like mine.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] ChrisHegarty commented on a diff in pull request #12363: Implement VectorUtilProvider with Java 21 Project Pamana Vector API

2023-06-12 Thread via GitHub


ChrisHegarty commented on code in PR #12363:
URL: https://github.com/apache/lucene/pull/12363#discussion_r1227243997


##
lucene/core/src/java21/org/apache/lucene/util/VectorUtilPanamaProvider.java:
##
@@ -0,0 +1,493 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.util;
+
+import java.security.AccessController;
+import java.security.PrivilegedAction;
+import java.util.logging.Logger;
+import jdk.incubator.vector.ByteVector;
+import jdk.incubator.vector.FloatVector;
+import jdk.incubator.vector.IntVector;
+import jdk.incubator.vector.ShortVector;
+import jdk.incubator.vector.Vector;
+import jdk.incubator.vector.VectorOperators;
+import jdk.incubator.vector.VectorShape;
+import jdk.incubator.vector.VectorSpecies;
+
+/** A VectorUtil provider implementation that leverages the Panama Vector API. 
*/
+final class VectorUtilPanamaProvider implements VectorUtilProvider {
+
+  private static final int INT_SPECIES_PREF_BIT_SIZE = 
IntVector.SPECIES_PREFERRED.vectorBitSize();
+
+  private static final VectorSpecies PREF_FLOAT_SPECIES = 
FloatVector.SPECIES_PREFERRED;
+  private static final VectorSpecies PREF_BYTE_SPECIES;
+  private static final VectorSpecies PREF_SHORT_SPECIES;
+
+  /**
+   * x86 and less than 256-bit vectors.
+   *
+   * it could be that it has only AVX1 and integer vectors are fast. it 
could also be that it has
+   * no AVX and integer vectors are extremely slow. don't use integer vectors 
to avoid landmines.
+   */
+  private static final boolean IS_AMD64_WITHOUT_AVX2 =
+  Constants.OS_ARCH.equals("amd64") && INT_SPECIES_PREF_BIT_SIZE < 256;
+
+  static {
+if (INT_SPECIES_PREF_BIT_SIZE >= 256) {
+  PREF_BYTE_SPECIES =
+  ByteVector.SPECIES_MAX.withShape(
+  
VectorShape.forBitSize(IntVector.SPECIES_PREFERRED.vectorBitSize() >> 2));
+  PREF_SHORT_SPECIES =
+  ShortVector.SPECIES_MAX.withShape(
+  
VectorShape.forBitSize(IntVector.SPECIES_PREFERRED.vectorBitSize() >> 1));
+} else {
+  PREF_BYTE_SPECIES = null;
+  PREF_SHORT_SPECIES = null;
+}
+  }
+
+  // Extracted to a method to be able to apply the SuppressForbidden annotation
+  @SuppressWarnings("removal")
+  @SuppressForbidden(reason = "security manager")
+  private static  T doPrivileged(PrivilegedAction action) {
+return AccessController.doPrivileged(action);
+  }
+
+  VectorUtilPanamaProvider() {
+if (INT_SPECIES_PREF_BIT_SIZE < 128) {
+  throw new UnsupportedOperationException(
+  "Vector bit size is less than 128: " + INT_SPECIES_PREF_BIT_SIZE);
+}
+
+// hack to work around for JDK-8309727:

Review Comment:
   The JDK change was merged, it’ll be in the next jdk 21 es build.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] tflobbe commented on pull request #12360: Update comment about IndexOptions ordinals

2023-06-12 Thread via GitHub


tflobbe commented on PR #12360:
URL: https://github.com/apache/lucene/pull/12360#issuecomment-1588112203

   But we do things like:
   > @Override
 public boolean hasFreqs() {
   return 
fieldInfo.getIndexOptions().compareTo(IndexOptions.DOCS_AND_FREQS) >= 0;
 }
   
   So we need to keep those ordinals in the right order, right?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] uschindler commented on a diff in pull request #12363: Implement VectorUtilProvider with Java 21 Project Pamana Vector API

2023-06-12 Thread via GitHub


uschindler commented on code in PR #12363:
URL: https://github.com/apache/lucene/pull/12363#discussion_r1227252561


##
lucene/core/src/java21/org/apache/lucene/util/VectorUtilPanamaProvider.java:
##
@@ -0,0 +1,493 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.util;
+
+import java.security.AccessController;
+import java.security.PrivilegedAction;
+import java.util.logging.Logger;
+import jdk.incubator.vector.ByteVector;
+import jdk.incubator.vector.FloatVector;
+import jdk.incubator.vector.IntVector;
+import jdk.incubator.vector.ShortVector;
+import jdk.incubator.vector.Vector;
+import jdk.incubator.vector.VectorOperators;
+import jdk.incubator.vector.VectorShape;
+import jdk.incubator.vector.VectorSpecies;
+
+/** A VectorUtil provider implementation that leverages the Panama Vector API. 
*/
+final class VectorUtilPanamaProvider implements VectorUtilProvider {
+
+  private static final int INT_SPECIES_PREF_BIT_SIZE = 
IntVector.SPECIES_PREFERRED.vectorBitSize();
+
+  private static final VectorSpecies PREF_FLOAT_SPECIES = 
FloatVector.SPECIES_PREFERRED;
+  private static final VectorSpecies PREF_BYTE_SPECIES;
+  private static final VectorSpecies PREF_SHORT_SPECIES;
+
+  /**
+   * x86 and less than 256-bit vectors.
+   *
+   * it could be that it has only AVX1 and integer vectors are fast. it 
could also be that it has
+   * no AVX and integer vectors are extremely slow. don't use integer vectors 
to avoid landmines.
+   */
+  private static final boolean IS_AMD64_WITHOUT_AVX2 =
+  Constants.OS_ARCH.equals("amd64") && INT_SPECIES_PREF_BIT_SIZE < 256;
+
+  static {
+if (INT_SPECIES_PREF_BIT_SIZE >= 256) {
+  PREF_BYTE_SPECIES =
+  ByteVector.SPECIES_MAX.withShape(
+  
VectorShape.forBitSize(IntVector.SPECIES_PREFERRED.vectorBitSize() >> 2));
+  PREF_SHORT_SPECIES =
+  ShortVector.SPECIES_MAX.withShape(
+  
VectorShape.forBitSize(IntVector.SPECIES_PREFERRED.vectorBitSize() >> 1));
+} else {
+  PREF_BYTE_SPECIES = null;
+  PREF_SHORT_SPECIES = null;
+}
+  }
+
+  // Extracted to a method to be able to apply the SuppressForbidden annotation
+  @SuppressWarnings("removal")
+  @SuppressForbidden(reason = "security manager")
+  private static  T doPrivileged(PrivilegedAction action) {
+return AccessController.doPrivileged(action);
+  }
+
+  VectorUtilPanamaProvider() {
+if (INT_SPECIES_PREF_BIT_SIZE < 128) {
+  throw new UnsupportedOperationException(
+  "Vector bit size is less than 128: " + INT_SPECIES_PREF_BIT_SIZE);
+}
+
+// hack to work around for JDK-8309727:

Review Comment:
   Great, so we can remove it for 21. It is not urgent!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] uschindler commented on pull request #12363: Implement VectorUtilProvider with Java 21 Project Pamana Vector API

2023-06-12 Thread via GitHub


uschindler commented on PR #12363:
URL: https://github.com/apache/lucene/pull/12363#issuecomment-1588134825

   I tried it out: I reverted the APIJAR changes and only left in following 
changes:
   - VectorUtilsProvider.java to enable of 21
   - `vectorIncubatorJavaVersions = [ JavaVersion.VERSION_20, 
JavaVersion.VERSION_21 ] as Set`
   
   Everything else was reverted and the tests worked fine. So to me it looks 
like we can spare the separate implementation as it is 100% identical. The hack 
does not hurt.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] ChrisHegarty commented on pull request #12363: Implement VectorUtilProvider with Java 21 Project Pamana Vector API

2023-06-12 Thread via GitHub


ChrisHegarty commented on PR #12363:
URL: https://github.com/apache/lucene/pull/12363#issuecomment-1588172519

   > I tried it out: I reverted the APIJAR changes and only left in following 
changes:
   > 
   > * VectorUtilsProvider.java to enable of 21
   > * `vectorIncubatorJavaVersions = [ JavaVersion.VERSION_20, 
JavaVersion.VERSION_21 ] as Set`
   > 
   > Everything else was reverted and the tests worked fine. So to me it looks 
like we can spare the separate implementation as it is 100% identical. The hack 
does not hurt.
   
   That’s great - I had a similar thought. πŸ‘


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] uschindler commented on pull request #12363: Implement VectorUtilProvider with Java 21 Project Pamana Vector API

2023-06-12 Thread via GitHub


uschindler commented on PR #12363:
URL: https://github.com/apache/lucene/pull/12363#issuecomment-1588191838

   Maybe just leave a readme file in the folder of the Java file stating that 
the impl is identical to Java 20.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] uschindler commented on pull request #12363: Implement VectorUtilProvider with Java 21 Project Pamana Vector API

2023-06-12 Thread via GitHub


uschindler commented on PR #12363:
URL: https://github.com/apache/lucene/pull/12363#issuecomment-1588209511

   I checked the commits: 
https://github.com/openjdk/jdk21/commits/master/src/jdk.incubator.vector/share/classes
   
   There were some changes, but nothing that affects us. It is mostly addition 
of VectorMask.XOR and improvementa in VectorShuffle. This aligns with the JEP: 
https://openjdk.org/jeps/448
   
   And the Turkish locale bug is in the provider lookup.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] gf2121 merged pull request #12324: Speed up IndexedDISI Sparse #AdvanceExactWithinBlock for tiny step advance

2023-06-12 Thread via GitHub


gf2121 merged PR #12324:
URL: https://github.com/apache/lucene/pull/12324


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] gf2121 opened a new pull request, #12364: Speed up IndexedDISI Sparse #AdvanceExactWithinBlock for tiny step advance (Backport 9x)

2023-06-12 Thread via GitHub


gf2121 opened a new pull request, #12364:
URL: https://github.com/apache/lucene/pull/12364

   backport of https://github.com/apache/lucene/pull/12324


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org