Re: [PR] Add new int8 scalar quantization to HNSW codec [lucene]

via GitHub Wed, 18 Oct 2023 09:26:17 -0700


benwtrent commented on code in PR #12582:
URL: https://github.com/apache/lucene/pull/12582#discussion_r1364177301



##########
lucene/core/src/java/org/apache/lucene/util/ScalarQuantizer.java:
##########
@@ -0,0 +1,317 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.util;
+
+import static org.apache.lucene.search.DocIdSetIterator.NO_MORE_DOCS;
+
+import java.io.IOException;
+import java.util.Arrays;
+import java.util.Random;
+import java.util.stream.IntStream;
+import org.apache.lucene.index.FloatVectorValues;
+import org.apache.lucene.index.VectorSimilarityFunction;
+
+/**
+ * Will scalar quantize float vectors into `int8` byte values. This is a lossy transformation.
+ * Scalar quantization works by first calculating the quantiles of the float vector values. The
+ * quantiles are calculated using the configured quantile/confidence interval. The [minQuantile,
+ * maxQuantile] are then used to scale the values into the range [0, 127] and bucketed into the
+ * nearest byte values.
+ *
+ * <h2>How Scalar Quantization Works</h2>
+ *
+ * <p>The basic mathematical equations behind this are fairly straightforward. Given a float vector
+ * `v` and a quantile `q` we can calculate the quantiles of the vector values [minQuantile,
+ * maxQuantile].
+ *
+ * <pre class="prettyprint">
+ *   byte = (float - minQuantile) * 127/(maxQuantile - minQuantile)
+ *   float = (maxQuantile - minQuantile)/127 * byte + minQuantile
+ * </pre>
+ *
+ * <p>This then means to multiply two float values together (e.g. dot_product) we can do the
+ * following:
+ *
+ * <pre class="prettyprint">
+ *   float1 * float2 ~= (byte1 * (maxQuantile - minQuantile)/127 + minQuantile) * (byte2 * (maxQuantile - minQuantile)/127 + minQuantile)
+ *   float1 * float2 ~= (byte1 * byte2 * (maxQuantile - minQuantile)^2)/(127^2) + (byte1 * minQuantile * (maxQuantile - minQuantile)/127) + (byte2 * minQuantile * (maxQuantile - minQuantile)/127) + minQuantile^2
+ *   let alpha = (maxQuantile - minQuantile)/127
+ *   float1 * float2 ~= (byte1 * byte2 * alpha^2) + (byte1 * minQuantile * alpha) + (byte2 * minQuantile * alpha) + minQuantile^2
+ * </pre>
+ *
+ * <p>The expansion for square distance is much simpler:
+ *
+ * <pre class="prettyprint">
+ *  square_distance = (float1 - float2)^2
+ *  (float1 - float2)^2 ~= (byte1 * alpha + minQuantile - byte2 * alpha - minQuantile)^2
+ *  = (alpha*byte1 + minQuantile)^2 + (alpha*byte2 + minQuantile)^2 - 2*(alpha*byte1 + minQuantile)(alpha*byte2 + minQuantile)
+ *  this can be simplified to:
+ *  = alpha^2 (byte1 - byte2)^2
+ * </pre>
+ */
+public class ScalarQuantizer {
+
+  public static final int SCALAR_QUANTIZATION_SAMPLE_SIZE = 25_000;

Review Comment:
   This value is empirical and fairly conservative (it errs on the side of
getting good quantiles). It could probably be made configurable, but I went for
simplicity first, with sane defaults.
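
For reference, the equations in the Javadoc above can be sketched roughly as follows. This is only an illustration, not the code in the PR: the class name `ScalarQuantizationSketch`, its method names, and the clamping of values that fall outside [minQuantile, maxQuantile] are all assumptions made for the example.

```java
/** Illustrative sketch only; not the ScalarQuantizer class from the PR. */
final class ScalarQuantizationSketch {

  /** byte = (float - minQuantile) * 127 / (maxQuantile - minQuantile). */
  static byte[] quantize(float[] v, float minQuantile, float maxQuantile) {
    float scale = 127f / (maxQuantile - minQuantile);
    byte[] out = new byte[v.length];
    for (int i = 0; i < v.length; i++) {
      // Assumption: outliers beyond the quantile range are clamped before scaling.
      float clamped = Math.max(minQuantile, Math.min(maxQuantile, v[i]));
      out[i] = (byte) Math.round((clamped - minQuantile) * scale);
    }
    return out;
  }

  /** float ~= alpha * byte + minQuantile, with alpha = (maxQuantile - minQuantile) / 127. */
  static float[] dequantize(byte[] q, float minQuantile, float maxQuantile) {
    float alpha = (maxQuantile - minQuantile) / 127f;
    float[] out = new float[q.length];
    for (int i = 0; i < q.length; i++) {
      out[i] = alpha * q[i] + minQuantile;
    }
    return out;
  }

  /** Dot product computed from bytes using the expanded form, summed over all dimensions. */
  static float dotProduct(byte[] a, byte[] b, float minQuantile, float maxQuantile) {
    float alpha = (maxQuantile - minQuantile) / 127f;
    int bytesDot = 0;
    int sumA = 0;
    int sumB = 0;
    for (int i = 0; i < a.length; i++) {
      bytesDot += a[i] * b[i];
      sumA += a[i];
      sumB += b[i];
    }
    // alpha^2 * sum(a*b) + alpha * min * (sum(a) + sum(b)) + n * min^2
    return alpha * alpha * bytesDot
        + alpha * minQuantile * (sumA + sumB)
        + a.length * minQuantile * minQuantile;
  }

  /** Square distance from bytes: alpha^2 * sum((a - b)^2); the minQuantile terms cancel. */
  static float squareDistance(byte[] a, byte[] b, float minQuantile, float maxQuantile) {
    float alpha = (maxQuantile - minQuantile) / 127f;
    int sum = 0;
    for (int i = 0; i < a.length; i++) {
      int d = a[i] - b[i];
      sum += d * d;
    }
    return alpha * alpha * sum;
  }
}
```

In this sketch the inner loops only touch integers, and the float corrections (`alpha`, `minQuantile`) are applied once per vector pair. Whether the PR structures it exactly this way is not shown in the hunk above.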


