[GitHub] [lucene] iverase merged pull request #97: LUCENE-9907: Move PackedInts#getReaderNoHeader() to backwards codec
iverase merged pull request #97: URL: https://github.com/apache/lucene/pull/97

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9907) Remove dependency on PackedInts#getReader() in all current codecs
[ https://issues.apache.org/jira/browse/LUCENE-9907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17325545#comment-17325545 ]

ASF subversion and git services commented on LUCENE-9907:

Commit e0436872c4861f8a3dc3b4e5a52944c3be7ddb2f in lucene's branch refs/heads/main from Ignacio Vera
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=e043687 ]

LUCENE-9907: Move PackedInts#getReaderNoHeader() to backwards codec

> Remove dependency on PackedInts#getReader() in all current codecs
>
> Key: LUCENE-9907
> URL: https://issues.apache.org/jira/browse/LUCENE-9907
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Ignacio Vera
> Priority: Major
> Time Spent: 4h 50m
> Remaining Estimate: 0h
>
> PackedInts#getDirectWriter/Reader are legacy; the way to go now is using
> DirectReader and DirectWriter. With LUCENE-9705, we should be able to
> remove them from the current codecs.
> This will also help with moving the Directory API to little endian.

--
This message was sent by Atlassian Jira (v8.3.4#803005)

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-9907) Remove dependency on PackedInts#getReader() in all current codecs
[ https://issues.apache.org/jira/browse/LUCENE-9907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ignacio Vera resolved LUCENE-9907.
Fix Version/s: main (9.0)
Assignee: Ignacio Vera
Resolution: Fixed

> Remove dependency on PackedInts#getReader() in all current codecs
>
> Key: LUCENE-9907
> URL: https://issues.apache.org/jira/browse/LUCENE-9907
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Ignacio Vera
> Assignee: Ignacio Vera
> Priority: Major
> Fix For: main (9.0)
> Time Spent: 4h 50m
> Remaining Estimate: 0h
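For context on the DirectWriter/DirectReader approach the issue favors: both rely on fixed-bit-width packing with no per-stream header, so any value is addressable by index arithmetic alone. The following self-contained sketch (plain Java, not the actual Lucene classes) illustrates that idea under the stated assumption of a constant bitsPerValue.

```java
// Sketch of fixed-bit-width packing in the spirit of DirectWriter/DirectReader
// (NOT the real Lucene API): every value occupies exactly bitsPerValue bits,
// so value i is located by offset arithmetic with no header to consult.
class PackedSketch {
  private final long[] words;
  private final int bitsPerValue;

  PackedSketch(long[] values, int bitsPerValue) {
    this.bitsPerValue = bitsPerValue;
    this.words = new long[(values.length * bitsPerValue + 63) / 64];
    for (int i = 0; i < values.length; i++) {
      long bitPos = (long) i * bitsPerValue;
      int word = (int) (bitPos >>> 6);
      int shift = (int) (bitPos & 63);
      words[word] |= values[i] << shift;
      if (shift + bitsPerValue > 64) { // value straddles a 64-bit word boundary
        words[word + 1] |= values[i] >>> (64 - shift);
      }
    }
  }

  /** random access read: pure index arithmetic, no header to skip */
  long get(int i) {
    long bitPos = (long) i * bitsPerValue;
    int word = (int) (bitPos >>> 6);
    int shift = (int) (bitPos & 63);
    long mask = bitsPerValue == 64 ? -1L : (1L << bitsPerValue) - 1;
    long v = words[word] >>> shift;
    if (shift + bitsPerValue > 64) {
      v |= words[word + 1] << (64 - shift);
    }
    return v & mask;
  }
}
```

The real DirectReader additionally restricts bitsPerValue to a small set of widths so reads compile to simple shifts and masks; the sketch above only shows the addressing scheme.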
[GitHub] [lucene] nitirajrathore commented on a change in pull request #83: LUCENE-9798 : Fix looping bug and made Full Knn calculation parallelizable
nitirajrathore commented on a change in pull request #83: URL: https://github.com/apache/lucene/pull/83#discussion_r616403759

## File path: lucene/test-framework/src/java/org/apache/lucene/util/FullKnn.java
## @@ -0,0 +1,254 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.util;
+
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.nio.ByteOrder;
+import java.nio.FloatBuffer;
+import java.nio.channels.FileChannel;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Locale;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+import org.apache.lucene.index.VectorValues;
+
+/**
+ * A utility class to calculate the Full KNN / Exact KNN over a set of query vectors and document
+ * vectors.
+ */
+public class FullKnn {
+
+  private final int dim;
+  private final int topK;
+  private final VectorValues.SearchStrategy searchStrategy;
+  private final boolean quiet;
+
+  public FullKnn(int dim, int topK, VectorValues.SearchStrategy searchStrategy, boolean quiet) {
+    this.dim = dim;
+    this.topK = topK;
+    this.searchStrategy = searchStrategy;
+    this.quiet = quiet;
+  }
+
+  /** internal object to track KNN calculation for one query */
+  private static class KnnJob {
+    public int currDocIndex;
+    float[] queryVector;
+    float[] currDocVector;
+    int queryIndex;
+    private LongHeap queue;
+    FloatBuffer docVectors;
+    VectorValues.SearchStrategy searchStrategy;
+
+    public KnnJob(
+        int queryIndex, float[] queryVector, int topK, VectorValues.SearchStrategy searchStrategy) {
+      this.queryIndex = queryIndex;
+      this.queryVector = queryVector;
+      this.currDocVector = new float[queryVector.length];
+      if (searchStrategy.reversed) {
+        queue = LongHeap.create(LongHeap.Order.MAX, topK);
+      } else {
+        queue = LongHeap.create(LongHeap.Order.MIN, topK);
+      }
+      this.searchStrategy = searchStrategy;
+    }
+
+    public void execute() {
+      while (this.docVectors.hasRemaining()) {
+        this.docVectors.get(this.currDocVector);

Review comment: Done.
[GitHub] [lucene] nitirajrathore commented on a change in pull request #83: LUCENE-9798 : Fix looping bug and made Full Knn calculation parallelizable
nitirajrathore commented on a change in pull request #83: URL: https://github.com/apache/lucene/pull/83#discussion_r616404025

## File path: lucene/test-framework/src/java/org/apache/lucene/util/FullKnn.java
## @@ -0,0 +1,254 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.util;
+
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.nio.ByteOrder;
+import java.nio.FloatBuffer;
+import java.nio.channels.FileChannel;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Locale;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+import org.apache.lucene.index.VectorValues;
+
+/**
+ * A utility class to calculate the Full KNN / Exact KNN over a set of query vectors and document
+ * vectors.
+ */
+public class FullKnn {
+
+  private final int dim;
+  private final int topK;
+  private final VectorValues.SearchStrategy searchStrategy;
+  private final boolean quiet;
+
+  public FullKnn(int dim, int topK, VectorValues.SearchStrategy searchStrategy, boolean quiet) {
+    this.dim = dim;
+    this.topK = topK;
+    this.searchStrategy = searchStrategy;
+    this.quiet = quiet;
+  }
+
+  /** internal object to track KNN calculation for one query */
+  private static class KnnJob {
+    public int currDocIndex;
+    float[] queryVector;
+    float[] currDocVector;
+    int queryIndex;
+    private LongHeap queue;
+    FloatBuffer docVectors;
+    VectorValues.SearchStrategy searchStrategy;
+
+    public KnnJob(
+        int queryIndex, float[] queryVector, int topK, VectorValues.SearchStrategy searchStrategy) {
+      this.queryIndex = queryIndex;
+      this.queryVector = queryVector;
+      this.currDocVector = new float[queryVector.length];
+      if (searchStrategy.reversed) {
+        queue = LongHeap.create(LongHeap.Order.MAX, topK);
+      } else {
+        queue = LongHeap.create(LongHeap.Order.MIN, topK);
+      }
+      this.searchStrategy = searchStrategy;
+    }
+
+    public void execute() {
+      while (this.docVectors.hasRemaining()) {
+        this.docVectors.get(this.currDocVector);
+        float d = this.searchStrategy.compare(this.queryVector, this.currDocVector);
+        this.queue.insertWithOverflow(encodeNodeIdAndScore(this.currDocIndex, d));
+        this.currDocIndex++;
+      }
+    }
+  }
+
+  /**
+   * computes the exact KNN match for each query vector in queryPath for all the document vectors in
+   * docPath
+   *
+   * @param docPath : path to the file containing the float 32 document vectors in bytes with
+   *     little-endian byte order
+   * @param queryPath : path to the file containing the containing 32-bit floating point vectors in
+   *     little-endian byte order
+   * @param numThreads : create numThreads to parallelize work
+   * @return : returns an int 2D array ( int matches[][]) of size 'numIters x topK'. matches[i] is
+   *     an array containing the indexes of the topK most similar document vectors to the ith query
+   *     vector, and is sorted by similarity, with the most similar vector first. Similarity is
+   *     defined by the searchStrategy used to construct this FullKnn.
+   * @throws IllegalArgumentException : if topK is greater than number of documents in docPath file
+   *     IOException : In case of IO exception while reading files.
+   */
+  public int[][] computeNN(Path docPath, Path queryPath, int numThreads) throws IOException {
+    assert numThreads > 0;
+    final int numDocs = (int) (Files.size(docPath) / (dim * Float.BYTES));
+    final int numQueries = (int) (Files.size(docPath) / (dim * Float.BYTES));
+
+    if (!quiet) {
+      System.out.println(
+          "computing true nearest neighbors of "
+              + numQueries
+              + " target vectors using "
+              + numThreads
+              + " threads.");
+    }
+
+    try (FileChannel docInput = FileChannel.open(docPath);
+        FileChannel queryInput = FileChannel.open(queryPath)) {
+      return doFul
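The quoted diff accumulates results by pushing `encodeNodeIdAndScore(docIndex, score)` longs into a primitive `LongHeap`. A hedged sketch of what such an encoding could look like follows; this is a hypothetical helper modeled on Lucene's sortable-float-bits transform (`NumericUtils.floatToSortableInt`), and the PR's actual implementation may differ.

```java
// Hypothetical stand-in for the encodeNodeIdAndScore(...) helper referenced in
// the quoted diff: pack (score, nodeId) into one long so that comparing the
// longs compares scores first, letting a primitive long heap replace a heap of
// (id, score) objects. Ties on score fall back to nodeId order.
class NodeScoreCodec {
  // mirror of the sortable-float-bits idea: monotonic int ordering for floats
  static int floatToSortableInt(float f) {
    int bits = Float.floatToIntBits(f);
    return bits ^ ((bits >> 31) & 0x7fffffff); // flip payload bits for negatives
  }

  static long encode(int nodeId, float score) {
    // score in the high 32 bits (signed, so long ordering matches score
    // ordering), node id unsigned in the low 32 bits
    return (((long) floatToSortableInt(score)) << 32) | (nodeId & 0xffffffffL);
  }

  static int decodeNodeId(long encoded) {
    return (int) encoded; // low 32 bits
  }
}
```

With this shape, a min-heap of longs keeps the lowest-scoring candidate on top and the node id is recoverable from the heap entries without extra allocation.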
[GitHub] [lucene] nitirajrathore commented on a change in pull request #83: LUCENE-9798 : Fix looping bug and made Full Knn calculation parallelizable
nitirajrathore commented on a change in pull request #83: URL: https://github.com/apache/lucene/pull/83#discussion_r616404214

## File path: lucene/test-framework/src/java/org/apache/lucene/util/FullKnn.java
[quotes the same FullKnn.java diff hunk as the previous comment]
[GitHub] [lucene] iverase opened a new pull request #98: LUCENE-9047: Adapt big endian dependent code to work in little endian.
iverase opened a new pull request #98: URL: https://github.com/apache/lucene/pull/98

In preparation for changing the Directory API endianness, we need to adapt some parts of the code that won't work in a little-endian world.
[GitHub] [lucene] nitirajrathore commented on a change in pull request #83: LUCENE-9798 : Fix looping bug and made Full Knn calculation parallelizable
nitirajrathore commented on a change in pull request #83: URL: https://github.com/apache/lucene/pull/83#discussion_r616471687

## File path: lucene/test-framework/src/java/org/apache/lucene/util/FullKnn.java
[quotes the same FullKnn.java diff hunk as the previous comments]
[GitHub] [lucene] iverase merged pull request #98: LUCENE-9047: Adapt big endian dependent code to work in little endian.
iverase merged pull request #98: URL: https://github.com/apache/lucene/pull/98
[jira] [Commented] (LUCENE-9047) Directory APIs should be little endian
[ https://issues.apache.org/jira/browse/LUCENE-9047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17325616#comment-17325616 ]

ASF subversion and git services commented on LUCENE-9047:

Commit 5592d582b856c99df4839172b40733c18c6094e9 in lucene's branch refs/heads/main from Ignacio Vera
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=5592d58 ]

LUCENE-9047: Adapt big endian dependent code to work in little endian

> Directory APIs should be little endian
>
> Key: LUCENE-9047
> URL: https://issues.apache.org/jira/browse/LUCENE-9047
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Adrien Grand
> Priority: Blocker
> Fix For: main (9.0)
> Time Spent: 7.5h
> Remaining Estimate: 0h
>
> We started discussing this on LUCENE-9027. It's a shame that we need to keep
> reversing the order of bytes all the time because our APIs are big endian
> while the vast majority of architectures are little endian.
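The cost the issue describes is easy to see with plain JDK code (this is an illustration of the byte-order mismatch, not Lucene's Directory API): the same four bytes decode to different ints depending on the declared byte order, and a big-endian file format forces a byte swap on little-endian hardware.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Illustration of the mismatch discussed in LUCENE-9047: an int stored
// big-endian must be byte-swapped on a little-endian CPU, while a
// little-endian format can be read with a plain load.
class EndianDemo {
  static int readBigEndian(byte[] data) {
    return ByteBuffer.wrap(data).order(ByteOrder.BIG_ENDIAN).getInt();
  }

  static int readLittleEndian(byte[] data) {
    return ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN).getInt();
  }
}
```

Integer.reverseBytes converts between the two interpretations, which is exactly the per-value work a mismatched API keeps paying.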
[GitHub] [lucene] nitirajrathore commented on pull request #83: LUCENE-9798 : Fix looping bug when calculating full KNN results in KnnGraphTester
nitirajrathore commented on pull request #83: URL: https://github.com/apache/lucene/pull/83#issuecomment-823139325

> > Fixed the bug and also made the code to execute parallely, so as to take less time for large document vector files.
> please, these need to be 2 separate issues.

Sure @rmuir, I have reverted the changes for parallel execution from this PR. I will address that separately in a different PR and issue.

@msokolov: I will address issues related to the parallel execution code in a separate PR.
[GitHub] [lucene] janhoy commented on pull request #84: LUCENE-9929 NorwegianNormalizationFilter
janhoy commented on pull request #84: URL: https://github.com/apache/lucene/pull/84#issuecomment-823162649

Ready for a new review.
[GitHub] [lucene] neoremind commented on pull request #91: LUCENE-9932: Performance improvement for BKD index building
neoremind commented on pull request #91: URL: https://github.com/apache/lucene/pull/91#issuecomment-823172724

@jpountz Good advice! Before that, I was still struggling with where to propagate this config up to the index builder layer. I will give it a try; the first thing that comes to mind is a new `prepare` method that scans all doc IDs from i to j to check whether they are increasing. I will experiment with this: if the overhead is small enough, it is worthwhile to sort without doc IDs.

One more question: are there any places where doc IDs are not added in increasing order? I mean in the source code, not test cases.
[jira] [Updated] (LUCENE-9932) Performance improvement for BKD index building
[ https://issues.apache.org/jira/browse/LUCENE-9932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

neoremind updated LUCENE-9932:
Description:

In BKD index building, the input bytes must be sorted before calling the BKD writer API. The sorting method uses MSB radix sort, and the comparison takes both the bytes themselves and the doc ID; in real cases, doc IDs are usually monotonically increasing. This suggests a possible performance improvement, which I found while digging into a performance issue in our system.

Doc IDs are usually incremented by one when building the index in a thread-safe way. Under that assumption, the comparison can drop the doc ID and compare only the bytes. For this to work, MSB radix sort and its fallback sort must be *stable*, so that equal elements keep their insertion order, which keeps doc IDs monotonically increasing. Making MSB radix sort stable needs only a trivial update; making the fallback sort stable means using merge sort instead of quicksort. There should also be a switch to turn the stable option on or off.

To measure how much performance could be gained, I benchmarked only the time spent in the _MutablePointsReaderUtils.sort_ stage.

*Test environment:* MacBook Pro (Retina, 15-inch, Mid 2015), 2.2 GHz Intel Core i7, 16 GB 1600 MHz DDR3
*Java version:* java version "1.8.0_161", Java(TM) SE Runtime Environment (build 1.8.0_161-b12), Java HotSpot(TM) 64-Bit Server VM (build 25.161-b12, mixed mode)
*Testcase:* bytesPerDim = [1, 2, 3, 4, 8, 16, 32], dim = 1, doc num = 2,000,000; warm up 5 times, run 10 times to calculate the average time.

*Result:*
||bytesPerDim\scenario||disable sort doc id (PR branch)||enable sort doc id (master branch)||
|1|30989.594 us|1151149.9 us|
|2|313469.47 us|1115595.1 us|
|3|844617.8 us|1465465.1 us|
|4|1350946.8 us|1465465.1 us|
|8|1344814.6 us|1458115.5 us|
|16|1344516.6 us|1459849.6 us|
|32|1386847.8 us|1583097.5 us|

!benchmark_data.png|width=580,height=283!

The results show that, with doc-ID comparison disabled, sorting runs 1.73x to 37x faster when there are many duplicate bytes (bytesPerDim = 1, 2, or 3). When data cardinality is high (bytesPerDim >= 4, where the test generates scattered random bytes unlikely to be duplicated), performance does not regress and is still slightly better.

In conclusion, in the end-to-end process of building a BKD index, which relies on BKDWriter for some data types, performance can be improved by ignoring doc IDs when they are already monotonically increasing.

was: [earlier revision of the description omitted; it differed only in the concluding paragraph]
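The core claim in this issue — that a *stable* sort makes the doc-ID tiebreak redundant whenever doc IDs arrive in increasing order — can be checked with a small self-contained sketch. Plain Java is used here, with TimSort via `Arrays.sort` as the stable sort, rather than the stable MSB radix sort the issue proposes.

```java
import java.util.Arrays;
import java.util.Comparator;

// If doc IDs are assigned in increasing order, a STABLE sort that compares
// only the packed key keeps equal-key entries in insertion order - i.e. their
// doc IDs stay increasing - so the doc-ID component can be dropped from the
// comparator entirely.
class StableSortDemo {
  /** entries[i] = {key, docId}; sorts by key only, stably */
  static int[][] sortByKeyOnly(int[][] entries) {
    int[][] copy = Arrays.stream(entries).map(int[]::clone).toArray(int[][]::new);
    // Arrays.sort on object arrays is guaranteed stable (TimSort)
    Arrays.sort(copy, Comparator.comparingInt((int[] e) -> e[0]));
    return copy;
  }

  /** true if doc IDs are strictly increasing within every run of equal keys */
  static boolean docIdsIncreasingWithinEqualKeys(int[][] sorted) {
    for (int i = 1; i < sorted.length; i++) {
      if (sorted[i][0] == sorted[i - 1][0] && sorted[i][1] <= sorted[i - 1][1]) {
        return false;
      }
    }
    return true;
  }
}
```

With an unstable sort (e.g. quicksort as the fallback), the same experiment can interleave doc IDs within equal-key runs, which is why the issue replaces the fallback with merge sort.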
[GitHub] [lucene] pawel-bugalski-dynatrace opened a new pull request #99: LUCENE-9869 allow for configuring a custom cache purge scheduler in Monitor (aka Luwak)
pawel-bugalski-dynatrace opened a new pull request #99: URL: https://github.com/apache/lucene/pull/99

# Description
By default, org.apache.lucene.monitor.Monitor creates a new thread per instance to schedule its periodic cache-purge task. This is not always the desired behaviour: one could, for example, create a large number of Monitor instances in a single JVM to separate business domains, in which case it would be counterproductive to create a new thread for each Monitor instance. With the introduction of the PurgeScheduler interface, one can now implement a custom scheduling strategy.

# Tests
Used this new API in an external codebase to confirm its proper behaviour and usefulness.

# Checklist
- [x] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/lucene/HowToContribute) and my code conforms to the standards described there to the best of my ability.
- [x] I have created a Jira issue and added the issue ID to my pull request title.
- [x] I have given Lucene maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended)
- [x] I have developed this patch against the `main` branch.
- [x] I have run `./gradlew check`.
- [ ] I have added tests for my changes.
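The PR's idea — inverting control over the purge thread so many monitors can share one scheduler — can be sketched as follows. The `PurgeScheduler` interface shown here is hypothetical, inferred only from the PR description above; the merged API may look different.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical shape of the scheduling hook described in PR #99: instead of
// each Monitor spawning its own thread, the caller hands in a PurgeScheduler,
// so many Monitor-like instances can share one ScheduledExecutorService.
class PurgeSchedulerSketch {
  interface PurgeScheduler {
    void schedulePurge(Runnable purgeTask, long intervalMillis);
  }

  /** one shared pool serving any number of monitors */
  static PurgeScheduler shared(ScheduledExecutorService pool) {
    return (task, interval) ->
        pool.scheduleAtFixedRate(task, interval, interval, TimeUnit.MILLISECONDS);
  }
}
```

Because the hook is just a functional interface, tests can substitute a scheduler that runs the purge task synchronously, avoiding any background threads.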
[GitHub] [lucene] jpountz commented on pull request #91: LUCENE-9932: Performance improvement for BKD index building
jpountz commented on pull request #91: URL: https://github.com/apache/lucene/pull/91#issuecomment-823209788 > I will give it a try; the first thing that comes to mind is to bring up a new prepare method. One idea I had in mind was to create a new class, something like `StableMSBRadixSorter`, that would extend `MSBRadixSorter` to:
- add the two `assign` and `finalizeAssign` methods that you currently added to `Sorter`,
- override the way data gets rearranged to guarantee stability,
- change the fallback sorter,
- modify `radixSort(int,int,int,int)` to check whether data is already sorted before computing the common prefix length and the histogram of the leading bytes.
> One more question: are there any places where doc IDs are not added in increasing order? I don't remember how we deal with it, but we should check how this optimization plays with index sorting, since we would renumber doc IDs at flush time.
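For context on the stability requirement discussed above: a stable radix pass must preserve the relative order of entries that share the same leading byte, and the classic way to get that is a counting sort that scatters into a scratch array using prefix sums. Below is a minimal, self-contained sketch of one such stable pass over the most significant byte; it is an illustration of the technique only, not Lucene's actual `MSBRadixSorter` code.

```java
// One stable radix pass: partition non-negative ints by their most
// significant byte using counting sort + prefix sums. Equal keys keep
// their original relative order, which is exactly the guarantee a
// "stable" MSB radix sorter needs at every recursion level.
public class StableRadixPass {

  static int[] stableSortByHighByte(int[] values) {
    int[] histogram = new int[257];
    // 1. Histogram of the leading byte, shifted by one slot so that the
    //    prefix sums below directly yield bucket start offsets.
    for (int v : values) {
      histogram[(v >>> 24) + 1]++;
    }
    // 2. Prefix sums turn counts into start offsets per bucket.
    for (int i = 1; i < histogram.length; i++) {
      histogram[i] += histogram[i - 1];
    }
    // 3. Scatter in original input order: this is what makes the pass
    //    stable, unlike the in-place swapping a plain MSB sorter does.
    int[] out = new int[values.length];
    for (int v : values) {
      out[histogram[v >>> 24]++] = v;
    }
    return out;
  }
}
```

Note the trade-off hinted at in the comment: stability via scatter requires a scratch array (`out`), whereas the in-place rearrangement in a non-stable radix sorter does not.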
[GitHub] [lucene] mocobeta commented on a change in pull request #90: LUCENE-9353: revise format documentation of Lucene90BlockTreeTermsWriter
mocobeta commented on a change in pull request #90: URL: https://github.com/apache/lucene/pull/90#discussion_r616743197 ## File path: lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/Lucene90BlockTreeTermsWriter.java ## @@ -140,24 +135,48 @@ * * Header is a {@link CodecUtil#writeHeader CodecHeader} storing the version information for * the BlockTree implementation. - * DirOffset is a pointer to the FieldSummary section. * DocFreq is the count of documents which contain the term. * TotalTermFreq is the total number of occurrences of the term. This is encoded as the * difference between the total number of occurrences and the DocFreq. + * PostingsHeader and TermMetadata are plugged into by the specific postings implementation: + * these contain arbitrary per-file data (such as parameters or versioning information) and + * per-term data (such as pointers to inverted files). + * For inner nodes of the tree, every entry will steal one bit to mark whether it points to + * child nodes(sub-block). If so, the corresponding TermStats and TermMetaData are omitted + * + * + * + * + * Term Metadata + * + * The .tmd file contains the list of term metadata (such as FST index metadata) and field level + * statistics (such as sum of total term freq). + * + * + * TermsMeta (.tmd) --> Header, NumFields,NumFields, + * TermIndexLength, TermDictLength, Footer + * FieldStats --> FieldNumber, NumTerms, RootCodeLength, ByteRootCodeLength, + * SumTotalTermFreq?, SumDocFreq, DocCount, MinTerm, MaxTerm, IndexStartFP, FSTHeader, Review comment: I'm actually not the author of the line (I just moved it from the above section to here), but the specification seems to be correct to me. https://github.com/apache/lucene/blob/5592d582b856c99df4839172b40733c18c6094e9/lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/Lucene90BlockTreeTermsWriter.java#L1108-L
[GitHub] [lucene] mocobeta merged pull request #90: LUCENE-9353: revise format documentation of Lucene90BlockTreeTermsWriter
mocobeta merged pull request #90: URL: https://github.com/apache/lucene/pull/90
[jira] [Commented] (LUCENE-9353) Move metadata of the terms dictionary to its own file
[ https://issues.apache.org/jira/browse/LUCENE-9353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17325840#comment-17325840 ] ASF subversion and git services commented on LUCENE-9353: - Commit 5f5d1949e9296eb9c8a57c4f2f1b325ffadabaf8 in lucene's branch refs/heads/main from Tomoko Uchida [ https://gitbox.apache.org/repos/asf?p=lucene.git;h=5f5d194 ] LUCENE-9353: revise format documentation of Lucene90BlockTreeTermsWriter (#90) > Move metadata of the terms dictionary to its own file > - > > Key: LUCENE-9353 > URL: https://issues.apache.org/jira/browse/LUCENE-9353 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Adrien Grand >Priority: Minor > Fix For: 8.6 > > Time Spent: 2.5h > Remaining Estimate: 0h > > Currently opening a terms index requires jumping to the end of the terms > index and terms dictionaries to decode some metadata such as sumTtf or file > pointers where information for a given field is located. It'd be nicer to > have it in a separate file, which would also have the benefit of letting us > verify checksums for this part of the content. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] jpountz commented on a change in pull request #90: LUCENE-9353: revise format documentation of Lucene90BlockTreeTermsWriter
jpountz commented on a change in pull request #90: URL: https://github.com/apache/lucene/pull/90#discussion_r616828034 ## File path: lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/Lucene90BlockTreeTermsWriter.java ## (same diff context as the comment above) Review comment: Woops I had misread!
[jira] [Commented] (LUCENE-9334) Require consistency between data-structures on a per-field basis
[ https://issues.apache.org/jira/browse/LUCENE-9334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17326127#comment-17326127 ] Mike Drob commented on LUCENE-9334: --- I think this is causing SOLR-15360, but I can't say for certain. If there's any chance that somebody can come over and help us understand a bit more, that would be much appreciated. > Require consistency between data-structures on a per-field basis > > > Key: LUCENE-9334 > URL: https://issues.apache.org/jira/browse/LUCENE-9334 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Adrien Grand >Priority: Blocker > Fix For: main (9.0) > > Time Spent: 14.5h > Remaining Estimate: 0h > > Follow-up of > https://lists.apache.org/thread.html/r747de568afd7502008c45783b74cc3aeb31dab8aa60fcafaf65d5431%40%3Cdev.lucene.apache.org%3E. > We would like to start requiring consistency across data-structures on a > per-field basis in order to make it easier to do the right thing by default: > range queries can run faster if doc values are enabled, sorted queries can > run faster if points are indexed, etc. > This would be a big change, so it should be rolled out in a major. > Strict validation is tricky to implement, but we should still implement > best-effort validation: > - Documents all use the same data-structures, e.g. it is illegal for a > document to only enable points and another document to only enable doc values, > - When possible, check whether values are consistent too.
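The best-effort validation the issue describes can be sketched as follows. The enum, class, and method names here are invented for illustration and are not Lucene's actual implementation: the essential rule is that the first document seen for a field fixes which data structures that field enables, and any later document that disagrees is rejected.

```java
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of per-field consistency checking across documents.
public class FieldConsistencyCheck {

  enum Structure { POSTINGS, POINTS, DOC_VALUES, NORMS }

  // Field name -> the set of data structures first observed for that field.
  private final Map<String, EnumSet<Structure>> schema = new HashMap<>();

  /**
   * Records the field's data structures on first sight; on later sightings,
   * returns true only if the document uses exactly the same set.
   */
  boolean checkOrRecord(String field, EnumSet<Structure> used) {
    EnumSet<Structure> existing = schema.putIfAbsent(field, used);
    return existing == null || existing.equals(used);
  }
}
```

Under this rule, a document that enables only points for a field and a later document that enables only doc values for the same field would trip the check, which matches the "illegal" example in the issue description.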
[jira] [Commented] (LUCENE-9334) Require consistency between data-structures on a per-field basis
[ https://issues.apache.org/jira/browse/LUCENE-9334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17326128#comment-17326128 ] David Smiley commented on LUCENE-9334: -- Mike, the issue you just filed is effectively a duplicate of SOLR-15356, which I spent time debugging. Already solved :-). I sent a message to the dev list about this the other day.
[jira] [Commented] (LUCENE-9334) Require consistency between data-structures on a per-field basis
[ https://issues.apache.org/jira/browse/LUCENE-9334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17326129#comment-17326129 ] Mike Drob commented on LUCENE-9334: --- Thanks David! I tried searching the dev list and for existing issues, but it looks like I started from the other end of the failing tests. Thanks for being proactive!
[GitHub] [lucene-site] zacharymorn opened a new pull request #56: Add Zach Chen to committer list
zacharymorn opened a new pull request #56: URL: https://github.com/apache/lucene-site/pull/56
[GitHub] [lucene-site] zacharymorn commented on pull request #56: Add Zach Chen to committer list
zacharymorn commented on pull request #56: URL: https://github.com/apache/lucene-site/pull/56#issuecomment-823748361 Thanks Michael! I think I may still not have write access though. (screenshot: https://user-images.githubusercontent.com/2986273/115491953-bc2cd600-a215-11eb-8ff2-7e394946cd8f.png)
[GitHub] [lucene] Jawnnypoo opened a new pull request #100: Update gradle to 6.8.3
Jawnnypoo opened a new pull request #100: URL: https://github.com/apache/lucene/pull/100 7.0 was quite a tough upgrade path, but maybe someday soon!
[GitHub] [lucene-site] HoustonPutman commented on pull request #56: Add Zach Chen to committer list
HoustonPutman commented on pull request #56: URL: https://github.com/apache/lucene-site/pull/56#issuecomment-823772980 Have you linked your ASF and Github accounts here? https://gitbox.apache.org/setup/
[GitHub] [lucene-site] zacharymorn commented on pull request #56: Add Zach Chen to committer list
zacharymorn commented on pull request #56: URL: https://github.com/apache/lucene-site/pull/56#issuecomment-823819146 > Have you linked your ASF and Github accounts here? > > https://gitbox.apache.org/setup/ Ah, thanks @HoustonPutman for the pointer! I must have missed it earlier. I just linked them up and am now able to see the merge PR button. Appreciate your help!
[GitHub] [lucene-site] zacharymorn merged pull request #56: Add Zach Chen to committer list
zacharymorn merged pull request #56: URL: https://github.com/apache/lucene-site/pull/56