date:20220629

[GitHub] [lucene] jpountz commented on a diff in pull request #972: LUCENE-10480: Use BMM scorer for 2 clauses disjunction

2022-06-29 Thread GitBox



jpountz commented on code in PR #972:
URL: https://github.com/apache/lucene/pull/972#discussion_r909281863


##
lucene/core/src/java/org/apache/lucene/search/BlockMaxMaxscoreScorer.java:
##
@@ -0,0 +1,322 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.search;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.Comparator;
+import java.util.LinkedList;
+import java.util.List;
+
+/** Scorer implementing Block-Max Maxscore algorithm */
+public class BlockMaxMaxscoreScorer extends Scorer {
+  // current doc ID of the leads
+  private int doc;
+
+  // doc id boundary that all scorers maxScore are valid
+  private int upTo = -1;

Review Comment:
   Nit: it's inconsistent that `upTo` gets initialized here while `doc` is initialized in the constructor.
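
One way to address this nit is to assign both fields in the same place. The sketch below is only an illustration: `ScorerState` and its accessors are hypothetical stand-ins, not code from the PR, and the real scorer has many more fields.

```java
/** Hypothetical stand-in for the scorer's two position fields; not the PR's code. */
final class ScorerState {
  // current doc ID of the leads
  private int doc;

  // doc ID boundary up to which the scorers' max scores are valid
  private int upTo;

  ScorerState() {
    // Both sentinels are assigned in the constructor, so a reader finds
    // every initial value in one place.
    this.doc = -1;
    this.upTo = -1;
  }

  int doc() {
    return doc;
  }

  int upTo() {
    return upTo;
  }
}
```

Initializing both at the declaration would be equally consistent; the point of the comment is only that the two fields should follow the same convention.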



##
lucene/core/src/java/org/apache/lucene/search/BlockMaxMaxscoreScorer.java:
##
@@ -0,0 +1,322 @@
+package org.apache.lucene.search;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.Comparator;
+import java.util.LinkedList;
+import java.util.List;
+
+/** Scorer implementing Block-Max Maxscore algorithm */
+public class BlockMaxMaxscoreScorer extends Scorer {
+  // current doc ID of the leads
+  private int doc;
+
+  // doc id boundary that all scorers maxScore are valid
+  private int upTo = -1;
+
+  // heap of scorers ordered by doc ID
+  private final DisiPriorityQueue essentialsScorers;
+  // list of scorers ordered by maxScore
+  private final LinkedList maxScoreSortedEssentialScorers;
+
+  private final DisiWrapper[] allScorers;
+
+  // sum of max scores of scorers in nonEssentialScorers list
+  private float nonEssentialMaxScoreSum;
+
+  private long cost;
+
+  private final MaxScoreSumPropagator maxScoreSumPropagator;
+
+  // scaled min competitive score
+  private float minCompetitiveScore = 0;
+
+  private int cachedScoredDoc = -1;
+  private float cachedScore = 0;
+
+  /**
+   * Constructs a Scorer that scores doc based on Block-Max-Maxscore (BMM) algorithm
+   * http://engineering.nyu.edu/~suel/papers/bmm.pdf . This algorithm has lower overhead compared to
+   * WANDScorer, and could be used for simple disjunction queries.
+   *
+   * @param weight The weight to be used.
+   * @param scorers The sub scorers this Scorer should iterate on for optional clauses
+   */
+  public BlockMaxMaxscoreScorer(Weight weight, List scorers) throws IOException {
+    super(weight);
+
+    this.doc = -1;
+    this.allScorers = new DisiWrapper[scorers.size()];
+    this.essentialsScorers = new DisiPriorityQueue(scorers.size());
+    this.maxScoreSortedEssentialScorers = new LinkedList<>();
+
+    long cost = 0;
+    for (int i = 0; i < scorers.size(); i++) {
+      DisiWrapper w = new DisiWrapper(scorers.get(i));
+      cost += w.cost;
+      allScorers[i] = w;
+    }
+
+    this.cost = cost;
+    maxScoreSumPropagator = new MaxScoreSumPropagator(scorers);
+  }
+
+  @Override
+  public DocIdSetIterator iterator() {
+    // twoPhaseIterator needed to honor scorer.setMinCompetitiveScore guarantee
+    return TwoPhaseIterator.asDocIdSetIterator(twoPhaseIterator());
+  }
+
+  @Override
+  public TwoPhaseIterator twoPhaseIterator() {
+    DocIdSetIterator approximation =
+        new D

[jira] [Created] (LUCENE-10630) error: 'gmtime' was not declared in this scope; did you mean 'getTime'?

2022-06-29 Thread Jira

Martin Liška created an issue

Lucene - Core / LUCENE-10630
error: 'gmtime' was not declared in this scope; did you mean 'getTime'?

Issue Type: Bug
Assignee: Unassigned
Created: 29/Jun/22 08:57
Priority: Major
Reporter: Martin Liška

Happens with GCC 13 or with the current GCC-12 branch:

cd /home/abuild/rpmbuild/BUILD/clucene-core-2.3.3.4/build/src/core && /usr/bin/c++ -DMAKE_CLUCENE_CORE_LIB -Dclucene_core_EXPORTS -I/home/abuild/rpmbuild/BUILD/clucene-core-2.3.3.4/src/shared -I/home/abuild/rpmbuild/BUILD/clucene-core-2.3.3.4/build/src/shared -I/home/abuild/rpmbuild/BUILD/clucene-core-2.3.3.4/src/core -O2 -Wall -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=3 -fstack-protector-strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-protection -Werror=return-type -flto=auto -g -fPIC -ansi -O2 -g -DNDEBUG -fPIC    -D_REENTRANT -D_UCS2 -D_UNICODE -MD -MT src/core/CMakeFiles/clucene-core.dir/CLucene/queryParser/MultiFieldQueryParser.o -MF CMakeFiles/clucene-core.dir/CLucene/queryParser/MultiFieldQueryParser.o.d -o CMakeFiles/clucene-core.dir/CLucene/queryParser/MultiFieldQueryParser.o -c /home/abuild/rpmbuild/BUILD/clucene-core-2.3.3.4/src/core/CLucene/queryParser/MultiFieldQueryParser.cpp
...
/home/abuild/rpmbuild/BUILD/clucene-core-2.3.3.4/src/core/CLucene/document/DateTools.cpp: In static member function 'static void lucene::document::DateTools::timeToString(int64_t, Resolution, TCHAR*, size_t)':
/home/abuild/rpmbuild/BUILD/clucene-core-2.3.3.4/src/core/CLucene/document/DateTools.cpp:26:19: error: 'gmtime' was not declared in this scope; did you mean 'getTime'?
   26 |         tm *ptm = gmtime(&secs);
      |                   ^~
      |                   getTime
 

The cause is a missing include of the system header `time.h`; please include it.

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-29 Thread Tomoko Uchida (Jira)

Tomoko Uchida commented on LUCENE-10557

Re: Migrate to GitHub issue from Jira

I asked INFRA to create a new repo for archiving attachments (INFRA-23426) and was guided to the self-service toolset. It seems a new repo can be created here. I can't fill in the mandatory field "Project" (I see "lucene" in the list but can't select it). https://gitbox.apache.org/boxer/?action=

I'm not sure whether this is due to my account role (committer). Could anyone take a look at the tool and, if possible, create a new repository named lucene-jira-archive?

 This message was sent by Atlassian Jira (v8.20.10#820010-sha1:ace47f9)

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-29 Thread Dawid Weiss (Jira)

Dawid Weiss commented on LUCENE-10557

Re: Migrate to GitHub issue from Jira

Done.

Your repository has been created and will be available for use within a few minutes.
Your project is available on gitbox at: https://gitbox.apache.org/repos/asf/lucene-jira-archive.git
Your project is available on GitHub at: https://github.com/apache/lucene-jira-archive.git
User permissions should be set up within the next five minutes. If not, please let us know at: us...@infra.apache.org

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-29 Thread Uwe Schindler (Jira)

Uwe Schindler commented on LUCENE-10557

Re: Migrate to GitHub issue from Jira

Hi Tomoko, I am able to create repos: Will now create the issue.

[jira] [Updated] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-29 Thread Uwe Schindler (Jira)

Uwe Schindler updated an issue

Lucene - Core / LUCENE-10557
Migrate to GitHub issue from Jira

Change By: Uwe Schindler
Attachment: screenshot-1.png

[jira] [Comment Edited] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-29 Thread Uwe Schindler (Jira)

Uwe Schindler edited a comment on LUCENE-10557

Re: Migrate to GitHub issue from Jira

Hi Tomoko, I am able to create repos: !screenshot-1.png|width=720! Will now create the repo.

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-29 Thread Uwe Schindler (Jira)

Uwe Schindler commented on LUCENE-10557

Re: Migrate to GitHub issue from Jira

LOL. I got message that it already exists.

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-29 Thread Dawid Weiss (Jira)

Dawid Weiss commented on LUCENE-10557

Re: Migrate to GitHub issue from Jira

[jira] [Commented] (LUCENE-10593) VectorSimilarityFunction reverse removal

2022-06-29 Thread ASF subversion and git services (Jira)

ASF subversion and git services commented on LUCENE-10593

Re: VectorSimilarityFunction reverse removal

Commit b3b7098cd9636c5ad2516055f768dd29b795a05d in lucene's branch refs/heads/branch_9x from Alessandro Benedetti
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=b3b7098cd96 ]

LUCENE-10593: VectorSimilarityFunction reverse removal (#926)

* Vector Similarity Function reverse property removed
* NeighborQueue tie-breaking fixed (node id + node score encoding)
* NeighborQueue readability refactor
* BoundChecker removal (now it's only in backward-codecs)

[jira] [Updated] (LUCENE-10593) VectorSimilarityFunction reverse removal

2022-06-29 Thread Alessandro Benedetti (Jira)

Alessandro Benedetti updated an issue

Lucene - Core / LUCENE-10593
VectorSimilarityFunction reverse removal

Change By: Alessandro Benedetti
Fix Version/s: 9.3

[jira] [Assigned] (LUCENE-10593) VectorSimilarityFunction reverse removal

2022-06-29 Thread Alessandro Benedetti (Jira)

Alessandro Benedetti assigned an issue to Alessandro Benedetti

Lucene - Core / LUCENE-10593
VectorSimilarityFunction reverse removal

Change By: Alessandro Benedetti
Assignee: Alessandro Benedetti

[jira] [Resolved] (LUCENE-10593) VectorSimilarityFunction reverse removal

2022-06-29 Thread Alessandro Benedetti (Jira)

Alessandro Benedetti resolved as Fixed

Lucene - Core / LUCENE-10593
VectorSimilarityFunction reverse removal

Change By: Alessandro Benedetti
Resolution: Fixed
Status: Open -> Resolved

[GitHub] [lucene] alessandrobenedetti commented on pull request #926: VectorSimilarityFunction reverse removal

2022-06-29 Thread GitBox



alessandrobenedetti commented on PR #926:
URL: https://github.com/apache/lucene/pull/926#issuecomment-1169764171

   Done, everything is merged and backported to 9.x, thanks for your support!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-29 Thread Tomoko Uchida (Jira)

Tomoko Uchida commented on LUCENE-10557

Re: Migrate to GitHub issue from Jira

Uwe Schindler, Dawid Weiss: thank you both! I was able to push the first commit.
https://github.com/apache/lucene-jira-archive

Looks like watchers are inherited from apache/lucene ...

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-29 Thread Tomoko Uchida (Jira)

Tomoko Uchida commented on LUCENE-10557

Re: Migrate to GitHub issue from Jira

I set https://github.com/apache/lucene-jira-archive/blob/main/.asf.yaml not to send notifications to mail groups. Looks like all updates in the repository are still noticed in d...@lucene.apache.org (initial setting when creating the repo?). Could anybody mute this?

[jira] [Updated] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-29 Thread Dawid Weiss (Jira)

Dawid Weiss updated an issue

Lucene - Core / LUCENE-10557
Migrate to GitHub issue from Jira

Change By: Dawid Weiss
Attachment: image-2022-06-29-13-36-57-365.png

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-29 Thread Dawid Weiss (Jira)

Dawid Weiss commented on LUCENE-10557

Re: Migrate to GitHub issue from Jira

https://gitbox.apache.org/schemes.cgi?lucene-jira-archive

Something seems wrong. According to https://cwiki.apache.org/confluence/display/INFRA/Git+-+.asf.yaml+features, the update should be approved via an e-mail sent to the private mailing list. I don't see any such e-mail yet.

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-29 Thread Tomoko Uchida (Jira)

Tomoko Uchida commented on LUCENE-10557

Re: Migrate to GitHub issue from Jira

Thanks for the information. Could something in the automation system be delayed? I'll check the status page again later.

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-29 Thread Uwe Schindler (Jira)

Uwe Schindler commented on LUCENE-10557

Re: Migrate to GitHub issue from Jira

> Looks like all updates in the repository are still noticed in d...@lucene.apache.org (initial setting when creating the repo?). Could anybody mute this?

d...@lucene.apache.org and comm...@lucene.apache.org were selected as defaults when creating the repo (see my screenshot above). Actually the PR/issue list should have been issues@lucene.apache.org, but for this case it should be completely silent. There seems to be some delay; maybe ask on Slack's infra channel.

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-29 Thread Tomoko Uchida (Jira)

Tomoko Uchida commented on LUCENE-10557

Re: Migrate to GitHub issue from Jira

Michael McCandless Just for your information, we now have a public ASF repository https://github.com/apache/lucene-jira-archive for the migration and I pushed the migration scripts there to develop/archive it under Apache. I also opened a few issues for it.

> Tomoko Uchida could you share the source code of the import tool you are working on? Maybe post it in a personal public GitHub repo? We all can try to make PRs / review

[GitHub] [lucene] gsmiller opened a new pull request, #995: LUCENE-10603: Migrate remaining SSDV iteration to use docValueCount in production code

2022-06-29 Thread GitBox



gsmiller opened a new pull request, #995:
URL: https://github.com/apache/lucene/pull/995

   This migrates the remaining production code iteration to use `SSDV#docValueCount` for iteration, getting us closer to removing support for `NO_MORE_ORDS` in `SSDV#nextOrd`. Test-related code still needs to get updated.
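
The difference between the two iteration styles can be sketched as follows. This is a self-contained illustration, not code from the PR: `OrdSource`, `OrdIteration`, and `fromArray` are hypothetical stand-ins that model only the per-document `docValueCount()` / `nextOrd()` pair mentioned above, so that the example runs without Lucene on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the per-document part of a sorted-set doc values iterator. */
interface OrdSource {
  long NO_MORE_ORDS = -1;

  /** Number of ordinals for the current document. */
  int docValueCount();

  /** Next ordinal of the current document. */
  long nextOrd();
}

final class OrdIteration {
  /** Older style: keep calling nextOrd() until the sentinel is returned. */
  static List<Long> collectWithSentinel(OrdSource values) {
    List<Long> ords = new ArrayList<>();
    for (long ord = values.nextOrd(); ord != OrdSource.NO_MORE_ORDS; ord = values.nextOrd()) {
      ords.add(ord);
    }
    return ords;
  }

  /** Count-based style: call nextOrd() exactly docValueCount() times. */
  static List<Long> collectWithCount(OrdSource values) {
    List<Long> ords = new ArrayList<>();
    int count = values.docValueCount();
    for (int i = 0; i < count; i++) {
      ords.add(values.nextOrd());
    }
    return ords;
  }

  /** Array-backed source used only to exercise the two loops. */
  static OrdSource fromArray(long... ords) {
    return new OrdSource() {
      private int next = 0;

      @Override
      public int docValueCount() {
        return ords.length;
      }

      @Override
      public long nextOrd() {
        return next < ords.length ? ords[next++] : NO_MORE_ORDS;
      }
    };
  }
}
```

The count-based loop never relies on the sentinel value, which is what would allow sentinel support to be dropped once all callers have migrated.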



[GitHub] [lucene] gsmiller commented on pull request #995: LUCENE-10603: Migrate remaining SSDV iteration to use docValueCount in production code

2022-06-29 Thread GitBox



gsmiller commented on PR #995:
URL: https://github.com/apache/lucene/pull/995#issuecomment-1169931258

   Hmm, something's busted with my changes to `CheckIndex`. Will dig in shortly.



[jira] [Created] (LUCENE-10631) Consolidate java version numbers in one place and reuse them across build parts

2022-06-29 Thread Dawid Weiss (Jira)

Dawid Weiss created an issue

Lucene - Core / LUCENE-10631
Consolidate java version numbers in one place and reuse them across build parts

Issue Type: Sub-task
Assignee: Unassigned
Created: 29/Jun/22 12:43
Priority: Minor
Reporter: Dawid Weiss

[R. Muir/ mailing list discussions] Ideally we could consolidate a lot of them in a simple .properties file that contains the min/max major version numbers. could be then sucked in by:

* gradle logic
* java logic such as checks done in WrapperDownloader
* bash logic such as error messaging in ./gradlew.sh
* python smoketester logic?
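
On the Java side, reading such a shared file could look roughly like the sketch below. Everything named here is an assumption for illustration: the class `JavaVersionBounds`, the property keys `java.version.min` and `java.version.max`, and the example values are hypothetical and do not come from the issue or from the Lucene build.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical holder for min/max Java major versions read from a .properties file. */
final class JavaVersionBounds {
  final int min;
  final int max;

  private JavaVersionBounds(int min, int max) {
    this.min = min;
    this.max = max;
  }

  /** Parses .properties text; the two key names are assumptions made for this sketch. */
  static JavaVersionBounds parse(String propertiesText) {
    Properties props = new Properties();
    try {
      props.load(new StringReader(propertiesText));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    int min = Integer.parseInt(required(props, "java.version.min"));
    int max = Integer.parseInt(required(props, "java.version.max"));
    if (min > max) {
      throw new IllegalArgumentException("min " + min + " is greater than max " + max);
    }
    return new JavaVersionBounds(min, max);
  }

  private static String required(Properties props, String key) {
    String value = props.getProperty(key);
    if (value == null) {
      throw new IllegalArgumentException("missing property: " + key);
    }
    return value.trim();
  }

  /** True if the given major version lies within the inclusive [min, max] range. */
  boolean supports(int major) {
    return major >= min && major <= max;
  }
}
```

A caller on Java 10 or later could then check the running JVM with `bounds.supports(Runtime.version().feature())`, while the Gradle, bash, and Python consumers would read the same two keys from the same file.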

[jira] [Commented] (LUCENE-10592) Should we build HNSW graph on the fly during indexing

2022-06-29 Thread Mayya Sharipova (Jira)

Mayya Sharipova commented on LUCENE-10592

Re: Should we build HNSW graph on the fly during indexing

PR: https://github.com/apache/lucene/pull/992

[jira] [Commented] (LUCENE-10630) error: 'gmtime' was not declared in this scope; did you mean 'getTime'?

2022-06-29 Thread Alan Woodward (Jira)

Alan Woodward commented on LUCENE-10630

Re: error: 'gmtime' was not declared in this scope; did you mean 'getTime'?

This is the issue tracker for the Apache Lucene project, which is written in Java. I think you want http://clucene.sourceforge.net/, which is the website for the C++ port.

[jira] [Commented] (LUCENE-10630) error: 'gmtime' was not declared in this scope; did you mean 'getTime'?

2022-06-29 Thread Jira

Martin Liška commented on LUCENE-10630

Re: error: 'gmtime' was not declared in this scope; did you mean 'getTime'?

Oh, you are right. The C++ port bug lives here: https://sourceforge.net/p/clucene/bugs/235/

[jira] [Resolved] (LUCENE-10630) error: 'gmtime' was not declared in this scope; did you mean 'getTime'?

2022-06-29 Thread Alan Woodward (Jira)

Alan Woodward resolved as Invalid

Lucene - Core / LUCENE-10630
error: 'gmtime' was not declared in this scope; did you mean 'getTime'?

Change By: Alan Woodward
Resolution: Invalid
Status: Open -> Resolved

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-29 Thread Tomoko Uchida (Jira)

Tomoko Uchida commented on LUCENE-10557

Re: Migrate to GitHub issue from Jira

According to infra, you cannot set your personal email address in the repo's notification settings. I changed the address to issues@.
https://github.com/apache/lucene-jira-archive/blob/main/.asf.yaml

Would you please check the private@ list to see whether the notification mail for review was sent there this time (or will be sent shortly)?

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-29 Thread Michael McCandless (Jira)

Michael McCandless commented on LUCENE-10557

Re: Migrate to GitHub issue from Jira

> I think I have addressed attachments.

Woot! I love seeing the attached patch file rendered inline via GitHub like that (versus downloading to my local disk in Jira)! This is awesome progress – thanks [~tomoko]!

> Michael McCandless Just for your information, we now have a public ASF repository https://github.com/apache/lucene-jira-archive for the migration and I pushed the migration scripts there to develop/archive it under Apache. I also opened a few issues for it.

YAY!

[GitHub] [lucene] msokolov merged pull request #927: LUCENE-10151: Adding Timeout Support to IndexSearcher

2022-06-29 Thread GitBox



msokolov merged PR #927:
URL: https://github.com/apache/lucene/pull/927



[GitHub] [lucene] msokolov commented on pull request #927: LUCENE-10151: Adding Timeout Support to IndexSearcher

2022-06-29 Thread GitBox



msokolov commented on PR #927:
URL: https://github.com/apache/lucene/pull/927#issuecomment-1170060397

   I'll follow up with a CHANGES.txt and backport to 9.x



[jira] [Commented] (LUCENE-10151) Add timeout support to IndexSearcher

2022-06-29 Thread ASF subversion and git services (Jira)

ASF subversion and git services commented on LUCENE-10151

Re: Add timeout support to IndexSearcher

Commit af05550ebfe3dc1bc40aeb2318c132a9b12e37a2 in lucene's branch refs/heads/main from Deepika0510
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=af05550ebfe ]

LUCENE-10151: Adding Timeout Support to IndexSearcher (#927)

Authored-by: Deepika Sharma

[jira] [Commented] (LUCENE-10151) Add timeout support to IndexSearcher

2022-06-29 Thread ASF subversion and git services (Jira)

ASF subversion and git services commented on LUCENE-10151

Re: Add timeout support to IndexSearcher

Commit 95de554b65bece9697396eeb4a5e78a8352f58d0 in lucene's branch refs/heads/main from Michael Sokolov
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=95de554b65b ]

CHANGES entry for LUCENE-10151

[jira] [Updated] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-29 Thread Michael McCandless (Jira)

Michael McCandless updated an issue

Lucene - Core / LUCENE-10557
Migrate to GitHub issue from Jira

Change By: Michael McCandless
Attachment: Screen Shot 2022-06-05 at 8.13.41 AM.png

[jira] [Updated] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-29 Thread Michael McCandless (Jira)

Michael McCandless updated an issue

Lucene - Core / LUCENE-10557
Migrate to GitHub issue from Jira

Change By: Michael McCandless
Attachment: Screen Shot 2022-06-05 at 8.13.41 AM.png

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-29 Thread Michael McCandless (Jira)

Michael McCandless commented on LUCENE-10557

Re: Migrate to GitHub issue from Jira

So cool! I asked for all (open and closed) issues from Tomoko Uchida's latest migration, sorting by oldest and I see all the original issues (LUCENE-1, -2, -3, etc.):

[jira] [Updated] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-29 Thread Michael McCandless (Jira)

Michael McCandless updated an issue

Lucene - Core / LUCENE-10557
Migrate to GitHub issue from Jira

Change By: Michael McCandless
Attachment: Screen Shot 2022-06-29 at 11.02.35 AM.png

[jira] [Commented] (LUCENE-10151) Add timeout support to IndexSearcher

2022-06-29 Thread Michael Sokolov (Jira)

Michael Sokolov commented on LUCENE-10151

Re: Add timeout support to IndexSearcher

Thanks, Deepika Sharma. I've merged this to main and backported it to 9.x.

One oddity I noticed was a linter failure that happened on 9.x only, not on main. I don't know whether we have relaxed some checks on main. In any case I added a patch for both branches, which is this change: https://gitbox.apache.org/repos/asf?p=lucene.git;a=commit;h=e078bc1cd9c1e647f963fbdd55cbcd4ec59fac94

[jira] [Resolved] (LUCENE-10151) Add timeout support to IndexSearcher

2022-06-29 Thread Michael Sokolov (Jira)

Michael Sokolov resolved as Fixed

Lucene - Core / LUCENE-10151
Add timeout support to IndexSearcher

Change By: Michael Sokolov
Fix Version/s: 9.3
Resolution: Fixed
Status: Open -> Resolved

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-29 Thread Uwe Schindler (Jira)

Uwe Schindler commented on LUCENE-10557

Re: Migrate to GitHub issue from Jira

Tomoko Uchida, this came to private@lao:

> Subject: Notification schemes for lucene-jira-archive.git updated
> Date: Wed, 29 Jun 2022 14:28:15 -
> From: GitBox
> Reply-To: priv...@lucene.apache.org
> To: priv...@lucene.apache.org
>
> The following notification schemes have been changed on lucene-jira-archive by tomoko:
>
> adding new scheme (commits): 'comm...@lucene.apache.org'
> adding new scheme (issues): 'issues@lucene.apache.org'
> adding new scheme (pullrequests): 'issues@lucene.apache.org'
> adding new scheme (jira_options): 'link label worklog'
>
> With regards, ASF Infra.

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-29 Thread Uwe Schindler (Jira)

Uwe Schindler commented on LUCENE-10557

Re: Migrate to GitHub issue from Jira

When we do the migration, should we use some "generic" user / bot account? Otherwise we have "mocobeta" linked on all issues.

Maybe INFRA has an account for doing this. They have tokens and some bot user on GitHub that could be used for the migration. We should ask them whether they can give us a token (maybe they can create a token just for Lucene). I'd really recommend talking with them in Slack; using the usual interfaces is a bit slow for discussing such ad hoc solutions.

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-29 Thread Uwe Schindler (Jira)

Uwe Schindler commented on LUCENE-10557

Re: Migrate to GitHub issue from Jira

Maybe a solution to silence all mails during migration would be to use a fake address under @lucene.apache.org, like nore...@lucene.apache.org. The check in the automation at infra is possibly limited to the mailing list domain, and mocob...@apache.org has the wrong mail domain.

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-29 Thread Tomoko Uchida (Jira)

Tomoko Uchida commented on LUCENE-10557

Re: Migrate to GitHub issue from Jira

> Tomoko Uchida, this came to private@lao:

Thanks Uwe Schindler! Then we'll be able to use the repo to improve migration scripts.

> nore...@lucene.apache.org.

Sounds good to me - I'll update the yaml once again.

> When we do the migration, we should use some "generic" user / bot account? Otherwise we have "mocobeta" linked on all issues

We can't run the migration job ourselves (and I don't want to use my account for it). The actual migration will be done by an INFRA account. See the Lucene.NET project: https://github.com/apache/lucenenet/issues/280 - it seems it is still a personal account.

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-29 Thread Uwe Schindler (Jira)

Uwe Schindler commented on LUCENE-10557

Re: Migrate to GitHub issue from Jira

> We can't run the migration job on ourselves (and I don't want to use my account for it). Actual migration will be done by an INFRA's account. See Lucene.NET project: https://github.com/apache/lucenenet/issues/280 - seems it is still a personal account.

Chris Lambertus (fluxo): that is his private account. I don't like using that account either. I would prefer some generic "bot" account.

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-29 Thread Uwe Schindler (Jira)

Uwe Schindler commented on LUCENE-10557

Re: Migrate to GitHub issue from Jira

noreply@lao did not work. This time it gave an error message!

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-29 Thread Uwe Schindler (Jira)

Uwe Schindler commented on LUCENE-10557

Re: Migrate to GitHub issue from Jira

Maybe we can ask them to manually disable notifications during the import.

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-29 Thread Tomoko Uchida (Jira)

Tomoko Uchida commented on LUCENE-10557

Re: Migrate to GitHub issue from Jira

I changed the notification mails to noreply@ so that we can silence them. Could you check the notification in private@ again please, Uwe Schindler? Thanks.
https://github.com/apache/lucene-jira-archive/commit/bbcc1b3a77be635b82942971150f37a076ab26b5

> Chris Lambertus (fluxo) is his private account. I don't like to use that account, too. I would prefer some generic "bot" account

I think it's against GitHub's terms of service to have multiple free accounts. I'm not sure it is possible, but if we have a paid organization account that is not tied to a person, could we ask infra whether we can use it for the migration?

> noreply@lao did not work. This time it gave an error message!

Hmm, then I'll revert the change.

> Maybe we can ask them to manually disable notifications during the import.

We verified that no notifications are sent when executing migration scripts (importing issues and updating issues/comments), thanks to Houston and Dawid.

[jira] [Comment Edited] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-29 Thread Tomoko Uchida (Jira)

Tomoko Uchida edited a comment on LUCENE-10557

Re: Migrate to GitHub issue from Jira

I changed the notification mails to noreply@ so that we can silence them. Could you check the notification in private@ again please, Uwe Schindler? Thanks.
https://github.com/apache/lucene-jira-archive/commit/bbcc1b3a77be635b82942971150f37a076ab26b5

> Chris Lambertus (fluxo) is his private account. I don't like to use that account, too. I would prefer some generic "bot" account

I think it's against GitHub's terms of service to have multiple free accounts. I'm not sure it is possible, but if we have a paid organization account that is not tied to a person, could we ask infra whether we can use it for the migration?

> noreply@lao did not work. This time it gave an error message!

Hmm, then I'll revert the change.

> Maybe we can ask them to manually disable notifications during the import.

In the recent migration test where all issues were migrated, we verified that no notifications are sent when executing migration scripts (importing issues and updating issues/comments), thanks to Houston and Dawid.

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-29 Thread Tomoko Uchida (Jira)

Tomoko Uchida commented on LUCENE-10557

Re: Migrate to GitHub issue from Jira

We can't use an arbitrary GitHub account for the migration, because importing/creating issues with the GitHub API requires not only an access token but also admin access to the repo, which developers are not allowed to have.

[GitHub] [lucene] jpountz opened a new pull request, #996: LUCENE-10151: Some fixes to query timeouts.

2022-06-29 Thread GitBox



jpountz opened a new pull request, #996:
URL: https://github.com/apache/lucene/pull/996

   I noticed some minor bugs in the original PR #927 that this PR should fix:
- When a timeout is set, we would no longer catch
  `CollectionTerminatedException`.
- I added randomization to `LuceneTestCase` to randomly set a timeout, it
  would have caught the above bug.
- Fixed visibility of `TimeLimitingBulkScorer`.
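
The first bullet can be pictured with a minimal, self-contained sketch. It does not use Lucene's real classes: `CollectionTerminated`, `LeafWork`, `withTimeout` and `searchLeaves` are hypothetical stand-ins introduced only for illustration, and this is not the code from either PR. The only point is that the early-termination exception should be caught in the same place whether or not the per-leaf work has been wrapped for a timeout.

```java
import java.util.List;

final class TimeoutSketch {
  /** Hypothetical stand-in for Lucene's CollectionTerminatedException. */
  static final class CollectionTerminated extends RuntimeException {}

  /** Hypothetical stand-in for scoring one segment (leaf). */
  interface LeafWork {
    void score();
  }

  /** Wraps a leaf so that it is skipped once the deadline has passed. */
  static LeafWork withTimeout(LeafWork in, long deadlineNanos) {
    return () -> {
      if (System.nanoTime() - deadlineNanos >= 0) {
        return; // timed out: do not score this leaf
      }
      in.score();
    };
  }

  /** Scores every leaf and returns how many were processed without an exception escaping. */
  static int searchLeaves(List<LeafWork> leaves, boolean useTimeout, long deadlineNanos) {
    int processed = 0;
    for (LeafWork leaf : leaves) {
      LeafWork work = useTimeout ? withTimeout(leaf, deadlineNanos) : leaf;
      try {
        work.score();
      } catch (CollectionTerminated e) {
        // Early termination of one leaf is expected; move on to the next leaf.
      }
      processed++;
    }
    return processed;
  }

  public static void main(String[] args) {
    List<LeafWork> leaves = List.of(
        () -> { throw new CollectionTerminated(); }, // a leaf that terminates early
        () -> {});                                   // a leaf that completes normally
    long deadline = System.nanoTime() + 60_000_000_000L; // one minute from now
    System.out.println(searchLeaves(leaves, false, deadline));
    System.out.println(searchLeaves(leaves, true, deadline));
  }
}
```

Because the `try`/`catch` sits around the call to `work.score()` rather than inside only one of the two code paths, enabling the timeout cannot change how early termination is handled.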
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-29 Thread Tomoko Uchida (Jira)

Tomoko Uchida commented on LUCENE-10557

Re: Migrate to GitHub issue from Jira

I found this issue is an excellent sample for testing - this includes:

1. Cross-issue link
2. Pull Request link
3. External link (to an image)
4. Attachments (images) and references to them
5. Mention to Jira IDs
6. Bullet list
7. Code block
8. Quote

So I would add a numbered list and a fake table in this comment to make this more convenient for testing. Please ignore this comment.

Jira      | GitHub
LUCENE-1  | #251
LUCENE-2  | #252
LUCENE-3  | #253

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-29 Thread Uwe Schindler (Jira)

Uwe Schindler commented on LUCENE-10557

Re: Migrate to GitHub issue from Jira

I know from the past that INFRA had some video conferences with GitHub representatives, so the ASF is not just "some arbitrary customer". I think there were a lot of discussions going on. The Lucene.NET import happened long before they had close contact with GitHub.

I would really prefer to keep all original contributors; the change of names to some private account is a real blocker to me. If we can't modify the comment/issue creator mail address to use the person's official ASF one, or use some generic bot account, I would vote -1 on the migration now.

P.S.: Spring used a generic user for the import, "spring-projects-issues": https://github.com/spring-projects/spring-framework/issues/created_by/spring-projects-issues

[jira] [Comment Edited] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-29 Thread Uwe Schindler (Jira)

Uwe Schindler edited a comment on LUCENE-10557

Re: Migrate to GitHub issue from Jira

Spring also have a cool redirector in their webserver. It only redirects if you don't have some special param: https://jira.spring.io/browse/SPR-17649?redirect=false

And they also added a comment at the end of all their issues (also by the bot).

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-29 Thread Uwe Schindler (Jira)

Uwe Schindler commented on LUCENE-10557

Re: Migrate to GitHub issue from Jira

Spring also have a cool redirector in their webserver. It only redirects if you don't have some special param: https://jira.spring.io/browse/SPR-17649?redirect=false

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-29 Thread Tomoko Uchida (Jira)

Tomoko Uchida commented on LUCENE-10557

Re: Migrate to GitHub issue from Jira

> P.S.: Spring used a generic user for the import "spring-projects-issues"

Yes, I like it. It should be an organization account - maybe we can ask infra if we have one?

[jira] [Comment Edited] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-29 Thread Uwe Schindler (Jira)

Uwe Schindler edited a comment on LUCENE-10557

Re: Migrate to GitHub issue from Jira

Spring also have a cool redirector in their webserver. It only redirects if you don't have some special param: https://jira.spring.io/browse/SPR-17639?redirect=false

And they also added a comment at the end of all their issues (also by the bot).

[GitHub] [lucene] jpountz commented on a diff in pull request #995: LUCENE-10603: Migrate remaining SSDV iteration to use docValueCount in production code

2022-06-29 Thread GitBox



jpountz commented on code in PR #995:
URL: https://github.com/apache/lucene/pull/995#discussion_r910164457


##
lucene/core/src/java/org/apache/lucene/index/CheckIndex.java:
##
@@ -3382,6 +3383,7 @@ private static void checkSortedSetDocValues(
 seenOrds.set(ord);
 ordCount++;
   }
+

Review Comment:
   At this point `ord` is going to be the last ord while it used to always be 
NO_MORE_ORDS, which I suspect may cause CheckIndex failures.



##
lucene/core/src/java/org/apache/lucene/search/SortedSetSelector.java:
##
@@ -304,25 +306,19 @@ public int lookupTerm(BytesRef key) throws IOException {
 
     private void setOrd() throws IOException {
       if (docID() != NO_MORE_DOCS) {
-        int upto = 0;
-        while (true) {
-          long nextOrd = in.nextOrd();
-          if (nextOrd == NO_MORE_ORDS) {
-            break;
-          }
-          if (upto == ords.length) {
-            ords = ArrayUtil.grow(ords);
-          }
-          ords[upto++] = (int) nextOrd;
-        }
-
-        if (upto == 0) {
+        int docValueCount = in.docValueCount();
+        if (docValueCount == 0) {

Review Comment:
   likewise here



##
lucene/core/src/java/org/apache/lucene/search/SortedSetSelector.java:
##
@@ -304,25 +306,19 @@ public int lookupTerm(BytesRef key) throws IOException {
 
     private void setOrd() throws IOException {
       if (docID() != NO_MORE_DOCS) {
-        int upto = 0;
-        while (true) {
-          long nextOrd = in.nextOrd();
-          if (nextOrd == NO_MORE_ORDS) {
-            break;
-          }
-          if (upto == ords.length) {
-            ords = ArrayUtil.grow(ords);
-          }
-          ords[upto++] = (int) nextOrd;
-        }
-
-        if (upto == 0) {
+        int docValueCount = in.docValueCount();
+        if (docValueCount == 0) {
           // iterator should not have returned this docID if it has no ords:
           assert false;
           ord = (int) NO_MORE_ORDS;
-        } else {
-          ord = ords[(upto - 1) >>> 1];
+          return;
+        }
+
+        ords = ArrayUtil.grow(ords, docValueCount);
+        for (int i = 0; i < docValueCount; i++) {
+          ords[i] = (int) in.nextOrd();
         }
+        ord = ords[(docValueCount - 1) >>> 1];

Review Comment:
   we don't even need to buffer ords now that we know their number I think?
   We could compute the index of the ord we're interested in, then consume this 
number of ords, and the next ord would be the median ord?
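
   A minimal sketch of this suggestion, under stated assumptions: `OrdIterator` and `over` are hypothetical stand-ins (not Lucene's `SortedSetDocValues`), ords are assumed to come back in sorted order, and `docValueCount()` is assumed to be at least 1. It is an illustration of the idea, not the code that ended up in the PR.

```java
final class MiddleOrdSketch {
  /** Hypothetical stand-in for per-document ord iteration. */
  interface OrdIterator {
    int docValueCount();

    long nextOrd();
  }

  /** Lower median: the ord at index (count - 1) >>> 1, found without buffering. */
  static long middleMin(OrdIterator in) {
    int target = (in.docValueCount() - 1) >>> 1;
    for (int i = 0; i < target; i++) {
      in.nextOrd(); // skip the ords that come before the median
    }
    return in.nextOrd();
  }

  /** Upper median: the ord at index count >>> 1, found without buffering. */
  static long middleMax(OrdIterator in) {
    int target = in.docValueCount() >>> 1;
    for (int i = 0; i < target; i++) {
      in.nextOrd();
    }
    return in.nextOrd();
  }

  /** Array-backed iterator, used only for the demonstration below. */
  static OrdIterator over(long... ords) {
    return new OrdIterator() {
      private int pos = 0;

      @Override
      public int docValueCount() {
        return ords.length;
      }

      @Override
      public long nextOrd() {
        return ords[pos++];
      }
    };
  }

  public static void main(String[] args) {
    System.out.println(middleMin(over(3, 7, 9, 12))); // ord at index 1
    System.out.println(middleMax(over(3, 7, 9, 12))); // ord at index 2
  }
}
```

   The two methods differ only in the target index, which mirrors the `(docValueCount - 1) >>> 1` and `docValueCount >>> 1` expressions in the hunks under review.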



##
lucene/core/src/java/org/apache/lucene/search/SortedSetSelector.java:
##
@@ -226,12 +226,14 @@ public int lookupTerm(BytesRef key) throws IOException {
 
     private void setOrd() throws IOException {
       if (docID() != NO_MORE_DOCS) {
-        while (true) {
-          long nextOrd = in.nextOrd();
-          if (nextOrd == NO_MORE_ORDS) {
-            break;
+        int docValueCount = in.docValueCount();
+        if (docValueCount == 0) {

Review Comment:
   docValueCount may never return 0, we can drop this branch



##
lucene/core/src/java/org/apache/lucene/search/SortedSetSelector.java:
##
@@ -394,25 +390,19 @@ public int lookupTerm(BytesRef key) throws IOException {
 
     private void setOrd() throws IOException {
       if (docID() != NO_MORE_DOCS) {
-        int upto = 0;
-        while (true) {
-          long nextOrd = in.nextOrd();
-          if (nextOrd == NO_MORE_ORDS) {
-            break;
-          }
-          if (upto == ords.length) {
-            ords = ArrayUtil.grow(ords);
-          }
-          ords[upto++] = (int) nextOrd;
-        }
-
-        if (upto == 0) {
+        int docValueCount = in.docValueCount();
+        if (docValueCount == 0) {
           // iterator should not have returned this docID if it has no ords:
           assert false;
           ord = (int) NO_MORE_ORDS;
-        } else {
-          ord = ords[upto >>> 1];
+          return;
+        }
+
+        ords = ArrayUtil.grow(ords, docValueCount);
+        for (int i = 0; i < docValueCount; i++) {
+          ords[i] = (int) in.nextOrd();
         }
+        ord = ords[docValueCount >>> 1];

Review Comment:
   and likewise here?



##
lucene/core/src/java/org/apache/lucene/search/SortedSetSelector.java:
##
@@ -394,25 +390,19 @@ public int lookupTerm(BytesRef key) throws IOException {
 
     private void setOrd() throws IOException {
       if (docID() != NO_MORE_DOCS) {
-        int upto = 0;
-        while (true) {
-          long nextOrd = in.nextOrd();
-          if (nextOrd == NO_MORE_ORDS) {
-            break;
-          }
-          if (upto == ords.length) {
-            ords = ArrayUtil.grow(ords);
-          }
-          ords[upto++] = (int) nextOrd;
-        }
-
-        if (upto == 0) {
+        int docValueCount = in.docValueCount();
+        if (docValueCount == 0) {

Review Comment:
   we can drop this branch




[GitHub] [lucene-jira-archive] mocobeta opened a new issue, #4: Which GitHub account should we use for migration?

2022-06-29 Thread GitBox



mocobeta opened a new issue, #4:
URL: https://github.com/apache/lucene-jira-archive/issues/4

   To import/create issues with the GitHub API, you need admin access to the repo,
and we developers are not allowed to have it.
   The actual migration will be done by infra; it seems a personal account was used
for the import job when the Lucene.NET project migrated their issues to GitHub. See
https://github.com/apache/lucenenet/issues/280.
   
   For example, Spring uses an organization account that is not tied to a
person (https://github.com/spring-projects/spring-framework/issues/22178). Can
we do the same? What organization account is available to us?



[GitHub] [lucene] Yuti-G closed pull request #974: LUCENE-10614: Properly support getTopChildren in RangeFacetCounts

2022-06-29 Thread GitBox



Yuti-G closed pull request #974: LUCENE-10614: Properly support getTopChildren 
in RangeFacetCounts
URL: https://github.com/apache/lucene/pull/974



[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-29 Thread Tomoko Uchida (Jira)

Tomoko Uchida commented on LUCENE-10557

Re: Migrate to GitHub issue from Jira

I think we have too many topics to deal with in one issue. We could break them up into Jira sub-tasks, but I created a few GitHub issues in the lucene-jira-archive repo instead. For example: https://github.com/apache/lucene-jira-archive/issues/4 ("Which GitHub account should we use for migration?")

Notifications were sent to issues@ this time. Looks fine.

[GitHub] [lucene-jira-archive] mocobeta opened a new issue, #5: Prepare complete migration script to GitHub issue from Jira (best effort)

2022-06-29 Thread GitBox



mocobeta opened a new issue, #5:
URL: https://github.com/apache/lucene-jira-archive/issues/5

   This is the umbrella to improve migration scripts.
   Sub tasks are:
   - #1 
   - #2 
   - #3 



[jira] [Resolved] (LUCENE-10622) Prepare complete migration script to GitHub issue from Jira (best effort)

2022-06-29 Thread Tomoko Uchida (Jira)

Tomoko Uchida resolved as Duplicate

Moved to https://github.com/apache/lucene-jira-archive/issues/5

Lucene - Core / LUCENE-10622
Prepare complete migration script to GitHub issue from Jira (best effort)

Change By: Tomoko Uchida
Resolution: Duplicate
Status: Open -> Resolved

[GitHub] [lucene] gsmiller commented on a diff in pull request #995: LUCENE-10603: Migrate remaining SSDV iteration to use docValueCount in production code

2022-06-29 Thread GitBox



gsmiller commented on code in PR #995:
URL: https://github.com/apache/lucene/pull/995#discussion_r910315408


##
lucene/core/src/java/org/apache/lucene/index/CheckIndex.java:
##
@@ -3382,6 +3383,7 @@ private static void checkSortedSetDocValues(
 seenOrds.set(ord);
 ordCount++;
   }
+

Review Comment:
   Ack. Yeah that's the issue. I don't think the equality check between `ord` 
and `ord2` after this loop makes sense anymore given that there's no guarantee 
about what value `ord` will be if calling `nextOrd()` more times than 
advertised by `docValueCount()`, so I removed the check.
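
   The iteration contract described here can be sketched as follows. `OrdIterator` and `strict` are hypothetical stand-ins rather than Lucene classes; the stand-in deliberately throws when `nextOrd()` is called more often than `docValueCount()` advertises, to model "no guarantee" in the strictest way. This is an illustration of the calling pattern only, not the CheckIndex code.

```java
final class BoundedOrdLoopSketch {
  /** Hypothetical stand-in for per-document ord iteration. */
  interface OrdIterator {
    int docValueCount();

    long nextOrd();
  }

  /** Reads exactly docValueCount() ords and never probes for an end-of-values sentinel. */
  static long[] readAll(OrdIterator in) {
    int count = in.docValueCount();
    long[] out = new long[count];
    for (int i = 0; i < count; i++) {
      out[i] = in.nextOrd();
    }
    return out;
  }

  /** Array-backed iterator that rejects any call beyond the advertised count. */
  static OrdIterator strict(long... ords) {
    return new OrdIterator() {
      private int pos = 0;

      @Override
      public int docValueCount() {
        return ords.length;
      }

      @Override
      public long nextOrd() {
        if (pos >= ords.length) {
          throw new IllegalStateException("nextOrd() called more times than docValueCount()");
        }
        return ords[pos++];
      }
    };
  }

  public static void main(String[] args) {
    long[] ords = readAll(strict(2, 4, 8));
    System.out.println(ords.length);
  }
}
```

   A loop bounded by the count never depends on what an extra `nextOrd()` call would return, which is why a post-loop comparison against a sentinel value stops being meaningful.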




[GitHub] [lucene] gsmiller commented on a diff in pull request #995: LUCENE-10603: Migrate remaining SSDV iteration to use docValueCount in production code

2022-06-29 Thread GitBox



gsmiller commented on code in PR #995:
URL: https://github.com/apache/lucene/pull/995#discussion_r910344898


##
lucene/core/src/java/org/apache/lucene/search/SortedSetSelector.java:
##
@@ -226,12 +226,14 @@ public int lookupTerm(BytesRef key) throws IOException {
 
     private void setOrd() throws IOException {
       if (docID() != NO_MORE_DOCS) {
-        while (true) {
-          long nextOrd = in.nextOrd();
-          if (nextOrd == NO_MORE_ORDS) {
-            break;
+        int docValueCount = in.docValueCount();
+        if (docValueCount == 0) {

Review Comment:
   Ah right. Thanks! Addressed this in all four places.




[GitHub] [lucene] gsmiller commented on a diff in pull request #995: LUCENE-10603: Migrate remaining SSDV iteration to use docValueCount in production code

2022-06-29 Thread GitBox



gsmiller commented on code in PR #995:
URL: https://github.com/apache/lucene/pull/995#discussion_r910345129


##
lucene/core/src/java/org/apache/lucene/search/SortedSetSelector.java:
##
@@ -304,25 +306,19 @@ public int lookupTerm(BytesRef key) throws IOException {
 
     private void setOrd() throws IOException {
       if (docID() != NO_MORE_DOCS) {
-        int upto = 0;
-        while (true) {
-          long nextOrd = in.nextOrd();
-          if (nextOrd == NO_MORE_ORDS) {
-            break;
-          }
-          if (upto == ords.length) {
-            ords = ArrayUtil.grow(ords);
-          }
-          ords[upto++] = (int) nextOrd;
-        }
-
-        if (upto == 0) {
+        int docValueCount = in.docValueCount();
+        if (docValueCount == 0) {
           // iterator should not have returned this docID if it has no ords:
           assert false;
           ord = (int) NO_MORE_ORDS;
-        } else {
-          ord = ords[(upto - 1) >>> 1];
+          return;
+        }
+
+        ords = ArrayUtil.grow(ords, docValueCount);
+        for (int i = 0; i < docValueCount; i++) {
+          ords[i] = (int) in.nextOrd();
         }
+        ord = ords[(docValueCount - 1) >>> 1];

Review Comment:
   Good point. Tweaked this (and the other location).




[GitHub] [lucene] gsmiller merged pull request #983: Some refactoring/cleanup of AbstractSortedSetDocValueFacetCounts

2022-06-29 Thread GitBox



gsmiller merged PR #983:
URL: https://github.com/apache/lucene/pull/983



[GitHub] [lucene] gsmiller merged pull request #984: Switch Float/IntTaxonomyFacets to primitive list data structures in getAllChildren

2022-06-29 Thread GitBox



gsmiller merged PR #984:
URL: https://github.com/apache/lucene/pull/984



[GitHub] [lucene] gsmiller commented on pull request #984: Switch Float/IntTaxonomyFacets to primitive list data structures in getAllChildren

2022-06-29 Thread GitBox



gsmiller commented on PR #984:
URL: https://github.com/apache/lucene/pull/984#issuecomment-1170449045

   Thanks @shaie !



[GitHub] [lucene] gsmiller commented on pull request #983: Some refactoring/cleanup of AbstractSortedSetDocValueFacetCounts

2022-06-29 Thread GitBox



gsmiller commented on PR #983:
URL: https://github.com/apache/lucene/pull/983#issuecomment-1170449195

   Thanks @shaie !



[GitHub] [lucene] gsmiller opened a new pull request, #997: Backport GH#983 and GH#984

2022-06-29 Thread GitBox



gsmiller opened a new pull request, #997:
URL: https://github.com/apache/lucene/pull/997

   Using a PR to backport for convenience. No review required.



[GitHub] [lucene] gsmiller merged pull request #997: Backport GH#983 and GH#984

2022-06-29 Thread GitBox



gsmiller merged PR #997:
URL: https://github.com/apache/lucene/pull/997



[GitHub] [lucene] gsmiller commented on a diff in pull request #974: LUCENE-10614: Properly support getTopChildren in RangeFacetCounts

2022-06-29 Thread GitBox



gsmiller commented on code in PR #974:
URL: https://github.com/apache/lucene/pull/974#discussion_r910480915


##
lucene/facet/src/java/org/apache/lucene/facet/range/RangeFacetCounts.java:
##
@@ -232,20 +233,43 @@ public FacetResult getAllChildren(String dim, String... path) throws IOException
     return new FacetResult(dim, path, totCount, labelValues, labelValues.length);
   }
 
-  // The current getTopChildren method is not returning "top" ranges. Instead, it returns all
-  // user-provided ranges in
-  // the order the user specified them when instantiating. This concept is being introduced and
-  // supported in the
-  // getAllChildren functionality in LUCENE-10550. getTopChildren is temporarily calling
-  // getAllChildren to maintain its
-  // current behavior, and the current implementation will be replaced by an actual "top children"
-  // implementation
-  // in LUCENE-10614
-  // TODO: fix getTopChildren in LUCENE-10614
   @Override
   public FacetResult getTopChildren(int topN, String dim, String... path) throws IOException {
     validateTopN(topN);
-    return getAllChildren(dim, path);
+    validateDimAndPathForGetChildren(dim, path);
+
+    int resultSize = Math.min(topN, counts.length);
+    PriorityQueue<LabelAndValue> pq =
+        new PriorityQueue<>(resultSize) {
+          @Override
+          protected boolean lessThan(LabelAndValue a, LabelAndValue b) {
+            int cmp = Integer.compare(a.value.intValue(), b.value.intValue());
+            if (cmp == 0) {
+              cmp = b.label.compareTo(a.label);
+            }
+            return cmp < 0;
+          }
+        };
+
+    for (int i = 0; i < counts.length; i++) {
+      if (pq.size() < resultSize) {
+        pq.add(new LabelAndValue(ranges[i].label, counts[i]));

Review Comment:
   Perfect, thank you!




[GitHub] [lucene] gsmiller commented on a diff in pull request #974: LUCENE-10614: Properly support getTopChildren in RangeFacetCounts

2022-06-29 Thread GitBox



gsmiller commented on code in PR #974:
URL: https://github.com/apache/lucene/pull/974#discussion_r910485371


##
lucene/demo/src/java/org/apache/lucene/demo/facet/DistanceFacetsExample.java:
##
@@ -212,7 +212,26 @@ public static Query getBoundingBoxQuery(
   }
 
   /** User runs a query and counts facets. */
-  public FacetResult search() throws IOException {
+  public FacetResult searchAllChildren() throws IOException {
+
+    FacetsCollector fc = searcher.search(new MatchAllDocsQuery(), new FacetsCollectorManager());
+
+    Facets facets =
+        new DoubleRangeFacetCounts(
+            "field",
+            getDistanceValueSource(),
+            fc,
+            getBoundingBoxQuery(ORIGIN_LATITUDE, ORIGIN_LONGITUDE, 10.0),
+            ONE_KM,
+            TWO_KM,
+            FIVE_KM,
+            TEN_KM);
+
+    return facets.getAllChildren("field");
+  }
+
+  /** User runs a query and counts facets. */
+  public FacetResult searchTopChildren() throws IOException {

Review Comment:
   Yeah maybe. I think if you can come up with a real-world example that has a somewhat high cardinality of children but where you only want a small subset, then building an example around that could be useful. Here's one I just thought of, but maybe you can come up with something else?
   
   What if, as an example, you indexed error messages in a service log so you could do analysis over them? Each document could be an error log entry that contains the log message string and also a timestamp for when it occurred. Then let's say you wanted to find the five one-hour periods that had the most errors over the past week. To do this, you could create 168 ranges (one per one-hour period; 7 * 24 = 168) and facet on them. Then you could ask for the top 5 by count. That would give you the five one-hour periods over the last week with the most errors.
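
   A minimal sketch of the hourly-bucket idea above, written in plain Java rather than against the Lucene facet API. The class name `HourlyErrorTopN`, the method `topHours`, skipping empty hours, and the tie-breaking rule (earlier hour wins) are assumptions made for illustration only; a real demo would presumably build 168 `LongRange` instances and call `getTopChildren` on a `LongRangeFacetCounts`.

   ```java
   import java.util.ArrayList;
   import java.util.Comparator;
   import java.util.List;
   import java.util.PriorityQueue;

   /** Hypothetical illustration: counts events per one-hour range and keeps the top N. */
   final class HourlyErrorTopN {
     static final long HOUR_MS = 3_600_000L;
     static final int HOURS_PER_WEEK = 7 * 24; // 168 one-hour ranges

     /** Returns {hourIndex, count} pairs for the topN busiest hours, highest count first. */
     static List<int[]> topHours(long weekStartMs, long[] errorTimestampsMs, int topN) {
       // One counter per one-hour range; timestamps outside the week are ignored.
       int[] counts = new int[HOURS_PER_WEEK];
       for (long ts : errorTimestampsMs) {
         long offset = ts - weekStartMs;
         if (offset >= 0 && offset < HOURS_PER_WEEK * HOUR_MS) {
           counts[(int) (offset / HOUR_MS)]++;
         }
       }

       // Min-heap ordered "weakest first": lower count first, and on equal counts
       // the later hour is treated as weaker so that earlier hours win ties.
       Comparator<int[]> weakestFirst =
           (a, b) -> a[1] != b[1] ? Integer.compare(a[1], b[1]) : Integer.compare(b[0], a[0]);
       PriorityQueue<int[]> pq = new PriorityQueue<>(weakestFirst);
       for (int hour = 0; hour < counts.length; hour++) {
         if (counts[hour] == 0) {
           continue; // assumption: empty hours are not reported as "top" children
         }
         pq.add(new int[] {hour, counts[hour]});
         if (pq.size() > topN) {
           pq.poll(); // evict the current weakest entry
         }
       }

       List<int[]> result = new ArrayList<>(pq);
       result.sort(weakestFirst.reversed());
       return result;
     }
   }
   ```

   Keeping a bounded heap of size `topN` means only the requested handful of ranges is ever held, however many ranges are counted.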




[GitHub] [lucene] zacharymorn commented on a diff in pull request #972: LUCENE-10480: Use BMM scorer for 2 clauses disjunction

2022-06-29 Thread GitBox



zacharymorn commented on code in PR #972:
URL: https://github.com/apache/lucene/pull/972#discussion_r910551704


##
lucene/core/src/java/org/apache/lucene/search/BlockMaxMaxscoreScorer.java:
##
@@ -0,0 +1,322 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.search;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.Comparator;
+import java.util.LinkedList;
+import java.util.List;
+
+/** Scorer implementing Block-Max Maxscore algorithm */
+public class BlockMaxMaxscoreScorer extends Scorer {
+  // current doc ID of the leads
+  private int doc;
+
+  // doc id boundary that all scorers maxScore are valid
+  private int upTo = -1;

Review Comment:
   Moved `upTo` as well as a few others into constructor.  



##
lucene/core/src/java/org/apache/lucene/search/BlockMaxMaxscoreScorer.java:
##
@@ -0,0 +1,322 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.search;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.Comparator;
+import java.util.LinkedList;
+import java.util.List;
+
+/** Scorer implementing Block-Max Maxscore algorithm */
+public class BlockMaxMaxscoreScorer extends Scorer {
+  // current doc ID of the leads
+  private int doc;
+
+  // doc id boundary that all scorers maxScore are valid
+  private int upTo = -1;
+
+  // heap of scorers ordered by doc ID
+  private final DisiPriorityQueue essentialsScorers;
+  // list of scorers ordered by maxScore
+  private final LinkedList maxScoreSortedEssentialScorers;
+
+  private final DisiWrapper[] allScorers;
+
+  // sum of max scores of scorers in nonEssentialScorers list
+  private float nonEssentialMaxScoreSum;
+
+  private long cost;
+
+  private final MaxScoreSumPropagator maxScoreSumPropagator;
+
+  // scaled min competitive score
+  private float minCompetitiveScore = 0;
+
+  private int cachedScoredDoc = -1;
+  private float cachedScore = 0;
+
+  /**
+   * Constructs a Scorer that scores doc based on Block-Max-Maxscore (BMM) algorithm
+   * http://engineering.nyu.edu/~suel/papers/bmm.pdf . This algorithm has lower overhead compared to
+   * WANDScorer, and could be used for simple disjunction queries.
+   *
+   * @param weight The weight to be used.
+   * @param scorers The sub scorers this Scorer should iterate on for optional clauses
+   */
+  public BlockMaxMaxscoreScorer(Weight weight, List scorers) throws IOException {
+    super(weight);
+
+    this.doc = -1;
+    this.allScorers = new DisiWrapper[scorers.size()];
+    this.essentialsScorers = new DisiPriorityQueue(scorers.size());
+    this.maxScoreSortedEssentialScorers = new LinkedList<>();
+
+    long cost = 0;
+    for (int i = 0; i < scorers.size(); i++) {
+      DisiWrapper w = new DisiWrapper(scorers.get(i));
+      cost += w.cost;
+      allScorers[i] = w;
+    }
+
+    this.cost = cost;
+    maxScoreSumPropagator = new MaxScoreSumPropagator(scorers);
+  }
+
+  @Override
+  public DocIdSetIterator iterator() {
+    // twoPhaseIterator needed to honor scorer.setMinCompetitiveScore guarantee
+    return TwoPhaseIterator.asDocIdSetIterator(twoPhaseIterator());
+  }
+
+  @Override
+  public TwoPhaseIterator twoPhaseIterator() {
+    DocIdSetIterator approximation =
+        new DocIdSetIterator() {
+
+          @Override
+

[GitHub] [lucene] zacharymorn commented on a diff in pull request #972: LUCENE-10480: Use BMM scorer for 2 clauses disjunction

2022-06-29 Thread GitBox



zacharymorn commented on code in PR #972:
URL: https://github.com/apache/lucene/pull/972#discussion_r910551829


##
lucene/core/src/java/org/apache/lucene/search/BlockMaxMaxscoreScorer.java:
##
@@ -0,0 +1,322 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.search;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.Comparator;
+import java.util.LinkedList;
+import java.util.List;
+
+/** Scorer implementing Block-Max Maxscore algorithm */
+public class BlockMaxMaxscoreScorer extends Scorer {
+  // current doc ID of the leads
+  private int doc;
+
+  // doc id boundary that all scorers maxScore are valid
+  private int upTo = -1;
+
+  // heap of scorers ordered by doc ID
+  private final DisiPriorityQueue essentialsScorers;
+  // list of scorers ordered by maxScore
+  private final LinkedList maxScoreSortedEssentialScorers;
+
+  private final DisiWrapper[] allScorers;
+
+  // sum of max scores of scorers in nonEssentialScorers list
+  private float nonEssentialMaxScoreSum;
+
+  private long cost;
+
+  private final MaxScoreSumPropagator maxScoreSumPropagator;
+
+  // scaled min competitive score
+  private float minCompetitiveScore = 0;
+
+  private int cachedScoredDoc = -1;
+  private float cachedScore = 0;
+
+  /**
+   * Constructs a Scorer that scores doc based on Block-Max-Maxscore (BMM) algorithm
+   * http://engineering.nyu.edu/~suel/papers/bmm.pdf . This algorithm has lower overhead compared to
+   * WANDScorer, and could be used for simple disjunction queries.
+   *
+   * @param weight The weight to be used.
+   * @param scorers The sub scorers this Scorer should iterate on for optional clauses
+   */
+  public BlockMaxMaxscoreScorer(Weight weight, List scorers) throws IOException {
+    super(weight);
+
+    this.doc = -1;
+    this.allScorers = new DisiWrapper[scorers.size()];
+    this.essentialsScorers = new DisiPriorityQueue(scorers.size());
+    this.maxScoreSortedEssentialScorers = new LinkedList<>();
+
+    long cost = 0;
+    for (int i = 0; i < scorers.size(); i++) {
+      DisiWrapper w = new DisiWrapper(scorers.get(i));
+      cost += w.cost;
+      allScorers[i] = w;
+    }
+
+    this.cost = cost;
+    maxScoreSumPropagator = new MaxScoreSumPropagator(scorers);
+  }
+
+  @Override
+  public DocIdSetIterator iterator() {
+    // twoPhaseIterator needed to honor scorer.setMinCompetitiveScore guarantee
+    return TwoPhaseIterator.asDocIdSetIterator(twoPhaseIterator());
+  }
+
+  @Override
+  public TwoPhaseIterator twoPhaseIterator() {
+    DocIdSetIterator approximation =
+        new DocIdSetIterator() {
+
+          @Override
+          public int docID() {
+            return doc;
+          }
+
+          @Override
+          public int nextDoc() throws IOException {
+            return advance(doc + 1);
+          }
+
+          @Override
+          public int advance(int target) throws IOException {
+            while (true) {
+
+              if (target > upTo) {
+                updateMaxScoresAndLists(target);
+              } else {
+                // minCompetitiveScore might have increased,
+                // move potentially no-longer-competitive scorers from essential to non-essential
+                // list
+                movePotentiallyNonCompetitiveScorers();
+              }
+
+              assert target <= upTo;
+
+              DisiWrapper top = essentialsScorers.top();
+
+              if (top == null) {
+                // all scorers in non-essential list, skip to next boundary or return no_more_docs
+                if (upTo == NO_MORE_DOCS) {
+                  return doc = NO_MORE_DOCS;
+                } else {
+                  target = upTo + 1;
+                }
+              } else {
+                // position all scorers in essential list to on or after target
+                while (top.doc < target) {
+                  top.doc = top.iterator.advance(target);
+                  top = essentialsScorers.updateTop();
+                }
+
+                if (top.doc == NO_MORE_DOCS) {
+                  return doc = NO_MORE_DOCS;
+                } else if (

[GitHub] [lucene] zacharymorn commented on a diff in pull request #972: LUCENE-10480: Use BMM scorer for 2 clauses disjunction

2022-06-29 Thread GitBox



zacharymorn commented on code in PR #972:
URL: https://github.com/apache/lucene/pull/972#discussion_r910552757


##
lucene/core/src/java/org/apache/lucene/search/BlockMaxMaxscoreScorer.java:
##

[GitHub] [lucene] zacharymorn commented on a diff in pull request #972: LUCENE-10480: Use BMM scorer for 2 clauses disjunction

2022-06-29 Thread GitBox



zacharymorn commented on code in PR #972:
URL: https://github.com/apache/lucene/pull/972#discussion_r910552971


##
lucene/core/src/java/org/apache/lucene/search/BlockMaxMaxscoreScorer.java:
##

[GitHub] [lucene] zacharymorn commented on pull request #972: LUCENE-10480: Use BMM scorer for 2 clauses disjunction

2022-06-29 Thread GitBox



zacharymorn commented on PR #972:
URL: https://github.com/apache/lucene/pull/972#issuecomment-1170684358

   
   > With this change, I suspect that some scorers created in `TestWANDScorer` would now use your new `BlockMaxMaxscoreScorer`, which is going to decrease the coverage of `WANDScorer`. Can we somehow make sure that `TestWANDScorer` always gets a `WANDScorer`? E.g. I spotted this query under `TestWANDScorer#testBasics` which likely uses your new scorer:
   > 
   > ```java
   > //  test a filtered disjunction
   > query =
   >     new BooleanQuery.Builder()
   >         .add(
   >             new BooleanQuery.Builder()
   >                 .add(
   >                     new BoostQuery(
   >                         new ConstantScoreQuery(new TermQuery(new Term("foo", "A"))), 2),
   >                     Occur.SHOULD)
   >                 .add(new ConstantScoreQuery(new TermQuery(new Term("foo", "B"))), Occur.SHOULD)
   >                 .build(),
   >             Occur.MUST)
   >         .add(new TermQuery(new Term("foo", "C")), Occur.FILTER)
   >         .build();
   > ```
   
   Yeah this is a good question. In my newly added tests I have used something like this to confirm it's testing the right scorer, but I'm not totally happy about this approach myself:
   ```java
   if (scorer instanceof AssertingScorer) {
     assertTrue(((AssertingScorer) scorer).getIn() instanceof BlockMaxMaxscoreScorer);
   } else {
     assertTrue(scorer instanceof BlockMaxMaxscoreScorer);
   }
   ```
   
   One alternative approach could be instantiating `WANDScorer` directly inside the test for lower-level tests, and moving the higher-level tests into another test class that doesn't care about the specific scorer implementation for disjunction. This may require duplicating some code from `BooleanWeight`, `AssertingWeight`, etc., but should be doable.
   
   On the other hand, if we don't plan on instantiating `WANDScorer` directly in the test, varying the query clauses and asserting like above might be the best we could do, I feel. This has the potential test-coverage decrease issue you suggested, so it may not be ideal either.
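
   For illustration only, the unwrap-then-check pattern described above could be pulled into a small helper. This is a sketch using stand-in types: `DemoScorer`, `DemoAssertingScorer`, `DemoBmmScorer`, `DemoWandScorer` and `ScorerUnwrap` are hypothetical names, not Lucene classes, and only the `getIn()` accessor mirrors what the snippet above calls on `AssertingScorer`.

   ```java
   /** Hypothetical stand-ins for the scorer types discussed above; not Lucene classes. */
   interface DemoScorer {}

   final class DemoBmmScorer implements DemoScorer {}

   final class DemoWandScorer implements DemoScorer {}

   /** Stand-in for a test-framework wrapper that delegates to an inner scorer. */
   final class DemoAssertingScorer implements DemoScorer {
     private final DemoScorer in;

     DemoAssertingScorer(DemoScorer in) {
       this.in = in;
     }

     DemoScorer getIn() {
       return in;
     }
   }

   final class ScorerUnwrap {
     /** Strips any number of asserting wrappers and returns the innermost scorer. */
     static DemoScorer unwrap(DemoScorer scorer) {
       while (scorer instanceof DemoAssertingScorer) {
         scorer = ((DemoAssertingScorer) scorer).getIn();
       }
       return scorer;
     }

     /** True if the scorer, once unwrapped, is the BMM stand-in. */
     static boolean isBmm(DemoScorer scorer) {
       return unwrap(scorer) instanceof DemoBmmScorer;
     }
   }
   ```

   A test would then make a single assertion on `isBmm(scorer)` instead of branching on the wrapper type at every call site.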



[jira] [Created] (LUCENE-10632) Change getAllChildren to return all children regardless of the count

2022-06-29 Thread Yuting Gan (Jira)

Yuting Gan created LUCENE-10632:
---

 Summary: Change getAllChildren to return all children regardless of the count
 Key: LUCENE-10632
 URL: https://issues.apache.org/jira/browse/LUCENE-10632
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Yuting Gan


Currently, the getAllChildren functionality is implemented in a way that is similar to getTopChildren: both only return children with a count greater than zero.

However, the original getTopChildren in RangeFacetCounts returned all children whether or not the count was zero. This has good use cases, and we should continue supporting the behavior in getAllChildren so that we do not lose it after properly supporting getTopChildren in RangeFacetCounts.

As discussed with [~gsmiller] in the [LUCENE-10614 pr|https://github.com/apache/lucene/pull/974], allowing getAllChildren to behave differently from getTopChildren can actually be more helpful for users. If users want only children with a positive count, getTopChildren already supports this behavior. Therefore, the getAllChildren API should return all children in all of the implementations, whether or not the count is zero.
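
A minimal sketch of the proposed distinction, in plain Java rather than the Lucene facet classes. The class `ChildrenSemantics`, its two methods, and the tie-breaking rule (label ascending) are assumptions for illustration; only the zero-count behavior is taken from the description above.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of the two behaviors described in this issue. */
final class ChildrenSemantics {

  /** Returns every child in the order given, including children whose count is zero. */
  static List<Map.Entry<String, Integer>> allChildren(String[] labels, int[] counts) {
    List<Map.Entry<String, Integer>> out = new ArrayList<>();
    for (int i = 0; i < labels.length; i++) {
      out.add(new SimpleImmutableEntry<>(labels[i], counts[i]));
    }
    return out;
  }

  /** Returns at most topN children with a positive count, highest count first. */
  static List<Map.Entry<String, Integer>> topChildren(String[] labels, int[] counts, int topN) {
    List<Map.Entry<String, Integer>> out = new ArrayList<>();
    for (int i = 0; i < labels.length; i++) {
      if (counts[i] > 0) {
        out.add(new SimpleImmutableEntry<>(labels[i], counts[i]));
      }
    }
    out.sort(
        (a, b) -> {
          int cmp = Integer.compare(b.getValue(), a.getValue()); // higher count first
          return cmp != 0 ? cmp : a.getKey().compareTo(b.getKey()); // then label ascending
        });
    return new ArrayList<>(out.subList(0, Math.min(topN, out.size())));
  }
}
```

With labels such as "0-1km", "1-2km", "2-5km", "5-10km" and counts 3, 0, 7, 0, `allChildren` would report all four ranges, while `topChildren` would report only "2-5km" and "0-1km".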



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

