date:20220620

[GitHub] [lucene] jpountz commented on pull request #964: LUCENE-10620: Pass the Weight to Collectors.

2022-06-20 Thread GitBox



jpountz commented on PR #964:
URL: https://github.com/apache/lucene/pull/964#issuecomment-1160079281

   Now when collectors need to count hits too (I changed IndexSearcher's 
`TOTAL_HITS_THRESHOLD` to `Integer.MAX_VALUE`):
   
   ```
                          Task  QPS baseline   StdDev  QPS my_modified_version   StdDev                 Pct diff  p-value
                     OrHighLow         90.31   (6.7%)                    35.49   (1.8%)   -60.7% ( -64% -  -55%)    0.000
                    OrHighHigh         40.98   (5.6%)                    21.17   (2.4%)   -48.3% ( -53% -  -42%)    0.000
                  OrHighNotLow        143.61   (8.2%)                    76.04   (5.4%)   -47.1% ( -56% -  -36%)    0.000
                  OrHighNotMed         88.77   (7.7%)                    49.41   (5.6%)   -44.3% ( -53% -  -33%)    0.000
                 OrHighNotHigh         18.24   (7.4%)                    10.59   (5.9%)   -41.9% ( -51% -  -30%)    0.000
                     OrHighMed         80.82   (5.0%)                    48.18   (2.8%)   -40.4% ( -45% -  -34%)    0.000
                 OrNotHighHigh         51.35   (5.7%)                    39.11   (5.6%)   -23.8% ( -33% -  -13%)    0.000
                   AndHighHigh         53.49   (1.9%)                    41.97   (4.1%)   -21.5% ( -27% -  -15%)    0.000
                    AndHighMed        321.43   (2.4%)                   258.39   (4.5%)   -19.6% ( -25% -  -13%)    0.000
                    AndHighLow       1777.06   (2.7%)                  1474.52   (3.1%)   -17.0% ( -22% -  -11%)    0.000
                     MedPhrase        391.41   (5.9%)                   332.93   (5.1%)   -14.9% ( -24% -   -4%)    0.000
                  OrNotHighMed        313.44   (6.7%)                   269.25   (5.3%)   -14.1% ( -24% -   -2%)    0.000
                  OrNotHighLow       1977.65   (4.2%)                  1803.88   (4.7%)    -8.8% ( -16% -    0%)    0.000
      AndHighHighDayTaxoFacets         25.28   (1.7%)                    23.30   (1.9%)    -7.8% ( -11% -   -4%)    0.000
          MedTermDayTaxoFacets         79.97   (2.6%)                    74.42   (3.6%)    -6.9% ( -12% -    0%)    0.000
                       Prefix3         27.72   (6.2%)                    25.83   (5.2%)    -6.8% ( -17% -    4%)    0.000
                     LowPhrase        159.63   (5.0%)                   148.90   (3.3%)    -6.7% ( -14% -    1%)    0.000
        OrHighMedDayTaxoFacets         19.30   (5.7%)                    18.11   (4.2%)    -6.2% ( -15% -    3%)    0.000
                    HighPhrase         16.15   (5.7%)                    15.30   (4.6%)    -5.2% ( -14% -    5%)    0.001
                      Wildcard         79.98   (2.3%)                    76.50   (3.0%)    -4.4% (  -9% -    1%)    0.000
       AndHighMedDayTaxoFacets         72.60   (2.1%)                    69.79   (1.9%)    -3.9% (  -7% -    0%)    0.000
                  HighSpanNear         44.71   (4.9%)                    43.00   (4.7%)    -3.8% ( -12% -    6%)    0.012
     BrowseDayOfYearTaxoFacets         47.80   (2.0%)                    46.05  (12.3%)    -3.7% ( -17% -   10%)    0.189
                        Fuzzy2        103.64   (2.1%)                   100.39   (1.7%)    -3.1% (  -6% -    0%)    0.000
          BrowseDateTaxoFacets         46.13   (1.9%)                    44.69  (12.0%)    -3.1% ( -16% -   10%)    0.249
   BrowseRandomLabelTaxoFacets         37.71   (2.1%)                    36.53  (10.5%)    -3.1% ( -15% -    9%)    0.195
                   MedSpanNear         68.62   (3.0%)                    66.69   (3.0%)    -2.8% (  -8% -    3%)    0.003
                   LowSpanNear         57.05   (3.0%)                    55.49   (2.8%)    -2.7% (  -8% -    3%)    0.003
         BrowseMonthTaxoFacets         29.68   (7.3%)                    28.87  (12.8%)    -2.7% ( -21% -   18%)    0.410
                        Fuzzy1        128.59   (2.2%)                   125.27   (1.8%)    -2.6% (  -6% -    1%)    0.000
           LowIntervalsOrdered        219.66   (4.2%)                   216.18   (3.4%)    -1.6% (  -8% -    6%)    0.184
          HighIntervalsOrdered         35.55   (5.7%)                    35.03   (4.3%)    -1.5% ( -10% -    9%)    0.361
              HighSloppyPhrase          8.33  (15.0%)                     8.22  (13.6%)    -1.3% ( -25% -   32%)    0.775
     BrowseDayOfYearSSDVFacets         21.93   (9.9%)                    21.80   (9.3%)    -0.6% ( -17% -   20%)    0.841
         BrowseMonthSSDVFacets         23.61   (8.5%)                    23.53   (8.2%)    -0.3% ( -15% -   17%)    0.904
                       Respell         77.59   (2.0%)                    77.42   (2.3%)    -0.2% (  -4% -    4%)    0.740
   BrowseRandomLabelSSDVFacets         15.20   (5.5%)                    15.19   (5.7%)    -0.1% ( -10% -   11%)    0.971
           MedIntervalsOrdered         43.08   (5.0%)                    43.14   (4.6%)     0.1% (  -9% -   10%)    0.934
               LowSloppyPhrase         54.27  (10.7%)                    54.76  (10.1%)     0.9% ( -17% -   24%)    0.782
          BrowseDateSSDVFacets          4.21  (12.4%)                     4.26  (11.8%)     1.0% ( -20% -   28%)    0.784
                       PKLooku
   ```

[GitHub] [lucene] jpountz commented on pull request #964: LUCENE-10620: Pass the Weight to Collectors.

2022-06-20 Thread GitBox



jpountz commented on PR #964:
URL: https://github.com/apache/lucene/pull/964#issuecomment-1160179472

   Unfortunately this is challenging to do right at the moment, since the API 
requires the collector to declare the `ScoreMode` it needs before the `Weight` 
can be created. So either the collector says it needs to evaluate all hits 
(`ScoreMode.COMPLETE`), and then we cannot skip hits when the weight 
can count its hits efficiently; or it says it doesn't (`ScoreMode.TOP_SCORES`), 
as the PR does at the moment, and then queries get slower when the weight 
cannot count hits. We could fix this by moving the score mode to 
`LeafCollector` instead of `Collector`, but this would be a big change...
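
   The ordering problem described above can be illustrated with a toy model. 
This is only a sketch and not Lucene code: `ToyScoreMode` and `plan` are 
hypothetical names, and the returned strings merely label the four cases. The 
point it tries to show is that the mode is chosen before anyone knows whether 
the weight can count its hits cheaply, so one case per mode is suboptimal.

```java
/** Toy model, not Lucene code: the score mode is fixed before the weight is known. */
class ScoreModeDilemma {

    /** Hypothetical stand-in for the two score modes named in the comment. */
    enum ToyScoreMode { COMPLETE, TOP_SCORES }

    /**
     * The collector commits to a mode first; whether the weight can count its
     * hits cheaply is only known afterwards.
     */
    static String plan(ToyScoreMode chosenByCollector, boolean weightCanCountCheaply) {
        if (chosenByCollector == ToyScoreMode.COMPLETE) {
            // All hits are evaluated, so skipping is off the table even if the
            // weight could have supplied the count on its own.
            return weightCanCountCheaply
                ? "evaluate all hits (missed chance to skip)"
                : "evaluate all hits (needed to count)";
        }
        // TOP_SCORES: fine when the weight supplies the count; otherwise the
        // count has to be obtained some other way, the slow case in the comment.
        return weightCanCountCheaply
            ? "skip hits, take the count from the weight"
            : "count hits some other way (slow case)";
    }

    public static void main(String[] args) {
        for (ToyScoreMode mode : ToyScoreMode.values()) {
            for (boolean cheap : new boolean[] {true, false}) {
                System.out.println(mode + ", cheap count=" + cheap + " -> " + plan(mode, cheap));
            }
        }
    }
}
```

   If the mode were chosen per leaf, as suggested, the decision could be made 
after looking at the weight, and only the two favourable cases would remain.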


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-20 Thread Tomoko Uchida (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17556288#comment-17556288
 ] 

Tomoko Uchida commented on LUCENE-10557:


> I've added a few bullet points that script could/should handle under 
> LUCENE-10557, hope you don't mind. If you place these script(s) in the open 
> then perhaps indeed we could try to collaborate and see what can be done.

Thanks for your suggestions, Dawid. I'd like to move the conversation from the 
mailing list to this issue. I think we'll be able to handle the requirements 
(cross-issue links, and so on) in some way. I started work on LUCENE-10622 and 
added a link to the sandbox repository where an early draft of the migration 
scripts was pushed.

For what it's worth, LUCENE-1 would be migrated to something like the issue 
linked below. Although the formatting and look-and-feel could be improved a bit, 
the essentials would not change drastically. We cannot fully reproduce Jira 
issues on GitHub. For example, we are not allowed to set the issue reporter or 
the timestamp (very basic metadata to me), so they have to be embedded in the 
issue description as free text. I'll continue to work on it, though. Does this 
really meet your expectations, Mike McCandless and other folks who argue for 
preserving all issue history in GitHub?
https://github.com/mocobeta/sandbox-lucene-10557/issues/19

> Migrate to GitHub issue from Jira
> -
>
> Key: LUCENE-10557
> URL: https://issues.apache.org/jira/browse/LUCENE-10557
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Tomoko Uchida
>Assignee: Tomoko Uchida
>Priority: Major
>
> A few (not the majority) Apache projects already use the GitHub issue instead 
> of Jira. For example,
> Airflow: [https://github.com/apache/airflow/issues]
> BookKeeper: [https://github.com/apache/bookkeeper/issues]
> So I think it'd be technically possible that we move to GitHub issue. I have 
> little knowledge of how to proceed with it, I'd like to discuss whether we 
> should migrate to it, and if so, how to smoothly handle the migration.
> The major tasks would be:
>  * (/) Get a consensus about the migration among committers
>  * Choose issues that should be moved to GitHub
>  ** Discussion thread 
> [https://lists.apache.org/thread/1p3p90k5c0d4othd2ct7nj14bkrxkr12]
>  ** -Conclusion for now: We don't migrate any issues. Only new issues should 
> be opened on GitHub.-
>  ** Write a prototype migration script - the decision could be made on that. 
> Things to consider:
>  *** version numbers - labels or milestones?
>  *** add a comment/ prepend a link to the source Jira issue on github side,
>  *** add a comment/ prepend a link on the jira side to the new issue on 
> github side (for people who access jira from blogs, mailing list archives and 
> other sources that will have stale links),
>  *** convert cross-issue automatic links in comments/ descriptions (as 
> suggested by Robert),
>  *** strategy to deal with sub-issues (hierarchies),
>  *** maybe prefix (or postfix) the issue title on github side with the 
> original LUCENE-XYZ key so that it is easier to search for a particular issue 
> there?
>  *** how to deal with user IDs (author, reporter, commenters)? Do they have 
> to be github users? Will information about people not registered on github be 
> lost?
>  *** create an extra mapping file of old-issue-new-issue URLs for any 
> potential future uses. 
>  *** what to do with issue numbers in git/svn commits? These could be 
> rewritten but it'd change the entire git history tree - I don't think this is 
> practical, while doable.
>  * Build the convention for issue label/milestone management
>  ** Do some experiments on a sandbox repository 
> [https://github.com/mocobeta/sandbox-lucene-10557]
>  ** Make documentation for metadata (label/milestone) management 
>  * Enable Github issue on the lucene's repository
>  ** Raise an issue on INFRA
>  ** (Create an issue-only private repository for sensitive issues if it's 
> needed and allowed)
>  ** Set a mail hook to 
> [issues@lucene.apache.org|mailto:issues@lucene.apache.org] (many thanks to 
> the general mail group name)
>  * Set a schedule for migration
>  ** Give some time to committers to play around with issues/labels/milestones 
> before the actual migration
>  ** Make an announcement on the mail lists
>  ** Show some text messages when opening a new Jira issue (in issue template?)



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene] jpountz commented on pull request #967: LUCENE-10623: Error implementation of docValueCount for SortingSortedSetDocValues

2022-06-20 Thread GitBox



jpountz commented on PR #967:
URL: https://github.com/apache/lucene/pull/967#issuecomment-1160204024

   Thanks for catching this bug. The fix is a bit wasteful in that it requires 
iterating over ords twice, once to count them and another time to iterate 
through them. Maybe we should change `DocOrds` to also record the number of 
ords for each doc (e.g. using a `GrowableWriter`), and stop recording zeroes to 
signal that all ords for a document have been consumed?
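
   A minimal sketch of that idea, under stated assumptions: `ToyDocOrds` is a 
hypothetical class, not Lucene's `DocOrds`, and plain `int[]`/`long[]` arrays 
stand in for `GrowableWriter` and the packed storage. It only shows that a 
per-document count makes both the sentinel and the counting pass unnecessary.

```java
/** Toy sketch, not Lucene's DocOrds: per-document counts replace the zero sentinel. */
class ToyDocOrds {
    private final long[] ords;   // all ords, concatenated in document order
    private final int[] starts;  // starts[doc] = index of the first ord of doc
    private final int[] counts;  // counts[doc] = number of ords of doc

    ToyDocOrds(long[][] ordsPerDoc) {
        int docCount = ordsPerDoc.length;
        starts = new int[docCount];
        counts = new int[docCount];
        int total = 0;
        for (int doc = 0; doc < docCount; doc++) {
            starts[doc] = total;
            counts[doc] = ordsPerDoc[doc].length;
            total += counts[doc];
        }
        ords = new long[total];
        for (int doc = 0; doc < docCount; doc++) {
            System.arraycopy(ordsPerDoc[doc], 0, ords, starts[doc], counts[doc]);
        }
    }

    /** Number of ords for doc, read directly instead of iterating to a sentinel. */
    int docValueCount(int doc) {
        return counts[doc];
    }

    /** The i-th ord of doc, for 0 <= i < docValueCount(doc). */
    long ord(int doc, int i) {
        return ords[starts[doc] + i];
    }

    public static void main(String[] args) {
        ToyDocOrds docOrds = new ToyDocOrds(new long[][] {{0, 3}, {}, {5}});
        for (int doc = 0; doc < 3; doc++) {
            System.out.println("doc " + doc + " has " + docOrds.docValueCount(doc) + " ords");
        }
    }
}
```

   Since no value is reserved as a terminator, ord 0 can be stored as-is in 
this sketch.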



[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-20 Thread Dawid Weiss (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17556306#comment-17556306
 ] 

Dawid Weiss commented on LUCENE-10557:
--

I've verified that searches for old issue numbers seem to work:
https://github.com/mocobeta/sandbox-lucene-10557/search?q=%22LUCENE-1%22+in%3Atitle&type=issues

I'm more familiar with "hierarchical" tags like "affects/xyz" or "type/bug", 
but I can live with the comma version. It's good to have some of the metadata 
transferred as well, even as plain text in the issue description.


[GitHub] [lucene] jpountz merged pull request #965: LUCENE-10618: Implement BooleanQuery rewrite rules based for minimumShouldMatch

2022-06-20 Thread GitBox



jpountz merged PR #965:
URL: https://github.com/apache/lucene/pull/965



[jira] [Commented] (LUCENE-10618) Implement BooleanQuery rewrite rules based for minimumShouldMatch

2022-06-20 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-10618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17556345#comment-17556345
 ] 

ASF subversion and git services commented on LUCENE-10618:
--

Commit bb1b3dce04c06e7533b2ff418b8b7c2544534e24 in lucene's branch 
refs/heads/branch_9x from JoeHF
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=bb1b3dce04c ]

LUCENE-10618: Implement BooleanQuery rewrite rules based for minimumShouldMatch 
(#965)



> Implement BooleanQuery rewrite rules based for minimumShouldMatch
> -
>
> Key: LUCENE-10618
> URL: https://issues.apache.org/jira/browse/LUCENE-10618
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Minor
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> While looking into a test failure I noticed that we sometimes create weights 
> for boolean queries with no SHOULD clauses and a non-zero 
> minimumNumberShouldMatch.
> We could rewrite BooleanQuery to MatchNoDocsQuery when the number of SHOULD 
> clauses is less than minimumNumberShouldMatch, and make SHOULD clauses 
> required when the number of SHOULD clauses is equal to 
> minimumNumberShouldMatch.
> This feels a bit like a degenerate case (why would the user create such a 
> query in the first place?) but this case can also happen to non-degenerate 
> queries if some SHOULD clauses rewrite to a MatchNoDocsQuery and get removed 
> through rewrite.
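
The two rules in the description can be sketched as a small decision function. 
This is an illustration only, not the code merged in #965: the class, the 
`Outcome` enum, and the method name are hypothetical, and the `> 0` guard for 
the case with no SHOULD clauses and a zero minimum is an assumption made here.

```java
/** Toy sketch of the two rewrite rules; not Lucene's BooleanQuery. */
class MinShouldMatchRewrite {

    /** Hypothetical labels for what a rewrite would produce. */
    enum Outcome { MATCH_NO_DOCS, SHOULD_BECOME_REQUIRED, UNCHANGED }

    static Outcome rewrite(int shouldClauseCount, int minimumNumberShouldMatch) {
        if (shouldClauseCount < minimumNumberShouldMatch) {
            // Fewer SHOULD clauses than must match: nothing can ever match.
            return Outcome.MATCH_NO_DOCS;
        }
        if (minimumNumberShouldMatch > 0 && shouldClauseCount == minimumNumberShouldMatch) {
            // Every SHOULD clause has to match, so each one is effectively required.
            return Outcome.SHOULD_BECOME_REQUIRED;
        }
        // More SHOULD clauses than the minimum (or no minimum at all): leave as-is.
        return Outcome.UNCHANGED;
    }

    public static void main(String[] args) {
        System.out.println(rewrite(0, 1));
        System.out.println(rewrite(2, 2));
        System.out.println(rewrite(3, 2));
    }
}
```

The first call models the case from the description: no SHOULD clauses left 
after rewriting, with a non-zero minimum.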




[jira] [Resolved] (LUCENE-10618) Implement BooleanQuery rewrite rules based for minimumShouldMatch

2022-06-20 Thread Adrien Grand (Jira)



 [ 
https://issues.apache.org/jira/browse/LUCENE-10618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand resolved LUCENE-10618.
---
Fix Version/s: 9.3
   Resolution: Fixed

Thanks [~joe hou]!


[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-20 Thread Tomoko Uchida (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17556362#comment-17556362
 ] 

Tomoko Uchida commented on LUCENE-10557:


As for user ID alignment, it'd be great if we could map the 
reporter/assignee/commenter to the correct GitHub accounts. I just wanted to note 
a trivial but very practical concern: to create a hyperlink we have to 
"mention" the accounts in the issue description/comment 
(we can't create resources on behalf of the original authors), and I don't think we 
want to receive a huge volume of notifications from old issues. Unless there is 
a tip or workaround, we will not be able to create real links and will 
just have markup like `@mocobeta`.
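
One possible shape for the plain-markup fallback, as a rough sketch: wrap each 
account name in a code span while converting text. `MentionEscaper` and 
`neutralize` are hypothetical names, the handle pattern is simplified, and the 
sketch assumes that an `@name` inside a code span is not treated as a mention.

```java
import java.util.regex.Pattern;

/** Sketch only: turns "@name" into "`@name`" so it stays plain markup. */
class MentionEscaper {

    // "@" followed by a simplified handle. The lookbehind skips e-mail
    // addresses (word character before "@") and names already in backticks.
    private static final Pattern MENTION =
        Pattern.compile("(?<![\\w`])@([A-Za-z0-9][A-Za-z0-9-]*)");

    static String neutralize(String text) {
        return MENTION.matcher(text).replaceAll("`@$1`");
    }

    public static void main(String[] args) {
        System.out.println(neutralize("Reported by @mocobeta, cc issues@lucene.apache.org"));
    }
}
```

Whether this really avoids notifications would need to be confirmed on the 
sandbox repository before relying on it.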


[GitHub] [lucene] LuXugang commented on pull request #967: LUCENE-10623: Error implementation of docValueCount for SortingSortedSetDocValues

2022-06-20 Thread GitBox



LuXugang commented on PR #967:
URL: https://github.com/apache/lucene/pull/967#issuecomment-1160497729

   > Thanks for catching this bug. The fix is a bit wasteful in that it 
requires iterating over ords twice, once to count them and another time to 
iterate through them. Maybe we should change `DocOrds` to also record the 
number of ords for each doc (e.g. using a `GrowableWriter`), and stop recording 
zeroes to signal that all ords for a document have been consumed?
   
   Thanks for your suggestion, @jpountz. Removing the sentinel value zero and 
using `GrowableWriter` could make the code more readable!



[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-20 Thread Tomoko Uchida (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17556475#comment-17556475
 ] 

Tomoko Uchida commented on LUCENE-10557:


I browsed through several JSON dumps of Jira issues. Here are some 
observations.
 - It'd be easy to extract various issue metadata (reporter id, status, 
created timestamp, etc.)
 - It'd be easy to extract all linked issue ids and sub-task ids
 - It'd be easy to extract all attached file URLs
 -- I can't estimate how many hours it will take to download all of the files
 - It'd be easy to extract all comments in an issue
 -- Pagination may be needed for issues with many comments
 - We can apply parser/converter tools to convert the Jira markup to Markdown
 -- I think this can be error-prone
 - It'd be cumbersome to extract GitHub PR links. The links to PRs only appear 
in the GitHub bot's comments in the Work Log.

On the GitHub side, there are no difficulties in dealing with the APIs.
 - It'd be a bit tedious to work with milestones via the APIs. They can't be 
referred to by their text, so an id-to-text mapping is needed
 - It might take some trial and error to place attached files in 
the right place

As for the cross-link conversion and account mapping script:
 - To "embed" GitHub issue links / accounts in the right place (maybe next to 
the Jira issue keys / user names), we need to modify the original text. This 
can be tricky and is the riskiest part to me. Instead of modifying the original 
text, we could just add footnotes to the issues/comments, but that could 
considerably damage readability.

Yes, it should be possible with a set of small scripts. One problem may be 
that it'd be difficult to detect conversion errors/omissions, and we can't 
correct them ourselves if we notice migration errors later (it seems we will not 
be allowed to have the GitHub tokens of the ASF repository).
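
As an illustration of the "embed next to the Jira key" option, here is a 
minimal sketch. `JiraKeyLinker`, `link`, and the mapping contents are 
hypothetical; the real scripts may work quite differently. Keys without a 
mapping are left untouched, so omissions stay visible in the converted text.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch only: appends a GitHub issue reference after each known Jira key. */
class JiraKeyLinker {

    private static final Pattern JIRA_KEY = Pattern.compile("\\bLUCENE-(\\d+)\\b");

    /** jiraToGitHub maps a Jira key such as "LUCENE-1" to a GitHub issue number. */
    static String link(String text, Map<String, Integer> jiraToGitHub) {
        Matcher m = JIRA_KEY.matcher(text);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            Integer issue = jiraToGitHub.get(m.group());
            // Keep the original key; add " (#N)" only when a mapping exists.
            String replacement = issue == null ? m.group() : m.group() + " (#" + issue + ")";
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Illustrative mapping, not real migration data.
        Map<String, Integer> mapping = Map.of("LUCENE-1", 19);
        System.out.println(link("See LUCENE-1 and LUCENE-99999.", mapping));
    }
}
```

Appending rather than replacing keeps the original key searchable, which is one 
way to limit the risk of modifying the original text.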


[jira] [Comment Edited] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-20 Thread Tomoko Uchida (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17556475#comment-17556475
 ] 

Tomoko Uchida edited comment on LUCENE-10557 at 6/20/22 5:20 PM:
-

I browsed through several JSON dumps of Jira issues. Here are some 
observations.
 - It'd be easy to extract various issue metadata (reporter id, status, 
created timestamp, etc.)
 - It'd be easy to extract all linked issue ids and sub-task ids
 - It'd be easy to extract all attached file URLs
 -- I can't estimate how many hours it will take to download all of the files
 - It'd be easy to extract all comments in an issue
 -- Pagination may be needed for issues with many comments
 - We can apply parser/converter tools to convert the Jira markup to Markdown
 -- I think this can be error-prone
 - It'd be cumbersome to extract GitHub PR links. The links to PRs only appear 
in the GitHub bot's comments in the Work Log.

On the GitHub side, there are no difficulties in dealing with the APIs.
 - It'd be a bit tedious to work with milestones via the APIs. They can't be 
referred to by their text, so an id-to-text mapping is needed
 - It might take some trial and error to place attached files in 
the right place

As for the cross-link conversion and account mapping script:
 - To "embed" GitHub issue links / accounts in the right place (maybe next to 
the Jira issue keys / user names), we need to modify the original text. This 
can be tricky and is the riskiest part to me. Instead of modifying the original 
text, we could just add footnotes to the issues/comments, but that could 
considerably damage readability.

Yes, it should be possible with a set of small scripts. One problem may be 
that it'd be difficult to detect conversion errors/omissions, and we can't 
correct them ourselves if we notice migration errors later (it seems we will not 
be allowed to have the GitHub tokens of the ASF repository).


was (Author: tomoko uchida):
I browsed through several JSON dumps of Jira issues. These are some 
observations.
 - It'd be easy to extract various metadata of issues (reporter id, status, 
created timestamp, etc.)
 - It'd be easy to extract all linked issue ids and sub-task ids
 - It'd be easy to extract all attached file URLs

 -- Can't estimate how many hours it will take to download all of the files
 - it'd be easy to extract all comments in an issue
 -- Perhaps pagination is needed for issues with many comments
 - We can apply parser/converter tools to convert the jira markups to markdown
 -- I think this can be error-prone
 - It'd be cumbersome to extract GitHub PR links. The links to PRs only appear 
in the github bot's comments in the Work Log.

On the GitHub side, there are no difficulties in dealing with the APIs.
 - It'd be a bit tedious to work with milestones via the APIs. They can't be 
referred to by their text, so an ID-to-text mapping is needed.
 - It might take some trial and error to place attached files in the right 
place.

As for the cross-link conversion and account mapping script:
 - To "embed" GitHub issue links / accounts in the right place (maybe next to 
the Jira issue keys / user names), we need to modify the original text. This 
can be tricky and is the riskiest part to me. Instead of modifying the original 
text, we could just add footnotes for the issues/comments, but that could 
considerably damage readability.

Yes, it should be possible with a set of small scripts. One possible problem is 
that it'd be difficult to detect conversion errors/omissions, and we can't 
correct them ourselves if we notice migration errors later (it seems we are not 
allowed to have the GitHub tokens of the ASF repository).

> Migrate to GitHub issue from Jira
> -
>
> Key: LUCENE-10557
> URL: https://issues.apache.org/jira/browse/LUCENE-10557
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Tomoko Uchida
>Assignee: Tomoko Uchida
>Priority: Major
>
> A few (not the majority) Apache projects already use GitHub issues instead 
> of Jira. For example,
> Airflow: [https://github.com/apache/airflow/issues]
> BookKeeper: [https://github.com/apache/bookkeeper/issues]
> So I think it'd be technically possible for us to move to GitHub issues. I have 
> little knowledge of how to proceed with it, so I'd like to discuss whether we 
> should migrate, and if so, how to handle the migration smoothly.
> The major tasks would be:
>  * (/) Get a consensus about the migration among committers
>  * Choose issues that should be moved to GitHub
>  ** Discussion thread 
> [https://lists.apache.org/thread/1p3p90k5c0d4othd2ct7nj14bkrxkr12]
>  ** -Conclusion for now: We don't migrate any issues. Only new issues should 
> be opened on GitHub.-
>  ** Write a prototype migration

[GitHub] [lucene] jtibshirani commented on pull request #951: LUCENE-10606: Optimize Prefilter Hit Collection

2022-06-20 Thread GitBox



jtibshirani commented on PR #951:
URL: https://github.com/apache/lucene/pull/951#issuecomment-1160945431

   @kaivalnp just wanted to check how this is going. I'm excited about this 
improvement. Let me know if I can help with anything, for example I could dig 
into the questions that Adrien and I raised earlier.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[jira] [Updated] (LUCENE-10624) Binary Search for Sparse IndexedDISI advanceWithinBlock & advanceExactWithinBlock

2022-06-20 Thread Weiming Wu (Jira)



 [ 
https://issues.apache.org/jira/browse/LUCENE-10624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiming Wu updated LUCENE-10624:

Status: Patch Available  (was: Open)

> Binary Search for Sparse IndexedDISI advanceWithinBlock & 
> advanceExactWithinBlock
> -
>
> Key: LUCENE-10624
> URL: https://issues.apache.org/jira/browse/LUCENE-10624
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/codecs
>Affects Versions: 9.0, 9.1, 9.2
>Reporter: Weiming Wu
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> h3. Problem Statement
> We noticed a DocValue read performance regression with the iterative API when 
> upgrading from Lucene 5 to Lucene 9. Our latency increased by 50%. The 
> degradation is similar to what's described in 
> https://issues.apache.org/jira/browse/SOLR-9599 
> By analyzing profiling data, we found that the methods "advanceWithinBlock" and 
> "advanceExactWithinBlock" for Sparse IndexedDISI are slow in Lucene 9 due to 
> their O(N) doc lookup algorithm. 
> h3. Changes
> Replaced the current O(N) lookup algorithm in Sparse IndexedDISI 
> "advanceWithinBlock" and "advanceExactWithinBlock" with binary search, which 
> works because docs are stored in ascending order.
> h3. Test
> {code:java}
> ./gradlew tidy
> ./gradlew check {code}
>  
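
The change described above, replacing a linear scan with a binary search over ascending doc IDs, can be illustrated with a minimal sketch. It is not the code from the patch: it assumes the block's doc IDs are already in an in-memory int[] (the real IndexedDISI reads them from an index input), and the class and method names are hypothetical.

```java
public class SparseBlockSearch {
  /** O(N) scan: index of the first doc >= target in docs[from, to), or 'to' if none. */
  public static int advanceLinear(int[] docs, int from, int to, int target) {
    for (int i = from; i < to; i++) {
      if (docs[i] >= target) {
        return i;
      }
    }
    return to;
  }

  /** O(log N) lower-bound binary search with the same contract as advanceLinear. */
  public static int advanceBinary(int[] docs, int from, int to, int target) {
    int lo = from;
    int hi = to;
    while (lo < hi) {
      int mid = (lo + hi) >>> 1;
      if (docs[mid] < target) {
        lo = mid + 1; // everything up to and including mid is too small
      } else {
        hi = mid; // mid is a candidate, keep it in range
      }
    }
    return lo;
  }

  /** "Exact" variant: true only if target itself is present in docs[from, to). */
  public static boolean advanceExact(int[] docs, int from, int to, int target) {
    int idx = advanceBinary(docs, from, to, target);
    return idx < to && docs[idx] == target;
  }

  public static void main(String[] args) {
    int[] docs = {3, 8, 15, 42, 97, 1024}; // ascending doc IDs, made up for the example
    System.out.println(advanceBinary(docs, 0, docs.length, 16));
    System.out.println(advanceExact(docs, 0, docs.length, 42));
  }
}
```

Both methods return the same index for any target. Only the number of comparisons differs, and that matters most when the target is far from the current position.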



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[jira] [Updated] (LUCENE-10624) Binary Search for Sparse IndexedDISI advanceWithinBlock & advanceExactWithinBlock

2022-06-20 Thread Weiming Wu (Jira)



 [ 
https://issues.apache.org/jira/browse/LUCENE-10624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiming Wu updated LUCENE-10624:

Attachment: baseline_sparseTaxis_searchsparse-sorted.0.log

> Binary Search for Sparse IndexedDISI advanceWithinBlock & 
> advanceExactWithinBlock
> -
>
> Key: LUCENE-10624
> URL: https://issues.apache.org/jira/browse/LUCENE-10624
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/codecs
>Affects Versions: 9.0, 9.1, 9.2
>Reporter: Weiming Wu
>Priority: Major
> Attachments: baseline_sparseTaxis_searchsparse-sorted.0.log, 
> candidate_sparseTaxis_searchsparse-sorted.0.log
>
>  Time Spent: 10m
>  Remaining Estimate: 0h




[jira] [Updated] (LUCENE-10624) Binary Search for Sparse IndexedDISI advanceWithinBlock & advanceExactWithinBlock

2022-06-20 Thread Weiming Wu (Jira)



 [ 
https://issues.apache.org/jira/browse/LUCENE-10624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiming Wu updated LUCENE-10624:

Attachment: candidate_sparseTaxis_searchsparse-sorted.0.log

> Binary Search for Sparse IndexedDISI advanceWithinBlock & 
> advanceExactWithinBlock
> -
>
> Key: LUCENE-10624
> URL: https://issues.apache.org/jira/browse/LUCENE-10624
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/codecs
>Affects Versions: 9.0, 9.1, 9.2
>Reporter: Weiming Wu
>Priority: Major
> Attachments: baseline_sparseTaxis_searchsparse-sorted.0.log, 
> candidate_sparseTaxis_searchsparse-sorted.0.log
>
>  Time Spent: 10m
>  Remaining Estimate: 0h




[jira] [Updated] (LUCENE-10624) Binary Search for Sparse IndexedDISI advanceWithinBlock & advanceExactWithinBlock

2022-06-20 Thread Weiming Wu (Jira)



 [ 
https://issues.apache.org/jira/browse/LUCENE-10624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiming Wu updated LUCENE-10624:

Description: 
h3. Problem Statement

We noticed a DocValue read performance regression with the iterative API when 
upgrading from Lucene 5 to Lucene 9. Our latency increased by 50%. The 
degradation is similar to what's described in 
https://issues.apache.org/jira/browse/SOLR-9599 

By analyzing profiling data, we found that the methods "advanceWithinBlock" and 
"advanceExactWithinBlock" for Sparse IndexedDISI are slow in Lucene 9 due to 
their O(N) doc lookup algorithm.
h3. Changes

Replaced the current O(N) lookup algorithm in Sparse IndexedDISI 
"advanceWithinBlock" and "advanceExactWithinBlock" with binary search, which 
works because docs are stored in ascending order.
h3. Test
{code:java}
./gradlew tidy
./gradlew check {code}
h3. Benchmark

Ran sparseTaxis from {color:#1d1c1d}luceneutil. Attached the reports of 
baseline and candidates.{color}

{color:#1d1c1d}1. Most cases have ~15% latency reduction.{color}

{color:#1d1c1d}2. Some highlights:{color}
 * {color:#1d1c1d}T0 green_pickup_latitude:[40.75 TO 40.9] 
yellow_pickup_latitude:[40.75 TO 40.9] sort=null{color}
 ** {color:#1d1c1d}Baseline:  10973978+ hits hits in 726.81967 msec{color}
 ** {color:#1d1c1d}Candidate: 10973978+ hits hits in 484.544594 msec{color}



> Binary Search for Sparse IndexedDISI advanceWithinBlock & 
> advanceExactWithinBlock
> -
>
> Key: LUCENE-10624
> URL: https://issues.apache.org/jira/browse/LUCENE-10624
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/codecs
>Affects Versions: 9.0, 9.1, 9.2
>Reporter: Weiming Wu
>Priority: Major
> Attachments: baseline_sparseTaxis_searchsparse-sorted.0.log, 
> candidate_sparseTaxis_searchsparse-sorted.0.log
>
>  Time Spent: 10m
>  Remaining Estimate: 0h




[jira] [Updated] (LUCENE-10624) Binary Search for Sparse IndexedDISI advanceWithinBlock & advanceExactWithinBlock

2022-06-20 Thread Weiming Wu (Jira)



 [ 
https://issues.apache.org/jira/browse/LUCENE-10624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiming Wu updated LUCENE-10624:

Description: 
h3. Problem Statement

We noticed a DocValue read performance regression with the iterative API when 
upgrading from Lucene 5 to Lucene 9. Our latency increased by 50%. The 
degradation is similar to what's described in 
https://issues.apache.org/jira/browse/SOLR-9599 

By analyzing profiling data, we found that the methods "advanceWithinBlock" and 
"advanceExactWithinBlock" for Sparse IndexedDISI are slow in Lucene 9 due to 
their O(N) doc lookup algorithm.
h3. Changes

Replaced the current O(N) lookup algorithm in Sparse IndexedDISI 
"advanceWithinBlock" and "advanceExactWithinBlock" with binary search, which 
works because docs are stored in ascending order.
h3. Test
{code:java}
./gradlew tidy
./gradlew check {code}
h3. Benchmark

Ran sparseTaxis from {color:#1d1c1d}luceneutil. Attached the reports of 
baseline and candidates.{color}

{color:#1d1c1d}1. Most cases have ~15% latency reduction.{color}

{color:#1d1c1d}2. Some highlights (>20%):{color}
 * *{color:#1d1c1d}T0 green_pickup_latitude:[40.75 TO 40.9] 
yellow_pickup_latitude:[40.75 TO 40.9] sort=null{color}*
 ** {color:#1d1c1d}*Baseline:*  10973978+ hits hits in 726.81967 msec{color}
 ** {color:#1d1c1d}*Candidate:* 10973978+ hits hits in 484.544594 msec{color}
 * {color:#1d1c1d}T0 cab_color:y cab_color:g sort=null{color}
 ** {color:#1d1c1d}*Baseline:* 2300174+ hits hits in 95.698324 msec{color}
 ** {color:#1d1c1d}*Candidate:* 2300174+ hits hits in 78.336193 msec{color}



> Binary Search for Sparse IndexedDISI advanceWithinBlock & 
> advanceExactWithinBlock
> -
>
> Key: LUCENE-10624
> URL: https://issues.apache.org/jira/browse/LUCENE-10624
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/codecs
>Affects Versions: 9.0, 9.1, 9.2
>Reporter: Weiming Wu
>Priority: Major
> Attachments: baseline_sparseTaxis_searchsparse-sorted.0.log, 
> candidate_sparseTaxis_searchsparse-sorted.0.log
>
>  Time Spent: 10m
>  Remaining Estimate: 0h

[jira] [Commented] (LUCENE-10624) Binary Search for Sparse IndexedDISI advanceWithinBlock & advanceExactWithinBlock

2022-06-20 Thread Weiming Wu (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-10624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17556662#comment-17556662
 ] 

Weiming Wu commented on LUCENE-10624:
-

Added benchmark data to the content.

> Binary Search for Sparse IndexedDISI advanceWithinBlock & 
> advanceExactWithinBlock
> -
>
> Key: LUCENE-10624
> URL: https://issues.apache.org/jira/browse/LUCENE-10624
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/codecs
>Affects Versions: 9.0, 9.1, 9.2
>Reporter: Weiming Wu
>Priority: Major
> Attachments: baseline_sparseTaxis_searchsparse-sorted.0.log, 
> candidate_sparseTaxis_searchsparse-sorted.0.log
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> h3. Problem Statement
> We noticed a DocValue read performance regression with the iterative API when 
> upgrading from Lucene 5 to Lucene 9. Our latency increased by 50%. The 
> degradation is similar to what's described in 
> https://issues.apache.org/jira/browse/SOLR-9599 
> By analyzing profiling data, we found that the methods "advanceWithinBlock" and 
> "advanceExactWithinBlock" for Sparse IndexedDISI are slow in Lucene 9 due to 
> their O(N) doc lookup algorithm.
> h3. Changes
> Replaced the current O(N) lookup algorithm in Sparse IndexedDISI 
> "advanceWithinBlock" and "advanceExactWithinBlock" with binary search, which 
> works because docs are stored in ascending order.
> h3. Test
> {code:java}
> ./gradlew tidy
> ./gradlew check {code}
> h3. Benchmark
> Ran sparseTaxis test cases from {color:#1d1c1d}luceneutil. Attached the 
> reports of baseline and candidates in attachments section.
> {color}
> {color:#1d1c1d}1. Most cases have ~10% search latency reduction.{color}
> {color:#1d1c1d}2. Some highlights (>20%):{color}
>  * *{color:#1d1c1d}T0 green_pickup_latitude:[40.75 TO 40.9] 
> yellow_pickup_latitude:[40.75 TO 40.9] sort=null{color}*
>  ** {color:#1d1c1d}*Baseline:*  10973978+ hits hits in *726.81967 msec*{color}
>  ** {color:#1d1c1d}*Candidate:* 10973978+ hits hits in *484.544594 
> msec*{color}
>  * *{color:#1d1c1d}T0 cab_color:y cab_color:g sort=null{color}*
>  ** {color:#1d1c1d}*Baseline:* 2300174+ hits hits in *95.698324 msec*{color}
>  ** {color:#1d1c1d}*Candidate:* 2300174+ hits hits in *78.336193 msec*{color}




[jira] [Updated] (LUCENE-10624) Binary Search for Sparse IndexedDISI advanceWithinBlock & advanceExactWithinBlock

2022-06-20 Thread Weiming Wu (Jira)



 [ 
https://issues.apache.org/jira/browse/LUCENE-10624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiming Wu updated LUCENE-10624:

Description: 
h3. Problem Statement

We noticed a DocValue read performance regression with the iterative API when 
upgrading from Lucene 5 to Lucene 9. Our latency increased by 50%. The 
degradation is similar to what's described in 
https://issues.apache.org/jira/browse/SOLR-9599 

By analyzing profiling data, we found that the methods "advanceWithinBlock" and 
"advanceExactWithinBlock" for Sparse IndexedDISI are slow in Lucene 9 due to 
their O(N) doc lookup algorithm.
h3. Changes

Replaced the current O(N) lookup algorithm in Sparse IndexedDISI 
"advanceWithinBlock" and "advanceExactWithinBlock" with binary search, which 
works because docs are stored in ascending order.
h3. Test
{code:java}
./gradlew tidy
./gradlew check {code}
h3. Benchmark

Ran sparseTaxis test cases from {color:#1d1c1d}luceneutil. Attached the reports 
of baseline and candidates in attachments section.
{color}

{color:#1d1c1d}1. Most cases have ~10% search latency reduction.{color}

{color:#1d1c1d}2. Some highlights (>20%):{color}
 * *{color:#1d1c1d}T0 green_pickup_latitude:[40.75 TO 40.9] 
yellow_pickup_latitude:[40.75 TO 40.9] sort=null{color}*
 ** {color:#1d1c1d}*Baseline:*  10973978+ hits hits in *726.81967 msec*{color}
 ** {color:#1d1c1d}*Candidate:* 10973978+ hits hits in *484.544594 msec*{color}
 * *{color:#1d1c1d}T0 cab_color:y cab_color:g sort=null{color}*
 ** {color:#1d1c1d}*Baseline:* 2300174+ hits hits in *95.698324 msec*{color}
 ** {color:#1d1c1d}*Candidate:* 2300174+ hits hits in *78.336193 msec*{color}



> Binary Search for Sparse IndexedDISI advanceWithinBlock & 
> advanceExactWithinBlock
> -
>
> Key: LUCENE-10624
> URL: https://issues.apache.org/jira/browse/LUCENE-10624
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/codecs
>Affects Versions: 9.0, 9.1, 9.2
>Reporter: Weiming Wu
>Priority: Major
> Attachments: baseline_sparseTaxis_searchsparse-sorted.0.log, 
> candidate_sparseTaxis_searchsparse-sorted.0.log
>
>  Time Spent: 10m
>  Remaining Estimate: 0h

[jira] [Updated] (LUCENE-10624) Binary Search for Sparse IndexedDISI advanceWithinBlock & advanceExactWithinBlock

2022-06-20 Thread Weiming Wu (Jira)



 [ 
https://issues.apache.org/jira/browse/LUCENE-10624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiming Wu updated LUCENE-10624:

Description: 
h3. Problem Statement

We noticed a DocValue read performance regression with the iterative API when 
upgrading from Lucene 5 to Lucene 9. Our latency increased by 50%. The 
degradation is similar to what's described in 
https://issues.apache.org/jira/browse/SOLR-9599 

By analyzing profiling data, we found that the methods "advanceWithinBlock" and 
"advanceExactWithinBlock" for Sparse IndexedDISI are slow in Lucene 9 due to 
their O(N) doc lookup algorithm.
h3. Changes

Replaced the current O(N) lookup algorithm in Sparse IndexedDISI 
"advanceWithinBlock" and "advanceExactWithinBlock" with binary search, which 
works because docs are stored in ascending order.
h3. Test
{code:java}
./gradlew tidy
./gradlew check {code}
h3. Benchmark

Ran sparseTaxis test cases from {color:#1d1c1d}luceneutil. Attached the reports 
of baseline and candidates in attachments section.{color}

{color:#1d1c1d}1. Most cases have 5-10% search latency reduction.{color}

{color:#1d1c1d}2. Some highlights (>20%):{color}
 * *{color:#1d1c1d}T0 green_pickup_latitude:[40.75 TO 40.9] 
yellow_pickup_latitude:[40.75 TO 40.9] sort=null{color}*
 ** {color:#1d1c1d}*Baseline:*  10973978+ hits hits in *726.81967 msec*{color}
 ** {color:#1d1c1d}*Candidate:* 10973978+ hits hits in *484.544594 msec*{color}
 * *{color:#1d1c1d}T0 cab_color:y cab_color:g sort=null{color}*
 ** {color:#1d1c1d}*Baseline:* 2300174+ hits hits in *95.698324 msec*{color}
 ** {color:#1d1c1d}*Candidate:* 2300174+ hits hits in *78.336193 msec*{color}
 * {color:#1d1c1d}*T1 cab_color:y cab_color:g sort=null*{color}
 ** {color:#1d1c1d}*Baseline:* 2300174+ hits hits in *391.565239 msec*{color}
 ** {color:#1d1c1d}*Candidate:* 2300174+ hits hits in *227.592885 msec*{color}
 * {color:#1d1c1d}*...*{color}
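
As a worked check of the highlighted figures, the relative change can be computed directly from the baseline and candidate timings quoted above. The class and method names are hypothetical. The second metric, throughput gain, assumes throughput is proportional to 1 / latency, which is an interpretation and not something the report states.

```java
public class LatencyReduction {
  /** Relative latency reduction in percent: (baseline - candidate) / baseline * 100. */
  public static double reductionPct(double baselineMsec, double candidateMsec) {
    return (baselineMsec - candidateMsec) / baselineMsec * 100.0;
  }

  /** Throughput gain in percent, assuming throughput is proportional to 1 / latency. */
  public static double speedupPct(double baselineMsec, double candidateMsec) {
    return (baselineMsec / candidateMsec - 1.0) * 100.0;
  }

  public static void main(String[] args) {
    // {baseline msec, candidate msec}, copied from the highlights in the description.
    double[][] runs = {
      {726.81967, 484.544594},  // T0 latitude range query
      {95.698324, 78.336193},   // T0 cab_color:y cab_color:g
      {391.565239, 227.592885}, // T1 cab_color:y cab_color:g
    };
    for (double[] r : runs) {
      System.out.printf(
          "reduction=%.1f%% speedup=%.1f%%%n",
          reductionPct(r[0], r[1]), speedupPct(r[0], r[1]));
    }
  }
}
```

With these inputs the latency reductions come out at roughly 33%, 18% and 42%. The second highlight only exceeds 20% when read as a throughput gain (about 22%), so the ">20%" label may refer to that metric.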



> Binary Search for Sparse IndexedDISI advanceWithinBlock & 
> advanceExactWithinBlock
> -
>
> Key: LUCENE-10624
> URL: https://issues.apache.org/jira/browse/LUCENE-10624
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/codecs
>Affects Versions: 9.0, 9.1, 9.2
>Reporter: Weiming Wu
>Priority: Major
> Attachments: baseline_sparseTaxis_searchsparse-sorted.0.log, 
> candidate_sparseTaxis_searchsparse-sorted.0.log
>
>  Time Spent: 10m
>  Remaining Estimate: 0h

[jira] [Comment Edited] (LUCENE-10624) Binary Search for Sparse IndexedDISI advanceWithinBlock & advanceExactWithinBlock

2022-06-20 Thread Weiming Wu (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-10624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17556662#comment-17556662
 ] 

Weiming Wu edited comment on LUCENE-10624 at 6/21/22 6:16 AM:
--

Added benchmark data to the description.


was (Author: JIRAUSER290435):
Added benchmark data to the content.

> Binary Search for Sparse IndexedDISI advanceWithinBlock & 
> advanceExactWithinBlock
> -
>
> Key: LUCENE-10624
> URL: https://issues.apache.org/jira/browse/LUCENE-10624
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/codecs
>Affects Versions: 9.0, 9.1, 9.2
>Reporter: Weiming Wu
>Priority: Major
> Attachments: baseline_sparseTaxis_searchsparse-sorted.0.log, 
> candidate_sparseTaxis_searchsparse-sorted.0.log
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> h3. Problem Statement
> We noticed DocValue read performance regression with the iterative API when 
> upgrading from Lucene 5 to Lucene 9. Our latency is increased by 50%. The 
> degradation is similar to what's described in 
> https://issues.apache.org/jira/browse/SOLR-9599 
> By analyzing profiling data, we found method "advanceWithinBlock" and 
> "advanceExactWithinBlock" for Sparse IndexedDISI is slow in Lucene 9 due to 
> their O(N) doc lookup algorithm.
> h3. Changes
> Used binary search algorithm to replace current O(N) lookup algorithm in 
> Sparse IndexedDISI "advanceWithinBlock" and "advanceExactWithinBlock" because 
> docs are in ascending order.
> h3. Test
> {code:java}
> ./gradlew tidy
> ./gradlew check {code}
> h3. Benchmark
> Ran sparseTaxis test cases from luceneutil. Attached the reports of baseline 
> and candidates in attachments section.
> 1. Most cases have 5-10% search latency reduction.
> 2. Some highlights (>20%):
>  * *T0 green_pickup_latitude:[40.75 TO 40.9] yellow_pickup_latitude:[40.75 TO 40.9] sort=null*
>  ** *Baseline:* 10973978+ hits hits in *726.81967 msec*
>  ** *Candidate:* 10973978+ hits hits in *484.544594 msec*
>  * *T0 cab_color:y cab_color:g sort=null*
>  ** *Baseline:* 2300174+ hits hits in *95.698324 msec*
>  ** *Candidate:* 2300174+ hits hits in *78.336193 msec*
>  * *T1 cab_color:y cab_color:g sort=null*
>  ** *Baseline:* 2300174+ hits hits in *391.565239 msec*
>  ** *Candidate:* 300174+ hits hits in *227.592885 msec*
>  * *...*
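
The change described in the issue (replacing an O(N) scan with binary search over ascending doc IDs) can be sketched roughly as follows. This is a minimal illustration, not the actual patch: it assumes the doc IDs of one block are already available as a sorted in-memory int[], whereas the real IndexedDISI reads them from the index, and the class and method names are hypothetical.

```java
/** Hypothetical sketch: linear vs. binary lookup over one block of ascending doc IDs. */
final class SparseBlockAdvance {

  /** O(N) scan: index of the first doc ID >= target at or after 'from', or docs.length. */
  static int advanceLinear(int[] docs, int from, int target) {
    for (int i = from; i < docs.length; i++) {
      if (docs[i] >= target) {
        return i;
      }
    }
    return docs.length;
  }

  /** O(log N) binary search over docs[from, docs.length), same result as the scan. */
  static int advanceBinary(int[] docs, int from, int target) {
    int lo = from;
    int hi = docs.length;
    while (lo < hi) {
      int mid = (lo + hi) >>> 1;
      if (docs[mid] < target) {
        lo = mid + 1; // everything up to mid is too small
      } else {
        hi = mid; // mid is a candidate, keep it in range
      }
    }
    return lo;
  }

  /** Exact variant: true only if 'target' itself is present at or after 'from'. */
  static boolean advanceExactBinary(int[] docs, int from, int target) {
    int idx = advanceBinary(docs, from, target);
    return idx < docs.length && docs[idx] == target;
  }
}
```

Both lookups return the same position; the binary version only relies on the doc IDs being in ascending order, as the issue states.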



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[jira] [Commented] (LUCENE-10624) Binary Search for Sparse IndexedDISI advanceWithinBlock & advanceExactWithinBlock

2022-06-20 Thread Adrien Grand (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-10624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17556673#comment-17556673
 ] 

Adrien Grand commented on LUCENE-10624:
---

I find these speedups surprising since I was not expecting these queries to 
leverage doc values. The one query where I would expect a speedup is the term 
query sorted by field: 
http://people.apache.org/~mikemccand/lucenebench/sparseResults.html#search_sort_qps.

Regarding the implementation, in the past we observed better performance for 
this sort of thing with exponential search than with binary search, since 
exponential search better optimizes for the case when callers repeatedly 
call advance() with small increments.
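
To make the comparison concrete, here is a rough sketch of exponential (galloping) search over a sorted array of doc IDs. It is only an illustration under simplifying assumptions: the doc IDs are held in an in-memory int[], and the class and method names are hypothetical rather than taken from Lucene. The probe distance doubles from the current position until it passes the target, then a binary search runs inside the bracketed range, so a target that is d entries away costs about O(log d) comparisons instead of O(log N).

```java
/** Hypothetical sketch of exponential search for the first doc ID >= target. */
final class ExponentialAdvance {

  /** Returns the index of the first doc ID >= target at or after 'from', or docs.length. */
  static int advance(int[] docs, int from, int target) {
    int n = docs.length;
    if (from >= n) {
      return n;
    }
    if (docs[from] >= target) {
      return from; // already positioned on a match
    }
    // Gallop: double the step while the probed doc ID is still below the target.
    int bound = 1;
    while (from + bound < n && docs[from + bound] < target) {
      bound <<= 1;
    }
    // docs[from + bound / 2] is known to be < target, so the answer lies in
    // (from + bound / 2, min(from + bound, n)].
    int lo = from + (bound >>> 1) + 1;
    int hi = Math.min(from + bound, n);
    while (lo < hi) {
      int mid = (lo + hi) >>> 1;
      if (docs[mid] < target) {
        lo = mid + 1;
      } else {
        hi = mid;
      }
    }
    return lo;
  }
}
```

When the next target is only a few entries ahead, the galloping phase stops after one or two probes, which is the small-increment case mentioned above.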


