[GitHub] [lucene] jasirkt opened a new issue, #12195: Slop is missing when boost is passed to MultiFieldQueryParser (Since Lucene 5.4.0)

2023-03-08 Thread via GitHub
jasirkt opened a new issue, #12195: URL: https://github.com/apache/lucene/issues/12195 ### Description On Lucene 5.3.2, if I run ```java String[] fields = new String[]{ "field1"}; Analyzer analyzer = new StandardAnalyzer(); Map boosts = Map.of("field1", 1.5f); MultiFiel

[GitHub] [lucene] zacharymorn commented on a diff in pull request #12194: [GITHUB-11915] [Discussion Only] Make Lucene smarter about long runs of matches via new API on DISI

2023-03-08 Thread via GitHub
zacharymorn commented on code in PR #12194: URL: https://github.com/apache/lucene/pull/12194#discussion_r1129100291 ## lucene/core/src/java/org/apache/lucene/util/FixedBitSet.java: ## @@ -286,6 +286,33 @@ public int nextSetBit(int index) { return DocIdSetIterator.NO_MORE_DO

[GitHub] [lucene] uschindler commented on pull request #12194: [GITHUB-11915] [Discussion Only] Make Lucene smarter about long runs of matches via new API on DISI

2023-03-08 Thread via GitHub
uschindler commented on PR #12194: URL: https://github.com/apache/lucene/pull/12194#issuecomment-1459710764 There are no tests about correctness of those new BitSet methods for any implementation (Fixed, Sparse,...). Would it be possible to add them? -- This is an automated message from t

[GitHub] [lucene] zacharymorn commented on issue #11915: Make Lucene smarter about long runs of matches

2023-03-08 Thread via GitHub
zacharymorn commented on issue #11915: URL: https://github.com/apache/lucene/issues/11915#issuecomment-1459711628 Yes, please go ahead @mdmarshmallow!

[GitHub] [lucene] uschindler commented on a diff in pull request #12194: [GITHUB-11915] [Discussion Only] Make Lucene smarter about long runs of matches via new API on DISI

2023-03-08 Thread via GitHub
uschindler commented on code in PR #12194: URL: https://github.com/apache/lucene/pull/12194#discussion_r1129102882 ## lucene/core/src/java/org/apache/lucene/util/FixedBitSet.java: ## @@ -286,6 +286,33 @@ public int nextSetBit(int index) { return DocIdSetIterator.NO_MORE_DOC

[GitHub] [lucene] zacharymorn commented on pull request #12194: [GITHUB-11915] [Discussion Only] Make Lucene smarter about long runs of matches via new API on DISI

2023-03-08 Thread via GitHub
zacharymorn commented on PR #12194: URL: https://github.com/apache/lucene/pull/12194#issuecomment-1459726170 > There are no tests about correctness of those new BitSet methods for any implementation (Fixed, Sparse,...). Would it be possible to add them? Thanks for the review and comme

[GitHub] [lucene] rmuir closed issue #12193: FieldInfo#attributes should be exposed as variables instead of map

2023-03-08 Thread via GitHub
rmuir closed issue #12193: FieldInfo#attributes should be exposed as variables instead of map URL: https://github.com/apache/lucene/issues/12193

[GitHub] [lucene] rmuir commented on issue #12193: FieldInfo#attributes should be exposed as variables instead of map

2023-03-08 Thread via GitHub
rmuir commented on issue #12193: URL: https://github.com/apache/lucene/issues/12193#issuecomment-1459998056 20,000 fields is the problem, not FieldInfo's attributes map.

[GitHub] [lucene] rmuir commented on pull request #15: LUCENE-8972: Add ICUTransformCharFilter, to support pre-tokenizer ICU text transformation

2023-03-08 Thread via GitHub
rmuir commented on PR #15: URL: https://github.com/apache/lucene/pull/15#issuecomment-1460005723 We should also be careful about introducing complex CharFilters; I consider the current CharFilter API broken after debugging #11976. See https://github.com/apache/lucene/issues/11976#issu

[GitHub] [lucene] uschindler commented on pull request #12188: Alternative version: Implement MMapDirectory with Java 19/20 Project Panama Preview API

2023-03-08 Thread via GitHub
uschindler commented on PR #12188: URL: https://github.com/apache/lucene/pull/12188#issuecomment-1460099066 FYI, I also checked this on branch_9x with Java 11. It works the same way; the only issue is that the apijar files also need to contain signatures of `java.util.Objects`, becau

[GitHub] [lucene] jasirkt commented on issue #12195: Slop is missing when boost is passed to MultiFieldQueryParser (Since Lucene 5.4.0)

2023-03-08 Thread via GitHub
jasirkt commented on issue #12195: URL: https://github.com/apache/lucene/issues/12195#issuecomment-1460216562 Tracked down the issue to this: In Lucene 5.4.0 [this commit](https://github.com/apache/lucene/commit/962313b83ba9c69379e1f84dffc881a361713ce9#diff-e10af886a9a7ba5221abfcfbe9dc057d2c

[GitHub] [lucene] jasirkt opened a new pull request, #12196: Fix Slop Issue in MultiFieldQueryParser

2023-03-08 Thread via GitHub
jasirkt opened a new pull request, #12196: URL: https://github.com/apache/lucene/pull/12196 ### Description This change fixes #12195. In Lucene 5.4.0 [this commit](https://github.com/apache/lucene/commit/962313b83ba9c69379e1f84dffc881a361713ce9#diff-e10af886a9a7ba5221abfcfbe9dc

[GitHub] [lucene] rmuir commented on pull request #12196: Fix Slop Issue in MultiFieldQueryParser

2023-03-08 Thread via GitHub
rmuir commented on PR #12196: URL: https://github.com/apache/lucene/pull/12196#issuecomment-1460225168 Thank you for tracking this down. Is there a simple unit test we could add for the change?

[GitHub] [lucene] jasirkt commented on pull request #12196: Fix Slop Issue in MultiFieldQueryParser

2023-03-08 Thread via GitHub
jasirkt commented on PR #12196: URL: https://github.com/apache/lucene/pull/12196#issuecomment-1460262146 @rmuir I've added a unit test.

[GitHub] [lucene] rmuir merged pull request #12196: Fix Slop Issue in MultiFieldQueryParser

2023-03-08 Thread via GitHub
rmuir merged PR #12196: URL: https://github.com/apache/lucene/pull/12196

[GitHub] [lucene] rmuir closed issue #12195: Slop is missing when boost is passed to MultiFieldQueryParser (Since Lucene 5.4.0)

2023-03-08 Thread via GitHub
rmuir closed issue #12195: Slop is missing when boost is passed to MultiFieldQueryParser (Since Lucene 5.4.0) URL: https://github.com/apache/lucene/issues/12195

[GitHub] [lucene] jpountz commented on a diff in pull request #12194: [GITHUB-11915] [Discussion Only] Make Lucene smarter about long runs of matches via new API on DISI

2023-03-08 Thread via GitHub
jpountz commented on code in PR #12194: URL: https://github.com/apache/lucene/pull/12194#discussion_r1129172572 ## lucene/core/src/java/org/apache/lucene/search/DocIdSetIterator.java: ## @@ -211,4 +216,22 @@ protected final int slowAdvance(int target) throws IOException { *

[GitHub] [lucene] jpountz commented on a diff in pull request #12194: [GITHUB-11915] [Discussion Only] Make Lucene smarter about long runs of matches via new API on DISI

2023-03-08 Thread via GitHub
jpountz commented on code in PR #12194: URL: https://github.com/apache/lucene/pull/12194#discussion_r1129692954 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/Lucene90DocValuesProducer.java: ## @@ -414,6 +416,13 @@ public int nextDoc() throws IOException { retu

[GitHub] [lucene] rmuir commented on pull request #12196: Fix Slop Issue in MultiFieldQueryParser

2023-03-08 Thread via GitHub
rmuir commented on PR #12196: URL: https://github.com/apache/lucene/pull/12196#issuecomment-1460451659 Thank you for the fix @jasirkt

[GitHub] [lucene] jpountz commented on a diff in pull request #12194: [GITHUB-11915] [Discussion Only] Make Lucene smarter about long runs of matches via new API on DISI

2023-03-08 Thread via GitHub
jpountz commented on code in PR #12194: URL: https://github.com/apache/lucene/pull/12194#discussion_r1129173207 ## lucene/core/src/java/org/apache/lucene/search/DocIdSetIterator.java: ## @@ -82,6 +82,11 @@ public int advance(int target) throws IOException { return doc;

[GitHub] [lucene] mulugetam commented on issue #12091: Speeding up Lucene Vector Similarity through the Java Vector API

2023-03-08 Thread via GitHub
mulugetam commented on issue #12091: URL: https://github.com/apache/lucene/issues/12091#issuecomment-1460638931 @rmuir is there a way I could try to include this and https://github.com/apache/lucene/issues/12090 as an experimental/sandbox plugin? How do we currently test out experimental fe

[GitHub] [lucene] zhaih commented on issue #12176: TermInSetQuery could use (variant of) DaciukMihov/Terms.intersect() for faster intersection

2023-03-08 Thread via GitHub
zhaih commented on issue #12176: URL: https://github.com/apache/lucene/issues/12176#issuecomment-1460914565 Hey Robert, this is an interesting idea. One of the problems we're facing seems related to it: we have ~200 terms from several fields and we're trying to do a big disjunc

[GitHub] [lucene] rmuir commented on issue #12176: TermInSetQuery could use (variant of) DaciukMihov/Terms.intersect() for faster intersection

2023-03-08 Thread via GitHub
rmuir commented on issue #12176: URL: https://github.com/apache/lucene/issues/12176#issuecomment-1460989224 So, one thing is that Terms.intersect() works across a single field, and you definitely have to sort before adding terms to DaciukMihov (but then it works in linear time). Sounds

[GitHub] [lucene] jasirkt commented on pull request #12196: Fix Slop Issue in MultiFieldQueryParser

2023-03-08 Thread via GitHub
jasirkt commented on PR #12196: URL: https://github.com/apache/lucene/pull/12196#issuecomment-1461195438 Welcome @rmuir

[GitHub] [lucene] zacharymorn commented on a diff in pull request #12194: [GITHUB-11915] [Discussion Only] Make Lucene smarter about long runs of matches via new API on DISI

2023-03-08 Thread via GitHub
zacharymorn commented on code in PR #12194: URL: https://github.com/apache/lucene/pull/12194#discussion_r1130391527 ## lucene/core/src/java/org/apache/lucene/search/DocIdSetIterator.java: ## @@ -211,4 +216,22 @@ protected final int slowAdvance(int target) throws IOException {

[GitHub] [lucene] zacharymorn commented on pull request #12194: [GITHUB-11915] [Discussion Only] Make Lucene smarter about long runs of matches via new API on DISI

2023-03-08 Thread via GitHub
zacharymorn commented on PR #12194: URL: https://github.com/apache/lucene/pull/12194#issuecomment-1461257536 Thanks @jpountz for the review and comment! > Did you manage to observe some speedups with this change? So far I have only been able to run `wikimedium10m` and see the impleme

[GitHub] [lucene] uschindler commented on pull request #12042: Implement MMapDirectory with Java 20 Project Panama Preview API

2023-03-08 Thread via GitHub
uschindler commented on PR #12042: URL: https://github.com/apache/lucene/pull/12042#issuecomment-1461461593 Closing in favor of #12188.

[GitHub] [lucene] uschindler closed pull request #12042: Implement MMapDirectory with Java 20 Project Panama Preview API

2023-03-08 Thread via GitHub
uschindler closed pull request #12042: Implement MMapDirectory with Java 20 Project Panama Preview API URL: https://github.com/apache/lucene/pull/12042

[GitHub] [lucene] zacharymorn commented on a diff in pull request #12194: [GITHUB-11915] [Discussion Only] Make Lucene smarter about long runs of matches via new API on DISI

2023-03-08 Thread via GitHub
zacharymorn commented on code in PR #12194: URL: https://github.com/apache/lucene/pull/12194#discussion_r1130592912 ## lucene/core/src/java/org/apache/lucene/search/DocIdSetIterator.java: ## @@ -211,4 +216,22 @@ protected final int slowAdvance(int target) throws IOException {

[GitHub] [lucene] zacharymorn commented on a diff in pull request #12194: [GITHUB-11915] [Discussion Only] Make Lucene smarter about long runs of matches via new API on DISI

2023-03-08 Thread via GitHub
zacharymorn commented on code in PR #12194: URL: https://github.com/apache/lucene/pull/12194#discussion_r1130593481 ## lucene/core/src/java/org/apache/lucene/search/DocIdSetIterator.java: ## @@ -82,6 +82,11 @@ public int advance(int target) throws IOException { return d

[GitHub] [lucene] zacharymorn commented on a diff in pull request #12194: [GITHUB-11915] [Discussion Only] Make Lucene smarter about long runs of matches via new API on DISI

2023-03-08 Thread via GitHub
zacharymorn commented on code in PR #12194: URL: https://github.com/apache/lucene/pull/12194#discussion_r1130594263 ## lucene/core/src/java/org/apache/lucene/util/FixedBitSet.java: ## @@ -286,6 +286,33 @@ public int nextSetBit(int index) { return DocIdSetIterator.NO_MORE_DO