Re: [PR] Avoid modify merge state in per field mergers [lucene]

2024-03-25 Thread via GitHub
jpountz commented on code in PR #13208: URL: https://github.com/apache/lucene/pull/13208#discussion_r1537183041 ## lucene/core/src/java/org/apache/lucene/codecs/perfield/PerFieldMergeState.java: ## @@ -31,40 +31,18 @@ import org.apache.lucene.index.MergeState; import org.apach

Re: [PR] Convert IOContext, MergeInfo, and FlushInfo to record classes [lucene]

2024-03-25 Thread via GitHub
jpountz commented on code in PR #13205: URL: https://github.com/apache/lucene/pull/13205#discussion_r1537197013 ## lucene/core/src/java/org/apache/lucene/store/IOContext.java: ## @@ -62,91 +55,36 @@ public enum Context { public static final IOContext LOAD = new IOContext(fa

Re: [PR] Reduce some unnecessary ArrayUtil#grow calls [lucene]

2024-03-25 Thread via GitHub
stefanvodita commented on PR #13171: URL: https://github.com/apache/lucene/pull/13171#issuecomment-2017455119 @easyice - I'm probably missing something obvious, but I thought `ArrayUtil.grow` already made the same check. Can you explain why this PR helps? https://github.com/apache/lucene

Re: [PR] Add Romanian stopwords with s&t with comma [lucene]

2024-03-25 Thread via GitHub
stefanvodita commented on code in PR #12172: URL: https://github.com/apache/lucene/pull/12172#discussion_r1537211363 ## lucene/analysis/common/src/resources/org/apache/lucene/analysis/ro/stopwords.txt: ## @@ -190,27 +207,34 @@ sale sau său se +și şi sînt sîntem +sînteți R

Re: [PR] Fix TestTaxonomyFacetValueSource.testRandom [lucene]

2024-03-25 Thread via GitHub
stefanvodita commented on PR #13198: URL: https://github.com/apache/lucene/pull/13198#issuecomment-2017533598 Aren't we changing the random number generation when we add the merge policy, so we're no longer producing a failing case by chance? -- This is an automated message from the Apach

Re: [PR] Convert IOContext, MergeInfo, and FlushInfo to record classes [lucene]

2024-03-25 Thread via GitHub
uschindler commented on code in PR #13205: URL: https://github.com/apache/lucene/pull/13205#discussion_r1537285277 ## lucene/core/src/java/org/apache/lucene/store/IOContext.java: ## @@ -62,91 +55,36 @@ public enum Context { public static final IOContext LOAD = new IOContext

Re: [PR] Convert IOContext, MergeInfo, and FlushInfo to record classes [lucene]

2024-03-25 Thread via GitHub
uschindler commented on code in PR #13205: URL: https://github.com/apache/lucene/pull/13205#discussion_r1537287589 ## lucene/core/src/java/org/apache/lucene/store/IOContext.java: ## @@ -62,91 +55,36 @@ public enum Context { public static final IOContext LOAD = new IOContext

Re: [PR] Convert IOContext, MergeInfo, and FlushInfo to record classes [lucene]

2024-03-25 Thread via GitHub
jpountz commented on code in PR #13205: URL: https://github.com/apache/lucene/pull/13205#discussion_r1537311661 ## lucene/core/src/java/org/apache/lucene/store/IOContext.java: ## @@ -62,91 +55,36 @@ public enum Context { public static final IOContext LOAD = new IOContext(fa

Re: [PR] Convert IOContext, MergeInfo, and FlushInfo to record classes [lucene]

2024-03-25 Thread via GitHub
uschindler commented on code in PR #13205: URL: https://github.com/apache/lucene/pull/13205#discussion_r1537345317 ## lucene/core/src/java/org/apache/lucene/store/IOContext.java: ## @@ -62,91 +55,36 @@ public enum Context { public static final IOContext LOAD = new IOContext

Re: [PR] Fix TestTaxonomyFacetValueSource.testRandom [lucene]

2024-03-25 Thread via GitHub
dweiss commented on PR #13198: URL: https://github.com/apache/lucene/pull/13198#issuecomment-2017652016 Whenever you touch the random number generator, it'll change anything down from there. Reiterate/Beast your tests to find a new offending seed (or improve the probability your change fixe
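
A minimal sketch of the effect described above, assuming plain `java.util.Random` rather than Lucene's randomized test framework. The class and method names are hypothetical. Consuming one extra value from a seeded generator shifts every value drawn afterwards, so a seed that used to reproduce a failure may stop doing so.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical demo class; not part of Lucene.
class SeedShiftDemo {
  // Draws `count` ints from a generator seeded with `seed`.
  // When `extraDrawFirst` is true, one value is consumed first, standing in
  // for a newly added randomized decision made earlier in the test.
  static int[] draw(long seed, boolean extraDrawFirst, int count) {
    Random random = new Random(seed);
    if (extraDrawFirst) {
      random.nextInt();
    }
    int[] values = new int[count];
    for (int i = 0; i < count; i++) {
      values[i] = random.nextInt();
    }
    return values;
  }

  public static void main(String[] args) {
    // Same seed, but the second stream is shifted by one position.
    System.out.println(Arrays.toString(draw(42L, false, 4)));
    System.out.println(Arrays.toString(draw(42L, true, 3)));
  }
}
```

With the same seed, the second stream is the first stream shifted by one, which is why a previously failing seed has to be searched for again after such a change.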

Re: [PR] Speed up writeGroupVInts [lucene]

2024-03-25 Thread via GitHub
jpountz commented on code in PR #13203: URL: https://github.com/apache/lucene/pull/13203#discussion_r1537354439 ## lucene/core/src/java/org/apache/lucene/util/GroupVIntUtil.java: ## @@ -111,4 +112,40 @@ public static int readGroupVInt( pos += 1 + n4Minus1; return (int)

Re: [PR] Convert IOContext, MergeInfo, and FlushInfo to record classes [lucene]

2024-03-25 Thread via GitHub
uschindler commented on code in PR #13205: URL: https://github.com/apache/lucene/pull/13205#discussion_r1537360487 ## lucene/core/src/java/org/apache/lucene/store/IOContext.java: ## @@ -62,91 +55,36 @@ public enum Context { public static final IOContext LOAD = new IOContext

Re: [PR] Convert IOContext, MergeInfo, and FlushInfo to record classes [lucene]

2024-03-25 Thread via GitHub
uschindler commented on code in PR #13205: URL: https://github.com/apache/lucene/pull/13205#discussion_r1537361582 ## lucene/core/src/java/org/apache/lucene/store/IOContext.java: ## @@ -62,91 +55,36 @@ public enum Context { public static final IOContext LOAD = new IOContext

Re: [PR] Add Romanian stopwords with s&t with comma [lucene]

2024-03-25 Thread via GitHub
strainu commented on code in PR #12172: URL: https://github.com/apache/lucene/pull/12172#discussion_r1537383981 ## lucene/analysis/common/src/resources/org/apache/lucene/analysis/ro/stopwords.txt: ## @@ -190,27 +207,34 @@ sale sau său se +și şi sînt sîntem +sînteți Review

Re: [PR] Subtract deleted file size from the cache size of NRTCachingDirectory. [lucene]

2024-03-25 Thread via GitHub
uschindler commented on PR #13206: URL: https://github.com/apache/lucene/pull/13206#issuecomment-2017708203 Hi, > There is a weakness in the fix if the file is concurrently deleted in another thread while being closed I don't think that adding synchronization for that is really

Re: [PR] Subtract deleted file size from the cache size of NRTCachingDirectory. [lucene]

2024-03-25 Thread via GitHub
uschindler commented on PR #13206: URL: https://github.com/apache/lucene/pull/13206#issuecomment-2017717168 Actually the delete while it gets written to should never appear in Lucene. The bigger problem is when the file is still open in an NRTReader and gets deleted. I am not sure how

Re: [PR] Add Romanian stopwords with s&t with comma [lucene]

2024-03-25 Thread via GitHub
stefanvodita commented on code in PR #12172: URL: https://github.com/apache/lucene/pull/12172#discussion_r1537416729 ## lucene/analysis/common/src/resources/org/apache/lucene/analysis/ro/stopwords.txt: ## @@ -190,27 +207,34 @@ sale sau său se +și şi sînt sîntem +sînteți R

Re: [PR] Add Romanian stopwords with s&t with comma [lucene]

2024-03-25 Thread via GitHub
strainu commented on code in PR #12172: URL: https://github.com/apache/lucene/pull/12172#discussion_r1537424935 ## lucene/analysis/common/src/resources/org/apache/lucene/analysis/ro/stopwords.txt: ## @@ -190,27 +207,34 @@ sale sau său se +și şi sînt sîntem +sînteți Review

Re: [PR] Add timeout support to AbstractKnnVectorQuery [lucene]

2024-03-25 Thread via GitHub
kaivalnp commented on code in PR #13202: URL: https://github.com/apache/lucene/pull/13202#discussion_r1537438862 ## lucene/core/src/java/org/apache/lucene/search/TimeLimitingKnnCollectorManager.java: ## @@ -0,0 +1,91 @@ +/* + * Licensed to the Apache Software Foundation (ASF) un

Re: [PR] Made the UnifiedHighlighter's hasUnrecognizedQuery function processes FunctionQuery the same way as MatchAllDocsQuery and MatchNoDocsQuery queries for performance reasons. [lucene]

2024-03-25 Thread via GitHub
romseygeek commented on code in PR #13165: URL: https://github.com/apache/lucene/pull/13165#discussion_r1537485874 ## lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/UnifiedHighlighter.java: ## @@ -1130,7 +1134,16 @@ public boolean acceptField(String field) {

Re: [PR] upgrade snowball to 26db1ab9adbf437f37a6facd3ee2aad1da9eba03 [lucene]

2024-03-25 Thread via GitHub
uschindler commented on code in PR #13209: URL: https://github.com/apache/lucene/pull/13209#discussion_r1537494919 ## lucene/analysis/common/src/java/org/apache/lucene/analysis/snowball/SnowballFilter.java: ## @@ -70,6 +70,11 @@ public SnowballFilter(TokenStream input, SnowballS

[I] Replace boolean flags on IOContext with an enum [lucene]

2024-03-25 Thread via GitHub
jpountz opened a new issue, #13211: URL: https://github.com/apache/lucene/issues/13211 ### Description `IOContext` has a few boolean flags: `readOnce`, `load`, `randomAccess`. But some combinations of these flags don't make sense, e.g. something can't be `readOnce` and `randomAcce
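
A minimal sketch of the idea in the issue above, not the change that was eventually made in Lucene. The enum `ReadHint`, its constants, and the `fromFlags` helper are hypothetical, and treating the three flags as fully mutually exclusive is an assumption. The point is that an enum can only hold one value, so contradictory combinations cannot be expressed at all.

```java
// Hypothetical sketch class; not part of Lucene.
class ReadHintSketch {
  // One constant per access pattern; exactly one can be selected at a time.
  enum ReadHint {
    NORMAL,
    READ_ONCE,
    LOAD,
    RANDOM_ACCESS
  }

  // Converts the three boolean flags into a single enum value.
  // Assumption: at most one flag may be set; anything else is rejected.
  static ReadHint fromFlags(boolean readOnce, boolean load, boolean randomAccess) {
    int set = (readOnce ? 1 : 0) + (load ? 1 : 0) + (randomAccess ? 1 : 0);
    if (set > 1) {
      throw new IllegalArgumentException("contradictory flags");
    }
    if (readOnce) {
      return ReadHint.READ_ONCE;
    }
    if (load) {
      return ReadHint.LOAD;
    }
    if (randomAccess) {
      return ReadHint.RANDOM_ACCESS;
    }
    return ReadHint.NORMAL;
  }

  public static void main(String[] args) {
    System.out.println(fromFlags(true, false, false));
  }
}
```

With booleans the invalid combinations have to be checked at runtime, as `fromFlags` does. Once callers pass a `ReadHint` directly, that check is no longer needed.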

Re: [PR] Convert IOContext, MergeInfo, and FlushInfo to record classes [lucene]

2024-03-25 Thread via GitHub
jpountz commented on code in PR #13205: URL: https://github.com/apache/lucene/pull/13205#discussion_r1537555897 ## lucene/core/src/java/org/apache/lucene/store/IOContext.java: ## @@ -62,91 +55,36 @@ public enum Context { public static final IOContext LOAD = new IOContext(fa

Re: [PR] Add WrappedCandidateMatcher for composing matchers [lucene]

2024-03-25 Thread via GitHub
bjacobowitz commented on PR #13109: URL: https://github.com/apache/lucene/pull/13109#issuecomment-2017958689 > The `protected` visibility on `matchQuery()` should already be fine here - you can override or call protected methods from within subclasses. I think making `reportError()` and `fi

Re: [PR] Subtract deleted file size from the cache size of NRTCachingDirectory. [lucene]

2024-03-25 Thread via GitHub
jpountz commented on PR #13206: URL: https://github.com/apache/lucene/pull/13206#issuecomment-2017975876 > The bigger problem is when the file is still open in an NRTReader and gets deleted. Hmm does it actually happen? I thought index files were ref-counted so that files would only

Re: [PR] upgrade snowball to 26db1ab9adbf437f37a6facd3ee2aad1da9eba03 [lucene]

2024-03-25 Thread via GitHub
rmuir commented on code in PR #13209: URL: https://github.com/apache/lucene/pull/13209#discussion_r1537576805 ## lucene/analysis/common/src/java/org/apache/lucene/analysis/snowball/SnowballPorterFilterFactory.java: ## @@ -59,7 +59,13 @@ public class SnowballPorterFilterFactory e

Re: [PR] Fix demo application, if no knn dict was provided and over 100 docs are indexed [lucene]

2024-03-25 Thread via GitHub
msokolov merged PR #13163: URL: https://github.com/apache/lucene/pull/13163

Re: [I] Demo application throws an exception when following documentation. [lucene]

2024-03-25 Thread via GitHub
msokolov closed issue #12330: Demo application throws an exception when following documentation. URL: https://github.com/apache/lucene/issues/12330

Re: [PR] Fix demo application, if no knn dict was provided and over 100 docs are indexed [lucene]

2024-03-25 Thread via GitHub
msokolov commented on PR #13163: URL: https://github.com/apache/lucene/pull/13163#issuecomment-2018132006 I also cherry-picked to branch_9x: 3d109bdc493b451f967a7ad3102267c9d177ade0

Re: [PR] Avoid modify merge state in per field mergers [lucene]

2024-03-25 Thread via GitHub
dnhatn commented on code in PR #13208: URL: https://github.com/apache/lucene/pull/13208#discussion_r1537724985 ## lucene/core/src/java/org/apache/lucene/codecs/perfield/PerFieldMergeState.java: ## @@ -31,40 +31,18 @@ import org.apache.lucene.index.MergeState; import org.apache

Re: [PR] Avoid modify merge state in per field mergers [lucene]

2024-03-25 Thread via GitHub
dnhatn commented on code in PR #13208: URL: https://github.com/apache/lucene/pull/13208#discussion_r1537724628 ## lucene/core/src/java/org/apache/lucene/index/MergeState.java: ## @@ -23,13 +23,7 @@ import java.util.Locale; import java.util.concurrent.Executor; import java.uti

Re: [PR] Avoid modify merge state in per field mergers [lucene]

2024-03-25 Thread via GitHub
jpountz commented on code in PR #13208: URL: https://github.com/apache/lucene/pull/13208#discussion_r1537732974 ## lucene/core/src/java/org/apache/lucene/index/MergeState.java: ## @@ -266,4 +266,35 @@ static PackedLongValues removeDeletes(final int maxDoc, final Bits liveDocs)

Re: [PR] Speed up writeGroupVInts [lucene]

2024-03-25 Thread via GitHub
easyice commented on code in PR #13203: URL: https://github.com/apache/lucene/pull/13203#discussion_r1537749847 ## lucene/core/src/java/org/apache/lucene/util/GroupVIntUtil.java: ## @@ -111,4 +112,40 @@ public static int readGroupVInt( pos += 1 + n4Minus1; return (int)

Re: [PR] Speed up writeGroupVInts [lucene]

2024-03-25 Thread via GitHub
easyice commented on PR #13203: URL: https://github.com/apache/lucene/pull/13203#issuecomment-2018225641 Thanks for reviewing!

Re: [PR] Avoid modify merge state in per field mergers [lucene]

2024-03-25 Thread via GitHub
dnhatn commented on code in PR #13208: URL: https://github.com/apache/lucene/pull/13208#discussion_r1537756776 ## lucene/core/src/java/org/apache/lucene/index/MergeState.java: ## @@ -266,4 +266,35 @@ static PackedLongValues removeDeletes(final int maxDoc, final Bits liveDocs) {

Re: [PR] Avoid modify merge state in per field mergers [lucene]

2024-03-25 Thread via GitHub
jpountz commented on code in PR #13208: URL: https://github.com/apache/lucene/pull/13208#discussion_r1537788640 ## lucene/core/src/java/org/apache/lucene/index/MergeState.java: ## @@ -266,4 +266,35 @@ static PackedLongValues removeDeletes(final int maxDoc, final Bits liveDocs)

Re: [PR] Reduce some unnecessary ArrayUtil#grow calls [lucene]

2024-03-25 Thread via GitHub
easyice commented on PR #13171: URL: https://github.com/apache/lucene/pull/13171#issuecomment-2018304676 @stefanvodita Thank you for looking into this! When an array does not need to grow, `ArrayUtil.grow` returns the input `array`, and the caller then performs an extra assignment like `ref.b
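
A minimal sketch of the pattern under discussion, using a simplified stand-in for a grow helper rather than Lucene's actual `ArrayUtil.grow`. The `Holder` class, its `data` field, and both `ensure...` methods are hypothetical names. When the array is already large enough, the helper returns the same array, so the unguarded call site still makes a method call and a redundant field write that the guarded one skips.

```java
import java.util.Arrays;

// Hypothetical sketch class; not part of Lucene.
class GrowCheckSketch {
  // Simplified stand-in: returns the input array unchanged when it is already
  // large enough, otherwise returns a larger copy (oversized by about 50%).
  static byte[] grow(byte[] array, int minSize) {
    if (array.length >= minSize) {
      return array;
    }
    int newSize = Math.max(minSize, array.length + (array.length >> 1));
    return Arrays.copyOf(array, newSize);
  }

  // Hypothetical holder with a growable buffer.
  static class Holder {
    byte[] data = new byte[8];
  }

  // Always calls grow and always writes the field, even when nothing changes.
  static void ensureUnguarded(Holder holder, int minSize) {
    holder.data = grow(holder.data, minSize);
  }

  // Checks the length first, so the common "large enough" case does no work.
  static void ensureGuarded(Holder holder, int minSize) {
    if (holder.data.length < minSize) {
      holder.data = grow(holder.data, minSize);
    }
  }

  public static void main(String[] args) {
    Holder holder = new Holder();
    ensureGuarded(holder, 20);
    System.out.println(holder.data.length);
  }
}
```

Both call sites end with the same buffer. Whether the guard is worth the extra line depends on how hot the call site is, which is the trade-off the thread goes on to discuss.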

Re: [PR] Convert IOContext, MergeInfo, and FlushInfo to record classes [lucene]

2024-03-25 Thread via GitHub
uschindler commented on code in PR #13205: URL: https://github.com/apache/lucene/pull/13205#discussion_r1537821986 ## lucene/core/src/java/org/apache/lucene/store/IOContext.java: ## @@ -62,91 +55,36 @@ public enum Context { public static final IOContext LOAD = new IOContext

Re: [PR] Add support for posix_madvise to Java 21 MMapDirectory [lucene]

2024-03-25 Thread via GitHub
uschindler commented on PR #13196: URL: https://github.com/apache/lucene/pull/13196#issuecomment-2018328574 @jpountz Are you fine with merging?

Re: [PR] Avoid modify merge state in per field mergers [lucene]

2024-03-25 Thread via GitHub
dnhatn merged PR #13208: URL: https://github.com/apache/lucene/pull/13208

Re: [PR] Avoid modify merge state in per field mergers (#13208) [lucene]

2024-03-25 Thread via GitHub
dnhatn merged PR #13212: URL: https://github.com/apache/lucene/pull/13212

Re: [I] Regarding the frequency used for scoring sloppy phrase queries. [lucene]

2024-03-25 Thread via GitHub
odelmarcelle commented on issue #13152: URL: https://github.com/apache/lucene/issues/13152#issuecomment-2018475392 That's my personal opinion, but given that it's up to the user to allow sloppy matches, I question why the score penalty even exists. If a user wants to rank higher exact match

Re: [PR] Avoid modify merge state in per field mergers [lucene]

2024-03-25 Thread via GitHub
benwtrent commented on PR #13208: URL: https://github.com/apache/lucene/pull/13208#issuecomment-2018483694 Huzzah! Thanks @dnhatn @jpountz !!!

Re: [PR] Add support for posix_madvise to Java 21 MMapDirectory [lucene]

2024-03-25 Thread via GitHub
uschindler merged PR #13196: URL: https://github.com/apache/lucene/pull/13196

[PR] Add support for posix_madvise to Java 21 MMapDirectory (backport) [lucene]

2024-03-25 Thread via GitHub
uschindler opened a new pull request, #13213: URL: https://github.com/apache/lucene/pull/13213 backport of #13196

Re: [PR] Avoid modify merge state in per field mergers [lucene]

2024-03-25 Thread via GitHub
mikemccand commented on PR #13208: URL: https://github.com/apache/lucene/pull/13208#issuecomment-2018667449 Thank you @dnhatn! I've kicked off a one-off nightly benchy run ...

Re: [PR] Add support for posix_madvise to Java 21 MMapDirectory [lucene]

2024-03-25 Thread via GitHub
uschindler commented on PR #13196: URL: https://github.com/apache/lucene/pull/13196#issuecomment-2018749550 The test added by @ChrisHegarty sometimes fails on Windows: it does not close the file it opened for random access testing, so the directory can't be deleted. Will fix this in a separ

Re: [PR] Add support for posix_madvise to Java 21 MMapDirectory (backport) [lucene]

2024-03-25 Thread via GitHub
uschindler merged PR #13213: URL: https://github.com/apache/lucene/pull/13213

Re: [PR] Add support for posix_madvise to Java 21 MMapDirectory [lucene]

2024-03-25 Thread via GitHub
uschindler commented on PR #13196: URL: https://github.com/apache/lucene/pull/13196#issuecomment-2018772367 I fixed the test in https://github.com/apache/lucene/commit/ae5d3534e3ef44ff3336dea0308d8a82f0672ff2

Re: [PR] Made the UnifiedHighlighter's hasUnrecognizedQuery function processes FunctionQuery the same way as MatchAllDocsQuery and MatchNoDocsQuery queries for performance reasons. [lucene]

2024-03-25 Thread via GitHub
vletard commented on code in PR #13165: URL: https://github.com/apache/lucene/pull/13165#discussion_r1538165525 ## lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/UnifiedHighlighter.java: ## @@ -1130,7 +1134,16 @@ public boolean acceptField(String field) {

Re: [PR] Convert IOContext, MergeInfo, and FlushInfo to record classes [lucene]

2024-03-25 Thread via GitHub
uschindler commented on PR #13205: URL: https://github.com/apache/lucene/pull/13205#issuecomment-2018817418 Hi @ChrisHegarty, can you have another look at the crazy randomAccess flag after I merged main into this? Especially the checks in the record's constructor should be checked again.

Re: [PR] Reduce some unnecessary ArrayUtil#grow calls [lucene]

2024-03-25 Thread via GitHub
stefanvodita commented on PR #13171: URL: https://github.com/apache/lucene/pull/13171#issuecomment-2018978359 Thanks for the explanation! The way I think about it is whether it would be ok to make this check every time we call `grow`. Probably not, it's nice that `grow` does the check for y

Re: [PR] Made DocIdsWriter use DISI when reading documents with an IntersectVisitor [lucene]

2024-03-25 Thread via GitHub
jpountz commented on code in PR #13149: URL: https://github.com/apache/lucene/pull/13149#discussion_r1538276505 ## lucene/core/src/java/org/apache/lucene/search/PointRangeQuery.java: ## @@ -222,6 +230,14 @@ public void visit(DocIdSetIterator iterator) throws IOException {

Re: [I] Remove IndexSearcher#search(Query,Collector) in favor of IndexSearcher#search(Query,CollectorManager) [LUCENE-10002] [lucene]

2024-03-25 Thread via GitHub
wardle commented on issue #11041: URL: https://github.com/apache/lucene/issues/11041#issuecomment-2018982902 Hi all. There is a small overhead with using CollectorManager over Collector. In my own usage, I have a read-only index which uses only a single slice, and I've chosen not to provide