[GitHub] [lucene] msokolov commented on a diff in pull request #12560: Defer #advanceExact on expression dependencies until their values are needed

2023-09-15 Thread via GitHub
msokolov commented on code in PR #12560: URL: https://github.com/apache/lucene/pull/12560#discussion_r1327858524 ## lucene/expressions/src/java/org/apache/lucene/expressions/ExpressionFunctionValues.java: ## @@ -39,21 +39,21 @@ class ExpressionFunctionValues extends DoubleValues

[GitHub] [lucene] gsmiller merged pull request #12560: Defer #advanceExact on expression dependencies until their values are needed

2023-09-15 Thread via GitHub
gsmiller merged PR #12560: URL: https://github.com/apache/lucene/pull/12560 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.ap

[GitHub] [lucene] gsmiller commented on pull request #12560: Defer #advanceExact on expression dependencies until their values are needed

2023-09-15 Thread via GitHub
gsmiller commented on PR #12560: URL: https://github.com/apache/lucene/pull/12560#issuecomment-1722012958 Thanks @zhaih / @msokolov ! Merging.

[GitHub] [lucene] epotyom commented on a diff in pull request #12555: Fix: Lucene90DocValuesProducer.TermsDict.seekCeil doesn't always position bytes correctly (#12167)

2023-09-15 Thread via GitHub
epotyom commented on code in PR #12555: URL: https://github.com/apache/lucene/pull/12555#discussion_r1327853406 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/Lucene90DocValuesProducer.java: ## @@ -1205,7 +1205,15 @@ public SeekStatus seekCeil(BytesRef text) throws I

[GitHub] [lucene] eraneverlaw opened a new issue, #12561: UAX29URLEmailTokenizerImpl.jflex matches emails with commas and invalid periods in the local part

2023-09-15 Thread via GitHub
eraneverlaw opened a new issue, #12561: URL: https://github.com/apache/lucene/issues/12561 ### Description The `UAX29URLEmailTokenizerImpl.jflex` code matches commas as part of email local part, as well as invalid leading, trailing, or consecutive periods. Examples of bad matches: `f
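
For illustration only, the local-part rules the issue names (no commas, no leading, trailing, or consecutive periods) can be sketched as a standalone check. This assumes the RFC 5322 dot-atom form is the intended behaviour, which the issue text shown here does not state; it is not the JFlex grammar, and the class name, method name, and sample strings are hypothetical.

```java
import java.util.regex.Pattern;

final class LocalPartCheck {
  // atext characters as listed in RFC 5322, section 3.2.3 (no comma, no period).
  private static final String ATEXT = "[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]";

  // dot-atom-text: runs of atext separated by single periods.
  private static final Pattern DOT_ATOM =
      Pattern.compile(ATEXT + "+(?:\\." + ATEXT + "+)*");

  /** Returns true if the whole local part is a dot-atom (hypothetical helper). */
  static boolean isDotAtom(String localPart) {
    return DOT_ATOM.matcher(localPart).matches();
  }

  public static void main(String[] args) {
    // Hypothetical samples; the issue's own examples are truncated above.
    String[] samples = {"first.last", "first,last", ".first", "last.", "first..last"};
    for (String s : samples) {
      System.out.println(s + " -> " + isDotAtom(s));
    }
  }
}
```

Quoted local parts, which RFC 5322 also permits, are deliberately left out of this sketch.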

[GitHub] [lucene] gsmiller commented on a diff in pull request #12560: Defer #advanceExact on expression dependencies until their values are needed

2023-09-15 Thread via GitHub
gsmiller commented on code in PR #12560: URL: https://github.com/apache/lucene/pull/12560#discussion_r1327831100 ## lucene/expressions/src/java/org/apache/lucene/expressions/ExpressionValueSource.java: ## @@ -90,16 +90,24 @@ public DoubleValues getValues(LeafReaderContext reade

[GitHub] [lucene] gsmiller commented on a diff in pull request #12560: Defer #advanceExact on expression dependencies until their values are needed

2023-09-15 Thread via GitHub
gsmiller commented on code in PR #12560: URL: https://github.com/apache/lucene/pull/12560#discussion_r1327829161 ## lucene/expressions/src/java/org/apache/lucene/expressions/ExpressionFunctionValues.java: ## @@ -39,21 +39,21 @@ class ExpressionFunctionValues extends DoubleValues

[GitHub] [lucene] gsmiller commented on a diff in pull request #12560: Defer #advanceExact on expression dependencies until their values are needed

2023-09-15 Thread via GitHub
gsmiller commented on code in PR #12560: URL: https://github.com/apache/lucene/pull/12560#discussion_r1327828846 ## lucene/expressions/src/java/org/apache/lucene/expressions/ExpressionFunctionValues.java: ## @@ -39,21 +39,21 @@ class ExpressionFunctionValues extends DoubleValues

[GitHub] [lucene] gsmiller commented on a diff in pull request #12560: Defer #advanceExact on expression dependencies until their values are needed

2023-09-15 Thread via GitHub
gsmiller commented on code in PR #12560: URL: https://github.com/apache/lucene/pull/12560#discussion_r1327828413 ## lucene/expressions/src/java/org/apache/lucene/expressions/ExpressionFunctionValues.java: ## @@ -39,21 +39,21 @@ class ExpressionFunctionValues extends DoubleValues

[GitHub] [lucene] msokolov commented on a diff in pull request #12560: Defer #advanceExact on expression dependencies until their values are needed

2023-09-15 Thread via GitHub
msokolov commented on code in PR #12560: URL: https://github.com/apache/lucene/pull/12560#discussion_r1327797753 ## lucene/expressions/src/java/org/apache/lucene/expressions/ExpressionFunctionValues.java: ## @@ -39,21 +39,21 @@ class ExpressionFunctionValues extends DoubleValues

[GitHub] [lucene] zhaih commented on a diff in pull request #12560: Defer #advanceExact on expression dependencies until their values are needed

2023-09-15 Thread via GitHub
zhaih commented on code in PR #12560: URL: https://github.com/apache/lucene/pull/12560#discussion_r1327778019 ## lucene/expressions/src/java/org/apache/lucene/expressions/ExpressionFunctionValues.java: ## @@ -39,21 +39,21 @@ class ExpressionFunctionValues extends DoubleValues {

[GitHub] [lucene] gsmiller commented on a diff in pull request #12560: Defer #advanceExact on expression dependencies until their values are needed

2023-09-15 Thread via GitHub
gsmiller commented on code in PR #12560: URL: https://github.com/apache/lucene/pull/12560#discussion_r1327725540 ## lucene/expressions/src/java/org/apache/lucene/expressions/ExpressionFunctionValues.java: ## @@ -39,21 +39,21 @@ class ExpressionFunctionValues extends DoubleValues

[GitHub] [lucene] zhaih commented on a diff in pull request #12560: Defer #advanceExact on expression dependencies until their values are needed

2023-09-15 Thread via GitHub
zhaih commented on code in PR #12560: URL: https://github.com/apache/lucene/pull/12560#discussion_r1327707056 ## lucene/expressions/src/java/org/apache/lucene/expressions/ExpressionFunctionValues.java: ## @@ -39,21 +39,21 @@ class ExpressionFunctionValues extends DoubleValues {

[GitHub] [lucene] gsmiller commented on pull request #12559: Choose sparse values in IntTaxonomyFacets when FacetsCollector has em…

2023-09-15 Thread via GitHub
gsmiller commented on PR #12559: URL: https://github.com/apache/lucene/pull/12559#issuecomment-1721725750 It's still a bit unclear to me how we can get in a state where `maxDoc` is zero. Do we understand how this is happening? Seems like possibly a bigger/different issue that we should fix

[GitHub] [lucene] gsmiller commented on issue #12558: IntTaxonomyFacets chooses dense values array when FacetsCollector has no MatchingDocs

2023-09-15 Thread via GitHub
gsmiller commented on issue #12558: URL: https://github.com/apache/lucene/issues/12558#issuecomment-1721652878 Thanks for opening this issue @Shradha26! Do we have a test case that reproduces this? I'm still a little confused on how we can actually arrive in this state?

[GitHub] [lucene] gsmiller commented on a diff in pull request #12560: Defer #advanceExact on expression dependencies until their values are needed

2023-09-15 Thread via GitHub
gsmiller commented on code in PR #12560: URL: https://github.com/apache/lucene/pull/12560#discussion_r1327624072 ## lucene/expressions/src/java/org/apache/lucene/expressions/ExpressionFunctionValues.java: ## @@ -39,21 +39,21 @@ class ExpressionFunctionValues extends DoubleValues

[GitHub] [lucene] zhaih commented on a diff in pull request #12560: Defer #advanceExact on expression dependencies until their values are needed

2023-09-15 Thread via GitHub
zhaih commented on code in PR #12560: URL: https://github.com/apache/lucene/pull/12560#discussion_r1327603749 ## lucene/expressions/src/java/org/apache/lucene/expressions/ExpressionFunctionValues.java: ## @@ -39,21 +39,21 @@ class ExpressionFunctionValues extends DoubleValues {

[GitHub] [lucene] zhaih commented on pull request #12555: Fix: Lucene90DocValuesProducer.TermsDict.seekCeil doesn't always position bytes correctly (#12167)

2023-09-15 Thread via GitHub
zhaih commented on PR #12555: URL: https://github.com/apache/lucene/pull/12555#issuecomment-1721616521 Also pls add an entry to CHANGES.txt :)

[GitHub] [lucene] zhaih commented on a diff in pull request #12555: Fix: Lucene90DocValuesProducer.TermsDict.seekCeil doesn't always position bytes correctly (#12167)

2023-09-15 Thread via GitHub
zhaih commented on code in PR #12555: URL: https://github.com/apache/lucene/pull/12555#discussion_r1327593126 ## lucene/core/src/test/org/apache/lucene/codecs/lucene90/TestLucene90DocValuesFormat.java: ## @@ -958,4 +971,61 @@ public void testTermsEnumDictionary() throws IOExcept

[GitHub] [lucene] zhaih commented on a diff in pull request #12555: Fix: Lucene90DocValuesProducer.TermsDict.seekCeil doesn't always position bytes correctly (#12167)

2023-09-15 Thread via GitHub
zhaih commented on code in PR #12555: URL: https://github.com/apache/lucene/pull/12555#discussion_r1327591088 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/Lucene90DocValuesProducer.java: ## @@ -1205,7 +1205,15 @@ public SeekStatus seekCeil(BytesRef text) throws IOE

[GitHub] [lucene] gsmiller opened a new pull request, #12560: Defer #advanceExact on expression dependencies until their values are needed

2023-09-15 Thread via GitHub
gsmiller opened a new pull request, #12560: URL: https://github.com/apache/lucene/pull/12560 ### Description This extends the idea in GH#11878 to avoid advancing dependencies that are never referenced because of expression branching (i.e., ternary expressions). I think we should be
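
The idea of deferring `advanceExact` until a value is actually read can be sketched in a few lines. This is a minimal illustration, not the code in the pull request: `DocDoubleValues` is a hypothetical stand-in for Lucene's `DoubleValues` (without the checked `IOException`), and the wrapper, counter, and demo class names are assumptions made for the example.

```java
/** Hypothetical stand-in for Lucene's DoubleValues; not the real class. */
interface DocDoubleValues {
  boolean advanceExact(int doc);

  double doubleValue();
}

/** Wraps a dependency and defers its advanceExact until the value is first read. */
final class LazyDocDoubleValues implements DocDoubleValues {
  private final DocDoubleValues delegate;
  private int currentDoc = -1;
  private boolean advanced = false;
  private boolean hasValue = false;

  LazyDocDoubleValues(DocDoubleValues delegate) {
    this.delegate = delegate;
  }

  @Override
  public boolean advanceExact(int doc) {
    // Only remember the target document; the delegate is not touched yet.
    currentDoc = doc;
    advanced = false;
    return true;
  }

  @Override
  public double doubleValue() {
    if (!advanced) {
      // First read for this document: position the delegate now.
      hasValue = delegate.advanceExact(currentDoc);
      advanced = true;
    }
    // Assumption for this sketch: a missing value reads as 0.
    return hasValue ? delegate.doubleValue() : 0.0;
  }
}

/** Test double that counts how often it is advanced. */
final class CountingValues implements DocDoubleValues {
  int advanceCalls = 0;
  private final double value;

  CountingValues(double value) {
    this.value = value;
  }

  @Override
  public boolean advanceExact(int doc) {
    advanceCalls++;
    return true;
  }

  @Override
  public double doubleValue() {
    return value;
  }
}

final class LazyAdvanceDemo {
  /** Evaluates "cond != 0 ? a : b" for one document. */
  static double ternary(int doc, DocDoubleValues cond, DocDoubleValues a, DocDoubleValues b) {
    cond.advanceExact(doc);
    a.advanceExact(doc);
    b.advanceExact(doc);
    return cond.doubleValue() != 0.0 ? a.doubleValue() : b.doubleValue();
  }

  public static void main(String[] args) {
    CountingValues cond = new CountingValues(1.0);
    CountingValues a = new CountingValues(10.0);
    CountingValues b = new CountingValues(20.0);
    double result =
        ternary(7, new LazyDocDoubleValues(cond), new LazyDocDoubleValues(a), new LazyDocDoubleValues(b));
    System.out.println("result=" + result + " a=" + a.advanceCalls + " b=" + b.advanceCalls);
  }
}
```

In this sketch the branch that the ternary does not take is never advanced, which is the saving the description refers to; the cost is one extra boolean check per read.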

[GitHub] [lucene] mikemccand commented on a diff in pull request #12530: Fix CheckIndex to detect major corruption with old (not the latest) commit point

2023-09-15 Thread via GitHub
mikemccand commented on code in PR #12530: URL: https://github.com/apache/lucene/pull/12530#discussion_r1327380574 ## lucene/core/src/java/org/apache/lucene/index/CheckIndex.java: ## @@ -610,6 +610,39 @@ public Status checkIndex(List onlySegments, ExecutorService executorServ

[GitHub] [lucene] gokaai commented on a diff in pull request #12530: Fix CheckIndex to detect major corruption with old (not the latest) commit point

2023-09-15 Thread via GitHub
gokaai commented on code in PR #12530: URL: https://github.com/apache/lucene/pull/12530#discussion_r1327351474 ## lucene/core/src/java/org/apache/lucene/index/CheckIndex.java: ## @@ -610,6 +610,31 @@ public Status checkIndex(List onlySegments, ExecutorService executorServ

[GitHub] [lucene] Shradha26 opened a new pull request, #12559: Choose sparse values in IntTaxonomyFacets when FacetsCollector has em…

2023-09-15 Thread via GitHub
Shradha26 opened a new pull request, #12559: URL: https://github.com/apache/lucene/pull/12559 …pty MatchingDocs ### Description

[GitHub] [lucene] easyice opened a new pull request, #12557: Improve refresh speed with softdelete enable

2023-09-15 Thread via GitHub
easyice opened a new pull request, #12557: URL: https://github.com/apache/lucene/pull/12557 I found a flame graph in my production environment; the DocValuesConsumer for the `___soft_deletes` field accounted for a large proportion ![image](https://github.com/apache/lucene/assets/23521001

[GitHub] [lucene] epotyom commented on a diff in pull request #12555: Fix: Lucene90DocValuesProducer.TermsDict.seekCeil doesn't always position bytes correctly (#12167)

2023-09-15 Thread via GitHub
epotyom commented on code in PR #12555: URL: https://github.com/apache/lucene/pull/12555#discussion_r1327195378 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/Lucene90DocValuesProducer.java: ## @@ -1205,7 +1205,15 @@ public SeekStatus seekCeil(BytesRef text) throws I

[GitHub] [lucene] iverase commented on pull request #12460: Allow reading binary doc values as a DataInput

2023-09-15 Thread via GitHub
iverase commented on PR #12460: URL: https://github.com/apache/lucene/pull/12460#issuecomment-1720945832 Thanks @jpountz and @uschindler for the input. I had a look into `RandomAccessInput` and I don't think this is what we need. We need a DataInput that is position-aware so it supports seek