[GitHub] [lucene] LuXugang commented on a diff in pull request #12017: Aggressive `count` in BooleanWeight

2022-12-15 Thread GitBox
LuXugang commented on code in PR #12017: URL: https://github.com/apache/lucene/pull/12017#discussion_r1049162986 ## lucene/core/src/test/org/apache/lucene/search/TestBooleanQuery.java: ## @@ -1015,6 +1015,80 @@ public void testDisjunctionRandomClausesMatchesCount() throws Excep

[GitHub] [lucene] craigtaverner opened a new issue, #12020: Very flat polygons give incorrect 'contains' result

2022-12-15 Thread GitBox
craigtaverner opened a new issue, #12020: URL: https://github.com/apache/lucene/issues/12020 ### Description When performing a search using a shape geometry query of relation type `QueryRelation.CONTAINS`, it is possible to get a false positive when two geometries intersect, but neit

[GitHub] [lucene] nosvalds opened a new issue, #12021: Large fields with large="true" can be truncated in v9+

2022-12-15 Thread GitBox
nosvalds opened a new issue, #12021: URL: https://github.com/apache/lucene/issues/12021 ### Description ## Issue For fields using `large="true"`, large fields (which is what they are intended for) can be truncated in v9+ of Lucene. Example fieldtype definition: ```

[GitHub] [lucene] craigtaverner opened a new pull request, #12022: Fix flat polygons incorrectly containing intersecting geometries

2022-12-15 Thread GitBox
craigtaverner opened a new pull request, #12022: URL: https://github.com/apache/lucene/pull/12022 Fixes https://github.com/apache/lucene/issues/12020 ### Description When performing a search using a shape geometry query of relation type `QueryRelation.CONTAINS`, it is possible

[GitHub] [lucene] iverase commented on a diff in pull request #12022: Fix flat polygons incorrectly containing intersecting geometries

2022-12-15 Thread GitBox
iverase commented on code in PR #12022: URL: https://github.com/apache/lucene/pull/12022#discussion_r1049521314 ## lucene/CHANGES.txt: ## @@ -68,6 +68,8 @@ Bug Fixes * LUCENE-10599: LogMergePolicy is more likely to keep merging segments until they reach the maximum merge siz

[GitHub] [lucene] rmuir commented on issue #12021: Large fields with large="true" can be truncated in v9+

2022-12-15 Thread GitBox
rmuir commented on issue #12021: URL: https://github.com/apache/lucene/issues/12021#issuecomment-1352977757 This looks like a bug in solr code (SolrDocumentFetcher) so I'd recommend opening a bug over at https://github.com/apache/solr -- This is an automated message from the Apache Git Se

[GitHub] [lucene] Bukhtawar opened a new issue, #12023: Mechanism to interrupt long-running/resource intensive queries

2022-12-15 Thread GitBox
Bukhtawar opened a new issue, #12023: URL: https://github.com/apache/lucene/issues/12023 ### Description As a part of https://github.com/opensearch-project/OpenSearch/issues/687 we detected that regex queries can run into tight loops for quite long. Below is the stack trace of the re

[GitHub] [lucene] rmuir closed issue #12023: Mechanism to interrupt long-running/resource intensive queries

2022-12-15 Thread GitBox
rmuir closed issue #12023: Mechanism to interrupt long-running/resource intensive queries URL: https://github.com/apache/lucene/issues/12023

[GitHub] [lucene] rmuir commented on issue #12023: Mechanism to interrupt long-running/resource intensive queries

2022-12-15 Thread GitBox
rmuir commented on issue #12023: URL: https://github.com/apache/lucene/issues/12023#issuecomment-1353025015 determinization has already been removed here. that is the problem.

[GitHub] [lucene] craigtaverner commented on a diff in pull request #12022: Fix flat polygons incorrectly containing intersecting geometries

2022-12-15 Thread GitBox
craigtaverner commented on code in PR #12022: URL: https://github.com/apache/lucene/pull/12022#discussion_r1049613431 ## lucene/CHANGES.txt: ## @@ -68,6 +68,8 @@ Bug Fixes * LUCENE-10599: LogMergePolicy is more likely to keep merging segments until they reach the maximum mer

[GitHub] [lucene] Bukhtawar commented on issue #12023: Mechanism to interrupt long-running/resource intensive queries

2022-12-15 Thread GitBox
Bukhtawar commented on issue #12023: URL: https://github.com/apache/lucene/issues/12023#issuecomment-1353034054 Thanks @rmuir I am aware this has been addressed, this issue was to primarily gather thoughts on other possible queries similar to this that might be expensive or running tight lo

[GitHub] [lucene] benwtrent opened a new pull request, #12024: Fix SimpleTextKnnVectorsReader to handle changes introduced in GITHUB#12004

2022-12-15 Thread GitBox
benwtrent opened a new pull request, #12024: URL: https://github.com/apache/lucene/pull/12024 `SimpleTextKnnVectorsReader` is used for recreation and testing. It needs to handle the new way of searching with BytesRef directly instead of always searching with `float`.

[GitHub] [lucene] rmuir commented on issue #12023: Mechanism to interrupt long-running/resource intensive queries

2022-12-15 Thread GitBox
rmuir commented on issue #12023: URL: https://github.com/apache/lucene/issues/12023#issuecomment-1353034922 not going to support Thread.interrupt or any nonsense like that. you already have the exitable reader: use that

[GitHub] [lucene] rmuir commented on issue #12023: Mechanism to interrupt long-running/resource intensive queries

2022-12-15 Thread GitBox
rmuir commented on issue #12023: URL: https://github.com/apache/lucene/issues/12023#issuecomment-1353044756 And the reason i am short with you, again, is because you still implement a garbage security model (no authentication required by default). Stop shipping insecure apps and you'l

[GitHub] [lucene] jpountz commented on a diff in pull request #12024: Fix SimpleTextKnnVectorsReader to handle changes introduced in GITHUB#12004

2022-12-15 Thread GitBox
jpountz commented on code in PR #12024: URL: https://github.com/apache/lucene/pull/12024#discussion_r1049643447 ## lucene/core/src/test/org/apache/lucene/search/TestVectorScorer.java: ## @@ -36,13 +36,26 @@ public class TestVectorScorer extends LuceneTestCase { public void

[GitHub] [lucene] nosvalds commented on issue #12021: Large fields with large="true" can be truncated in v9+

2022-12-15 Thread GitBox
nosvalds commented on issue #12021: URL: https://github.com/apache/lucene/issues/12021#issuecomment-1353077520 Sorry about that looks like the code link I had was from before the split. Moved this issue here: https://issues.apache.org/jira/browse/SOLR-16589

[GitHub] [lucene] nosvalds closed issue #12021: Large fields with large="true" can be truncated in v9+

2022-12-15 Thread GitBox
nosvalds closed issue #12021: Large fields with large="true" can be truncated in v9+ URL: https://github.com/apache/lucene/issues/12021

[GitHub] [lucene] benwtrent commented on a diff in pull request #12024: Fix SimpleTextKnnVectorsReader to handle changes introduced in GITHUB#12004

2022-12-15 Thread GitBox
benwtrent commented on code in PR #12024: URL: https://github.com/apache/lucene/pull/12024#discussion_r1049648525 ## lucene/core/src/test/org/apache/lucene/search/TestVectorScorer.java: ## @@ -36,13 +36,26 @@ public class TestVectorScorer extends LuceneTestCase { public vo

[GitHub] [lucene] jpountz merged pull request #12024: Fix SimpleTextKnnVectorsReader to handle changes introduced in GITHUB#12004

2022-12-15 Thread GitBox
jpountz merged PR #12024: URL: https://github.com/apache/lucene/pull/12024

[GitHub] [lucene] iverase merged pull request #12022: Fix flat polygons incorrectly containing intersecting geometries

2022-12-15 Thread GitBox
iverase merged PR #12022: URL: https://github.com/apache/lucene/pull/12022

[GitHub] [lucene] iverase closed issue #12020: Very flat polygons give incorrect 'contains' result

2022-12-15 Thread GitBox
iverase closed issue #12020: Very flat polygons give incorrect 'contains' result URL: https://github.com/apache/lucene/issues/12020

[GitHub] [lucene] iverase commented on pull request #12022: Fix flat polygons incorrectly containing intersecting geometries

2022-12-15 Thread GitBox
iverase commented on PR #12022: URL: https://github.com/apache/lucene/pull/12022#issuecomment-1353129113 Thanks @craigtaverner!

[GitHub] [lucene] Bukhtawar commented on issue #12023: Mechanism to interrupt long-running/resource intensive queries

2022-12-15 Thread GitBox
Bukhtawar commented on issue #12023: URL: https://github.com/apache/lucene/issues/12023#issuecomment-1353148073 Maybe will discuss the security part separately, but agree, one idea is to detect such queries and prevent running these queries in the first place, in this case(not the original

[GitHub] [lucene] uschindler commented on pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-15 Thread GitBox
uschindler commented on PR #12016: URL: https://github.com/apache/lucene/pull/12016#issuecomment-1353249709 Cool, thanks for the "huge whitespace" test!

[GitHub] [lucene] reta commented on pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-15 Thread GitBox
reta commented on PR #12016: URL: https://github.com/apache/lucene/pull/12016#issuecomment-1353254656 @rmuir @uschindler thanks a lot for HUGE help here guys!

[GitHub] [lucene] msokolov commented on a diff in pull request #11946: add similarity threshold for hnsw

2022-12-15 Thread GitBox
msokolov commented on code in PR #11946: URL: https://github.com/apache/lucene/pull/11946#discussion_r1049804451 ## lucene/core/src/java/org/apache/lucene/search/KnnVectorQuery.java: ## @@ -76,12 +91,29 @@ public KnnVectorQuery(String field, float[] target, int k) { * @throw

[GitHub] [lucene] msokolov commented on issue #12023: Mechanism to interrupt long-running/resource intensive queries

2022-12-15 Thread GitBox
msokolov commented on issue #12023: URL: https://github.com/apache/lucene/issues/12023#issuecomment-1353313592 Q: are you aware of https://github.com/apache/lucene/issues/11188? It's a fair question whether `ExitableDirectoryReader` is adequate for catching all runaway queries. There can be

[GitHub] [lucene] benwtrent commented on a diff in pull request #11946: add similarity threshold for hnsw

2022-12-15 Thread GitBox
benwtrent commented on code in PR #11946: URL: https://github.com/apache/lucene/pull/11946#discussion_r1049904819 ## lucene/core/src/java/org/apache/lucene/search/KnnVectorQuery.java: ## @@ -76,12 +91,29 @@ public KnnVectorQuery(String field, float[] target, int k) { * @thro

[GitHub] [lucene] msokolov commented on a diff in pull request #11946: add similarity threshold for hnsw

2022-12-15 Thread GitBox
msokolov commented on code in PR #11946: URL: https://github.com/apache/lucene/pull/11946#discussion_r1050171394 ## lucene/core/src/java/org/apache/lucene/search/KnnVectorQuery.java: ## @@ -76,12 +91,29 @@ public KnnVectorQuery(String field, float[] target, int k) { * @throw

[GitHub] [lucene] rmuir commented on a diff in pull request #11946: add similarity threshold for hnsw

2022-12-15 Thread GitBox
rmuir commented on code in PR #11946: URL: https://github.com/apache/lucene/pull/11946#discussion_r1050228933 ## lucene/core/src/java/org/apache/lucene/search/KnnVectorQuery.java: ## @@ -76,12 +91,29 @@ public KnnVectorQuery(String field, float[] target, int k) { * @throws I

[GitHub] [lucene] rmuir commented on pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-15 Thread GitBox
rmuir commented on PR #12016: URL: https://github.com/apache/lucene/pull/12016#issuecomment-1354141954 I forced regeneration with `./gradlew -p lucene/expressions regenerate --rerun-tasks` just to ensure there were no source code changes and regeneration is idempotent / reproducible.

[GitHub] [lucene] rmuir merged pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-15 Thread GitBox
rmuir merged PR #12016: URL: https://github.com/apache/lucene/pull/12016

[GitHub] [lucene] rmuir closed issue #11788: Upgrade ANTLR to version 4.11.1

2022-12-15 Thread GitBox
rmuir closed issue #11788: Upgrade ANTLR to version 4.11.1 URL: https://github.com/apache/lucene/issues/11788

[GitHub] [lucene] rmuir commented on issue #12023: Mechanism to interrupt long-running/resource intensive queries

2022-12-15 Thread GitBox
rmuir commented on issue #12023: URL: https://github.com/apache/lucene/issues/12023#issuecomment-1354168057 > Maybe will discuss the security part separately, but agree, one idea is to detect such queries and prevent running these queries in the first place, in this case(not the original is

[GitHub] [lucene] Bukhtawar commented on issue #12023: Mechanism to interrupt long-running/resource intensive queries

2022-12-15 Thread GitBox
Bukhtawar commented on issue #12023: URL: https://github.com/apache/lucene/issues/12023#issuecomment-1354235380 This specific case although the stack trace might appear the same wasn't a regex query but a wildcard and fuzzy query. I will share the redacted request