[GitHub] [lucene] fcofdez commented on a diff in pull request #11997: Add IntField, LongField, FloatField and DoubleField

2022-12-07 Thread GitBox
fcofdez commented on code in PR #11997: URL: https://github.com/apache/lucene/pull/11997#discussion_r1041891742 ## lucene/core/src/java/org/apache/lucene/document/DoubleField.java: ## @@ -0,0 +1,138 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more +

[GitHub] [lucene] uschindler commented on pull request #11999: Add support for stored fields to MemoryIndex

2022-12-07 Thread GitBox
uschindler commented on PR #11999: URL: https://github.com/apache/lucene/pull/11999#issuecomment-1340615081 Hi, I have no idea, what the reason for this issue/PR is (can I have a bit more information, is there an issue, too?), because MemoryIndex is mostly used for the old highlighting.

[GitHub] [lucene] jpountz commented on pull request #11995: draft pr

2022-12-07 Thread GitBox
jpountz commented on PR #11995: URL: https://github.com/apache/lucene/pull/11995#issuecomment-1340620512 > Thanks for your reply, does that mean I can add a new custom codec in Lucene? I was more thinking you could maintain a codec with your own stored fields format in your own codeb

[GitHub] [lucene] romseygeek commented on pull request #11999: Add support for stored fields to MemoryIndex

2022-12-07 Thread GitBox
romseygeek commented on PR #11999: URL: https://github.com/apache/lucene/pull/11999#issuecomment-1340643868 MemoryIndex is used by the monitor module and by the ES percolator as well, and that's where we may have issues - there are some slow queries in ES that use stored fields to search no

[GitHub] [lucene] rmuir commented on pull request #11999: Add support for stored fields to MemoryIndex

2022-12-07 Thread GitBox
rmuir commented on PR #11999: URL: https://github.com/apache/lucene/pull/11999#issuecomment-1340950877 > there are some slow queries in ES that use stored fields to search non-indexed documents, and they won't work when run against a MemoryIndex. Sorry, this seems like its the problem

[GitHub] [lucene] rmuir commented on pull request #11999: Add support for stored fields to MemoryIndex

2022-12-07 Thread GitBox
rmuir commented on PR #11999: URL: https://github.com/apache/lucene/pull/11999#issuecomment-1340958535 i hate to be that guy, but -1, this isn't a real use-case. we all know you shouldnt search on stored fields. shame on you. -- This is an automated message from the Apache Git Service. To

[GitHub] [lucene] romseygeek commented on pull request #11999: Add support for stored fields to MemoryIndex

2022-12-07 Thread GitBox
romseygeek commented on PR #11999: URL: https://github.com/apache/lucene/pull/11999#issuecomment-1340992346 > you shouldnt search on stored fields It works surprisingly well as a confirmation phase in a two-phase query! The fact that memory indexes don't support stored fields al

[GitHub] [lucene] rmuir commented on pull request #11999: Add support for stored fields to MemoryIndex

2022-12-07 Thread GitBox
rmuir commented on PR #11999: URL: https://github.com/apache/lucene/pull/11999#issuecomment-1340995971 i'm sticking with my -1. i will also keep close watch on all stored fields issues now, to try to make sure it isn't for "searching abuse cases" like this. I think, maybe we should us

[GitHub] [lucene] romseygeek commented on pull request #11999: Add support for stored fields to MemoryIndex

2022-12-07 Thread GitBox
romseygeek commented on PR #11999: URL: https://github.com/apache/lucene/pull/11999#issuecomment-1341120986 To clarify, is your objection to adding support for `document()` to the only `LeafReader` implementation that doesn't have it, or is it that `document()` is a method on `LeafReader` i

[GitHub] [lucene] rmuir commented on pull request #11999: Add support for stored fields to MemoryIndex

2022-12-07 Thread GitBox
rmuir commented on PR #11999: URL: https://github.com/apache/lucene/pull/11999#issuecomment-1341154505 it doesn't need stored fields because stored fields aren't for searching on. PERIOD. -- This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [lucene] romseygeek commented on pull request #11999: Add support for stored fields to MemoryIndex

2022-12-07 Thread GitBox
romseygeek commented on PR #11999: URL: https://github.com/apache/lucene/pull/11999#issuecomment-1341167622 You don't have an objection to them being used for highlighting though, right? In the elasticsearch percolator we have a bunch of workarounds to fetch stored fields from elsewhere whe

[GitHub] [lucene] uschindler commented on a diff in pull request #11999: Add support for stored fields to MemoryIndex

2022-12-07 Thread GitBox
uschindler commented on code in PR #11999: URL: https://github.com/apache/lucene/pull/11999#discussion_r1042400407 ## lucene/memory/src/java/org/apache/lucene/index/memory/MemoryIndex.java: ## @@ -1797,9 +1825,34 @@ public int maxDoc() { } @Override -public void

[GitHub] [lucene] rmuir commented on pull request #11999: Add support for stored fields to MemoryIndex

2022-12-07 Thread GitBox
rmuir commented on PR #11999: URL: https://github.com/apache/lucene/pull/11999#issuecomment-1341220984 If you want to search on a field, you index it. This is lucene 101. I have to push back on the idea that its ok to search on the stored fields: it isn't! I will be watching closely n

[GitHub] [lucene] romseygeek commented on pull request #11999: Add support for stored fields to MemoryIndex

2022-12-07 Thread GitBox
romseygeek commented on PR #11999: URL: https://github.com/apache/lucene/pull/11999#issuecomment-1341240534 > The stored fields are intended for returning the top-N summary results, and code around them should be optimized just for that. And with this change you can now use a MemoryIn

[GitHub] [lucene] rmuir commented on pull request #11999: Add support for stored fields to MemoryIndex

2022-12-07 Thread GitBox
rmuir commented on PR #11999: URL: https://github.com/apache/lucene/pull/11999#issuecomment-1341244642 It isn't needed, because the memoryindex only really needs the postings to match what the query is doing. The highlighter has been working for years with memoryindex, without storedfields

[GitHub] [lucene] rmuir commented on pull request #11999: Add support for stored fields to MemoryIndex

2022-12-07 Thread GitBox
rmuir commented on PR #11999: URL: https://github.com/apache/lucene/pull/11999#issuecomment-1341245393 and there's no way in hell you can convince me that we should allow it. we should absolutely not support such nonsense in lucene. -- This is an automated message from the Apache Git Serv

[GitHub] [lucene] gvaysman-at-github opened a new issue, #12000: Lucene-facet leaves ThreadLocal that creates a memory leak

2022-12-07 Thread GitBox
gvaysman-at-github opened a new issue, #12000: URL: https://github.com/apache/lucene/issues/12000 ### Description When the web application that uses Lucene Facets is shut down (say, from Tomcat Manager app), ThreadLocal from UTF8TaxonomyWriterCache (the field bytes) is never cleaned

[GitHub] [lucene] rmuir commented on issue #12000: Lucene-facet leaves ThreadLocal that creates a memory leak

2022-12-07 Thread GitBox
rmuir commented on issue #12000: URL: https://github.com/apache/lucene/issues/12000#issuecomment-1341271839 Thanks @gvaysman-at-github for reporting this. I was unaware of this threadlocal, indeed it looks problematic just from the description in this cache: ``` /** A "cache" that

[GitHub] [lucene] rmuir commented on issue #12000: Lucene-facet leaves ThreadLocal that creates a memory leak

2022-12-07 Thread GitBox
rmuir commented on issue #12000: URL: https://github.com/apache/lucene/issues/12000#issuecomment-1341280462 The UTF8TaxonomyWriterCache does have a `public void close() {}` but I'm not sure when/if it gets invoked as I'm not familiar with this cache and will need to look more into it. But o

[GitHub] [lucene] romseygeek commented on pull request #11999: Add support for stored fields to MemoryIndex

2022-12-07 Thread GitBox
romseygeek commented on PR #11999: URL: https://github.com/apache/lucene/pull/11999#issuecomment-1341283678 The elasticsearch highlighter for the percolator has been working for years with a really annoying workaround, because MemoryIndex is the only LeafReader implementation in lucene that

[GitHub] [lucene] rmuir commented on issue #12000: Lucene-facet leaves ThreadLocal that creates a memory leak

2022-12-07 Thread GitBox
rmuir commented on issue #12000: URL: https://github.com/apache/lucene/issues/12000#issuecomment-1341291174 A possible workaround for now might be to configure `LruTaxonomyWriterCache` instead of the default `UTF8TaxonomyWriterCache`. The LRU-based implementation doesn't use any threadlocal

[GitHub] [lucene] rmuir commented on pull request #11999: Add support for stored fields to MemoryIndex

2022-12-07 Thread GitBox
rmuir commented on PR #11999: URL: https://github.com/apache/lucene/pull/11999#issuecomment-1341303231 > The elasticsearch highlighter for the percolator has been working for years with a really annoying workaround, because MemoryIndex is the only LeafReader implementation in lucene that do

[GitHub] [lucene] msokolov commented on a diff in pull request #11860: GITHUB-11830 Better optimize storage for vector connections

2022-12-07 Thread GitBox
msokolov commented on code in PR #11860: URL: https://github.com/apache/lucene/pull/11860#discussion_r1042658193 ## lucene/backward-codecs/src/java/org/apache/lucene/backward_codecs/lucene94/ExpandingVectorValues.java: ## @@ -0,0 +1,49 @@ +/* + * Licensed to the Apache Software

[GitHub] [lucene] msokolov commented on a diff in pull request #11860: GITHUB-11830 Better optimize storage for vector connections

2022-12-07 Thread GitBox
msokolov commented on code in PR #11860: URL: https://github.com/apache/lucene/pull/11860#discussion_r1042658193 ## lucene/backward-codecs/src/java/org/apache/lucene/backward_codecs/lucene94/ExpandingVectorValues.java: ## @@ -0,0 +1,49 @@ +/* + * Licensed to the Apache Software

[GitHub] [lucene] benwtrent commented on a diff in pull request #11860: GITHUB-11830 Better optimize storage for vector connections

2022-12-07 Thread GitBox
benwtrent commented on code in PR #11860: URL: https://github.com/apache/lucene/pull/11860#discussion_r1042673843 ## lucene/backward-codecs/src/java/org/apache/lucene/backward_codecs/lucene94/ExpandingVectorValues.java: ## @@ -0,0 +1,49 @@ +/* + * Licensed to the Apache Software

[GitHub] [lucene] jpountz commented on pull request #11972: Generalize range query optimization on sorted indexes to descending sorts.

2022-12-07 Thread GitBox
jpountz commented on PR #11972: URL: https://github.com/apache/lucene/pull/11972#issuecomment-1342208682 It is a bit tricky indeed, thanks for having a look @gsmiller ! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use t

[GitHub] [lucene] jpountz merged pull request #11972: Generalize range query optimization on sorted indexes to descending sorts.

2022-12-07 Thread GitBox
jpountz merged PR #11972: URL: https://github.com/apache/lucene/pull/11972 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apa

[GitHub] [lucene] jpountz merged pull request #11964: Make RandomAccessVectorValues an implementation detail of HNSW implementations rather than a proper API.

2022-12-07 Thread GitBox
jpountz merged PR #11964: URL: https://github.com/apache/lucene/pull/11964 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apa

[GitHub] [lucene] jpountz closed issue #10623: How should we expose VectorValues.RandomAccess? [LUCENE-9583]

2022-12-07 Thread GitBox
jpountz closed issue #10623: How should we expose VectorValues.RandomAccess? [LUCENE-9583] URL: https://github.com/apache/lucene/issues/10623 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the spec