Re: [PR] Avoid unnecessary memory allocation in PackedLongValues#Iterator [lucene]

2024-06-03 Thread via GitHub
easyice merged PR #13439: URL: https://github.com/apache/lucene/pull/13439 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apa

Re: [PR] Avoid SegmentTermsEnumFrame reload block. [lucene]

2024-06-03 Thread via GitHub
vsop-479 commented on code in PR #13253: URL: https://github.com/apache/lucene/pull/13253#discussion_r1625269547 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java: ## @@ -434,8 +436,29 @@ public boolean seekExact(BytesRef target) throws I

Re: [PR] Implement Weight#count for vector values in the FieldExistsQuery [lucene]

2024-06-03 Thread via GitHub
bugmakerr commented on code in PR #13322: URL: https://github.com/apache/lucene/pull/13322#discussion_r1625261750 ## lucene/CHANGES.txt: ## @@ -399,6 +399,8 @@ Optimizations * GITHUB#13406: Replace List by IntArrayList and List by LongArrayList. (Bruno Roustant) +* GIT

Re: [PR] Prevent DefaultPassageFormatter from taking shorter overlapping passages [lucene]

2024-06-03 Thread via GitHub
github-actions[bot] commented on PR #13384: URL: https://github.com/apache/lucene/pull/13384#issuecomment-2146338618 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contributi

Re: [I] Significant drop in recall for int8 scalar quantization using maximum_inner_product [lucene]

2024-06-03 Thread via GitHub
jmazanec15 commented on issue #13350: URL: https://github.com/apache/lucene/issues/13350#issuecomment-2146125074 I see that makes sense. Thanks @benwtrent

Re: [PR] Implement Weight#count for vector values in the FieldExistsQuery [lucene]

2024-06-03 Thread via GitHub
benwtrent commented on code in PR #13322: URL: https://github.com/apache/lucene/pull/13322#discussion_r1624971924 ## lucene/CHANGES.txt: ## @@ -399,6 +399,8 @@ Optimizations * GITHUB#13406: Replace List by IntArrayList and List by LongArrayList. (Bruno Roustant) +* GITHUB#

Re: [PR] mention KnnVectorsFormat in o.a.l.codecs package javadocs [lucene]

2024-06-03 Thread via GitHub
benwtrent commented on PR #13448: URL: https://github.com/apache/lucene/pull/13448#issuecomment-2145518573 > OK ... well, it would sure be nice to expose a flat vector field so we can share the nice scalar quantization tools. Yep, a new format should be fairly simple.

Re: [PR] mention KnnVectorsFormat in o.a.l.codecs package javadocs [lucene]

2024-06-03 Thread via GitHub
msokolov commented on PR #13448: URL: https://github.com/apache/lucene/pull/13448#issuecomment-2145509626 I'm super confused - I thought that's what you had been working on! But .. I guess it was really more of an internal refactoring? OK ... well, it would sure be nice to expose a flat vec

Re: [PR] mention KnnVectorsFormat in o.a.l.codecs package javadocs [lucene]

2024-06-03 Thread via GitHub
msokolov merged PR #13448: URL: https://github.com/apache/lucene/pull/13448

Re: [PR] mention KnnVectorsFormat in o.a.l.codecs package javadocs [lucene]

2024-06-03 Thread via GitHub
benwtrent commented on PR #13448: URL: https://github.com/apache/lucene/pull/13448#issuecomment-2145464292 @msokolov to my knowledge, there is no flat format provided by Lucene at this time.

Re: [PR] mention KnnVectorsFormat in o.a.l.codecs package javadocs [lucene]

2024-06-03 Thread via GitHub
msokolov commented on PR #13448: URL: https://github.com/apache/lucene/pull/13448#issuecomment-2145416000 Note: I stumbled on this while trying to figure out how to index a vector field with no HNSW index ... I'm still not clear on how to do that ... we might need some more javadoc updates?

[PR] mention KnnVectorsFormat in o.a.l.codecs package javadocs [lucene]

2024-06-03 Thread via GitHub
msokolov opened a new pull request, #13448: URL: https://github.com/apache/lucene/pull/13448 I found a dusty corner we seem to have missed

Re: [PR] Avoid SegmentTermsEnumFrame reload block. [lucene]

2024-06-03 Thread via GitHub
mikemccand commented on code in PR #13253: URL: https://github.com/apache/lucene/pull/13253#discussion_r1624458258 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java: ## @@ -434,8 +436,29 @@ public boolean seekExact(BytesRef target) throws

Re: [PR] Use `IndexInput#prefetch` for terms dictionary lookups. [lucene]

2024-06-03 Thread via GitHub
mikemccand commented on code in PR #13359: URL: https://github.com/apache/lucene/pull/13359#discussion_r1624448703 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java: ## @@ -307,6 +309,30 @@ private boolean setEOF() { return true; }

Re: [I] Support for criteria based DWPT selection inside DocumentWriter [lucene]

2024-06-03 Thread via GitHub
mikemccand commented on issue #13387: URL: https://github.com/apache/lucene/issues/13387#issuecomment-2145162839 I like @jpountz's idea of just using separate `IndexWriter`s for this use-case, instead of adding custom routing logic to the separate DWPTs inside a single `IndexWriter` and the

Re: [I] What does the Lucene community think about dimensionality reduction for vectors, and should it be something the library does internally (at merge time perhaps)? [lucene]

2024-06-03 Thread via GitHub
benwtrent commented on issue #13403: URL: https://github.com/apache/lucene/issues/13403#issuecomment-2144974481 > Maybe open a spinoff for this one? https://github.com/apache/lucene/issues/13447

Re: [I] What does the Lucene community think about dimensionality reduction for vectors, and should it be something the library does internally (at merge time perhaps)? [lucene]

2024-06-03 Thread via GitHub
mikemccand commented on issue #13403: URL: https://github.com/apache/lucene/issues/13403#issuecomment-2144941653 > As an aside, this "wait to build the index" thing could also be done for HNSW. Tiny segments with quick flushes probably shouldn't even build HNSW graphs. Instead, they should

Re: [PR] Add prefetching support to stored fields. [lucene]

2024-06-03 Thread via GitHub
jpountz merged PR #13424: URL: https://github.com/apache/lucene/pull/13424

Re: [PR] Removed Scorer#getWeight [lucene]

2024-06-03 Thread via GitHub
jpountz commented on PR #13440: URL: https://github.com/apache/lucene/pull/13440#issuecomment-2144462134 > It's not clear why Solr would care with regards to FunctionValues & Weights in particular. I don't notice Solr using Weights there but maybe I'm not looking in quite the right spot?

Re: [PR] Implement Weight#count for vector values in the FieldExistsQuery [lucene]

2024-06-03 Thread via GitHub
bugmakerr commented on PR #13322: URL: https://github.com/apache/lucene/pull/13322#issuecomment-210261 @jpountz could you please take a look when you get a chance