Re: [PR] Convert more classes to record classes [lucene]

2024-04-29 Thread via GitHub
uschindler commented on code in PR #13328: URL: https://github.com/apache/lucene/pull/13328#discussion_r1584223935 ## lucene/core/src/java/org/apache/lucene/codecs/TermStats.java: ## @@ -16,24 +16,10 @@ */ package org.apache.lucene.codecs; -import org.apache.lucene.index.Te

Re: [PR] Add test for parsing brackets in range queries [lucene]

2024-04-29 Thread via GitHub
dweiss commented on PR #13323: URL: https://github.com/apache/lucene/pull/13323#issuecomment-2084510167 The joys of escaping... Never easy (hello, Windows command-line users...). You may want to add a randomized test that constructs those terms using a mix of characters and "allowed" escape

Re: [PR] Convert more classes to record classes [lucene]

2024-04-29 Thread via GitHub
uschindler commented on code in PR #13328: URL: https://github.com/apache/lucene/pull/13328#discussion_r1584221042 ## lucene/core/src/java/org/apache/lucene/codecs/TermStats.java: ## @@ -16,24 +16,10 @@ */ package org.apache.lucene.codecs; -import org.apache.lucene.index.Te

Re: [PR] Convert more classes to record classes [lucene]

2024-04-29 Thread via GitHub
uschindler commented on code in PR #13328: URL: https://github.com/apache/lucene/pull/13328#discussion_r1584211054 ## lucene/core/src/java/org/apache/lucene/codecs/TermStats.java: ## @@ -16,24 +16,10 @@ */ package org.apache.lucene.codecs; -import org.apache.lucene.index.Te

Re: [PR] Port over gradle setting generator from Solr [lucene]

2024-04-29 Thread via GitHub
dweiss commented on PR #12131: URL: https://github.com/apache/lucene/pull/12131#issuecomment-2084491642 > I'm happy to report that we can put individual settings in ~/.gradle/gradle.properties and these settings will overlay in priority over those in the project's properties. Nice; ehh?

Re: [PR] Fix numDeletesToMerge for unchanged segments [lucene]

2024-04-29 Thread via GitHub
dnhatn commented on code in PR #13324: URL: https://github.com/apache/lucene/pull/13324#discussion_r1584139073 ## lucene/core/src/java/org/apache/lucene/index/IndexWriter.java: ## @@ -6128,7 +6128,7 @@ public final int numDeletesToMerge(SegmentCommitInfo info) throws IOExceptio

Re: [I] Support for building materialized views using Lucene formats [lucene]

2024-04-29 Thread via GitHub
bharath-techie commented on issue #13188: URL: https://github.com/apache/lucene/issues/13188#issuecomment-2084365307 Thanks for the inputs @msokolov . I do see the similarities but the linked issue seems to be tied to rollups done as part of merge aided by index sorting on the dimensions. I

Re: [PR] Convert more classes to record classes [lucene]

2024-04-29 Thread via GitHub
shubhamvishu commented on code in PR #13328: URL: https://github.com/apache/lucene/pull/13328#discussion_r1584056472 ## lucene/core/src/java/org/apache/lucene/codecs/TermStats.java: ## @@ -16,24 +16,10 @@ */ package org.apache.lucene.codecs; -import org.apache.lucene.index.

Re: [PR] Port over gradle setting generator from Solr [lucene]

2024-04-29 Thread via GitHub
dsmiley commented on PR #12131: URL: https://github.com/apache/lucene/pull/12131#issuecomment-2084240048 > if your file is in git, adding it to .gitignore doesn't make local changes ignorable - it's still versioned and git status would show it as modified. True; I had thought .gitigno

[PR] Convert more classes to record classes [lucene]

2024-04-29 Thread via GitHub
shubhamvishu opened a new pull request, #13328: URL: https://github.com/apache/lucene/pull/13328 ### Description - This PR addresses #13207 to convert more classes on the `main` branch to record classes (Lucene 10 only). - It moves a lot of data classes (120 to be precise) to us

Re: [PR] Terminate automaton after matched the whole prefix for PrefixQuery. [lucene]

2024-04-29 Thread via GitHub
github-actions[bot] commented on PR #13072: URL: https://github.com/apache/lucene/pull/13072#issuecomment-2083913815 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contributi

Re: [PR] Fix numDeletesToMerge for unchanged segments [lucene]

2024-04-29 Thread via GitHub
easyice commented on code in PR #13324: URL: https://github.com/apache/lucene/pull/13324#discussion_r1583918672 ## lucene/core/src/java/org/apache/lucene/index/IndexWriter.java: ## @@ -6128,7 +6128,7 @@ public final int numDeletesToMerge(SegmentCommitInfo info) throws IOExcepti

Re: [I] HnwsGraph creates disconnected components [lucene]

2024-04-29 Thread via GitHub
CloudMarc commented on issue #12627: URL: https://github.com/apache/lucene/issues/12627#issuecomment-2083746226 Re "known mitigations" - I simply meant common HNSW/ANN configuration changes like increasing MaxConnections and BeamWidth. Thanks for mentioning precedent for the two-pa

Re: [PR] Add test for parsing brackets in range queries [lucene]

2024-04-29 Thread via GitHub
benchaplin commented on PR #13323: URL: https://github.com/apache/lucene/pull/13323#issuecomment-2083680863 I was thinking a fix like this could work: ```java | ``` simply allowing the range term to continue parsing through an escaped space or closing bracket. "Br

Re: [I] HnwsGraph creates disconnected components [lucene]

2024-04-29 Thread via GitHub
benwtrent commented on issue #12627: URL: https://github.com/apache/lucene/issues/12627#issuecomment-2083624216 @msokolov I am not sure what particular mitigations @CloudMarc is talking about, but I know of a couple of options outlined here: https://github.com/apache/lucene/issues/12627#iss

Re: [I] HnwsGraph creates disconnected components [lucene]

2024-04-29 Thread via GitHub
msokolov commented on issue #12627: URL: https://github.com/apache/lucene/issues/12627#issuecomment-2083543364 I'm not aware of any Lucene-specific issue here. We see this mostly in unit tests, but it has also been replicated with production data. Although one can speculate this might be mo

Re: [I] HnwsGraph creates disconnected components [lucene]

2024-04-29 Thread via GitHub
CloudMarc commented on issue #12627: URL: https://github.com/apache/lucene/issues/12627#issuecomment-2083388627 Is there a clean separation between: 1. HNSW ANN (e.g. clean room implementation) is known to have issues with unreachable nodes for certain configurations. and 2. The L

Re: [I] Indexing all zero vectors leads to heat death of the universe [LUCENE-10590] [lucene]

2024-04-29 Thread via GitHub
msokolov commented on issue #11626: URL: https://github.com/apache/lucene/issues/11626#issuecomment-2083368210 Yay! Thank you, entropy has been defeated -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

Re: [I] Examine performance of individual data accessor methods of MemorySegmentIndexInput [lucene]

2024-04-29 Thread via GitHub
ChrisHegarty commented on issue #13325: URL: https://github.com/apache/lucene/issues/13325#issuecomment-2083195761 Thanks for your quick reply @uschindler. Your analysis makes sense to me. -- This is an automated message from the Apache Git Service. To respond to the message, please log on

Re: [PR] Performance improvements to use read lock to access LRUQueryCache [lucene]

2024-04-29 Thread via GitHub
benwtrent commented on code in PR #13306: URL: https://github.com/apache/lucene/pull/13306#discussion_r1583351253 ## lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java: ## @@ -628,7 +632,7 @@ private class LeafCache implements Accountable { LeafCache(Object

Re: [PR] Performance improvements to use read lock to access LRUQueryCache [lucene]

2024-04-29 Thread via GitHub
benwtrent commented on code in PR #13306: URL: https://github.com/apache/lucene/pull/13306#discussion_r1583344572 ## lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java: ## @@ -265,7 +269,6 @@ boolean requiresEviction() { } CacheAndCount get(Query key, Index

Re: [PR] Fix numDeletesToMerge for unchanged segments [lucene]

2024-04-29 Thread via GitHub
jpountz commented on code in PR #13324: URL: https://github.com/apache/lucene/pull/13324#discussion_r1583328423 ## lucene/core/src/java/org/apache/lucene/index/IndexWriter.java: ## @@ -6128,7 +6128,7 @@ public final int numDeletesToMerge(SegmentCommitInfo info) throws IOExcepti

Re: [I] Indexing all zero vectors leads to heat death of the universe [LUCENE-10590] [lucene]

2024-04-29 Thread via GitHub
benwtrent closed issue #11626: Indexing all zero vectors leads to heat death of the universe [LUCENE-10590] URL: https://github.com/apache/lucene/issues/11626 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

Re: [I] Examine performance of individual data accessor methods of MemorySegmentIndexInput [lucene]

2024-04-29 Thread via GitHub
uschindler commented on issue #13325: URL: https://github.com/apache/lucene/issues/13325#issuecomment-2083062574 So this issue is not in Lucene. The slowdown comes from incorrect use of Lucene. I am waiting for the confirmation by them. -- This is an automated message from the Apache Git

Re: [I] Examine performance of individual data accessor methods of MemorySegmentIndexInput [lucene]

2024-04-29 Thread via GitHub
uschindler commented on issue #13325: URL: https://github.com/apache/lucene/issues/13325#issuecomment-2083061482 The issue is analyzed already: The benchmark is wrong. It closes and opens indexes all the time. Due to the additional safety applied on closing indexes (to not crash the JVM wit

Re: [I] Support for building materialized views using Lucene formats [lucene]

2024-04-29 Thread via GitHub
msokolov commented on issue #13188: URL: https://github.com/apache/lucene/issues/13188#issuecomment-2083019231 This reminded me of an older issue: https://github.com/apache/lucene/issues/11463 that seems to have foundered. Maybe there is something to be learned from that, not sure. -- Th

Re: [I] Examine performance of individual data accessor methods of MemorySegmentIndexInput [lucene]

2024-04-29 Thread via GitHub
uschindler commented on issue #13325: URL: https://github.com/apache/lucene/issues/13325#issuecomment-2083005280 They say that it only happens on high concurrency. In addition: Does the benchmark use stored fields at all for measuring query performance? -- This is an automated message fro

Re: [I] Examine performance of individual data accessor methods of MemorySegmentIndexInput [lucene]

2024-04-29 Thread via GitHub
mikemccand commented on issue #13325: URL: https://github.com/apache/lucene/issues/13325#issuecomment-2082853451 I've also opened https://github.com/mikemccand/luceneutil/issues/267 to understand why our nightly benchmarks didn't notice this. @uschindler maybe you have an idea! -- This

Re: [PR] Reduce memory usage of field maps in FieldInfos and BlockTree TermsReader. [lucene]

2024-04-29 Thread via GitHub
bruno-roustant commented on code in PR #13327: URL: https://github.com/apache/lucene/pull/13327#discussion_r1583136349 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/Lucene90BlockTreeTermsReader.java: ## @@ -113,7 +114,8 @@ public final class Lucene90BlockTr

Re: [PR] Improve int4 compressed comparisons performance [lucene]

2024-04-29 Thread via GitHub
ChrisHegarty commented on code in PR #13321: URL: https://github.com/apache/lucene/pull/13321#discussion_r1582893750 ## lucene/core/src/java21/org/apache/lucene/internal/vectorization/PanamaVectorUtilSupport.java: ## @@ -390,22 +392,202 @@ private int dotProductBody128(byte[] a,

[I] TestHnswBitVectorsFormat.testIndexAndSearchBitVectors fails intermittently [lucene]

2024-04-29 Thread via GitHub
ChrisHegarty opened a new issue, #13326: URL: https://github.com/apache/lucene/issues/13326 ``` $ ./gradlew :lucene:codecs:test --rerun --tests org.apache.lucene.codecs.bitvectors.TestHnswBitVectorsFormat.testIndexAndSearchBitVectors -Dtests.seed=E7CAA0C775B2FE00 ``` ```

Re: [I] Support for building materialized views using Lucene formats [lucene]

2024-04-29 Thread via GitHub
bharath-techie commented on issue #13188: URL: https://github.com/apache/lucene/issues/13188#issuecomment-2082180161 Hi @jpountz , Good question, if we take `StarTreeDataCube` as an example implementation of the above format : We will traverse the `StarTree` and `StarTreeDocValues`

Re: [PR] Terminate automaton after matched the whole prefix for PrefixQuery. [lucene]

2024-04-29 Thread via GitHub
vsop-479 commented on code in PR #13072: URL: https://github.com/apache/lucene/pull/13072#discussion_r1565442746 ## lucene/core/src/java/org/apache/lucene/util/automaton/RunAutomaton.java: ## @@ -96,6 +101,35 @@ protected RunAutomaton(Automaton a, int alphabetSize) { } }