[GitHub] [lucene] rendel commented on issue #11702: Multi-Value Support for Binary DocValues [LUCENE-10666]

2022-11-14 Thread GitBox
rendel commented on issue #11702: URL: https://github.com/apache/lucene/issues/11702#issuecomment-1313258450 > Could you use [OpenSearch's implementation](https://github.com/opensearch-project/OpenSearch/blob/main/server/src/main/java/org/opensearch/index/mapper/BinaryFieldMapper.java#L241)?

[GitHub] [lucene] jpountz commented on pull request #11903: Speed up sorting on unique string fields.

2022-11-14 Thread GitBox
jpountz commented on PR #11903: URL: https://github.com/apache/lucene/pull/11903#issuecomment-1313781961 @mikemccand Merging this PR will require regolding nightly benchmarks. Does it help if you can control when the PR gets merged? -- This is an automated message from the Apache Git Serv

[GitHub] [lucene] jpountz commented on pull request #11917: Automatically preload index files that are both tiny and very hot.

2022-11-14 Thread GitBox
jpountz commented on PR #11917: URL: https://github.com/apache/lucene/pull/11917#issuecomment-1313830132 @uschindler I think I understand Robert's concerns against mlock, but less yours against loading. Do you think that it would be useless to preload if these files can then get paged out?

[GitHub] [lucene] jpountz commented on pull request #912: MR-JAR rewrite of MMapDirectory with JDK-19 preview Panama APIs (>= JDK-19-ea+23)

2022-11-14 Thread GitBox
jpountz commented on PR #912: URL: https://github.com/apache/lucene/pull/912#issuecomment-1313870436 @uschindler FYI it looks like this change made indexing ~12% faster with `BEST_SPEED` on the stored fields benchmark: http://people.apache.org/~mikemccand/lucenebench/stored_fields_benchmark

[GitHub] [lucene] uschindler commented on pull request #11917: Automatically preload index files that are both tiny and very hot.

2022-11-14 Thread GitBox
uschindler commented on PR #11917: URL: https://github.com/apache/lucene/pull/11917#issuecomment-1313875966 > @uschindler I think I understand Robert's concerns against mlock, but less yours against loading. Do you think that it would be useless to preload if these files can then get paged

[GitHub] [lucene] jpountz commented on pull request #11917: Automatically preload index files that are both tiny and very hot.

2022-11-14 Thread GitBox
jpountz commented on PR #11917: URL: https://github.com/apache/lucene/pull/11917#issuecomment-1313881947 > If you agree that all files marked with IOContext.LOAD are also candidates for mlock2, then I am fine. I think it would apply to the same files indeed. Let me update the P

[GitHub] [lucene] uschindler commented on pull request #11917: Automatically preload index files that are both tiny and very hot.

2022-11-14 Thread GitBox
uschindler commented on PR #11917: URL: https://github.com/apache/lucene/pull/11917#issuecomment-1313886416 I think the predicate would also simplify the current Elasticserach code which instantiates 2 MMapDirectory instances with the same underlying dir and uses FileSwitchDir in front. A p

[GitHub] [lucene] jpountz commented on pull request #11917: Automatically preload index files that are both tiny and very hot.

2022-11-14 Thread GitBox
jpountz commented on PR #11917: URL: https://github.com/apache/lucene/pull/11917#issuecomment-1313887463 Agreed, this would be a welcome simplification. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [lucene] uschindler commented on pull request #11917: Automatically preload index files that are both tiny and very hot.

2022-11-14 Thread GitBox
uschindler commented on PR #11917: URL: https://github.com/apache/lucene/pull/11917#issuecomment-1313888642 we can keep and deprecate the old boolean setter that delegates to 2 predefined "alaways on", "always off" predicates. -- This is an automated message from the Apache Git Service. T

[GitHub] [lucene] jpountz commented on pull request #11917: Automatically preload index files that are both tiny and very hot.

2022-11-14 Thread GitBox
jpountz commented on PR #11917: URL: https://github.com/apache/lucene/pull/11917#issuecomment-1313905183 @uschindler Something like that? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the speci

[GitHub] [lucene] jpountz commented on issue #11914: Remove QueryTimeout#isTimeoutEnabled?

2022-11-14 Thread GitBox
jpountz commented on issue #11914: URL: https://github.com/apache/lucene/issues/11914#issuecomment-1313907213 @shubhamvishu Yes, my thinking was to push the responsibility to the caller and only wrap with an `ExitableDirectoryReader` if there is actually a timeout. -- This is an automated

[GitHub] [lucene] uschindler commented on pull request #11917: Automatically preload index files that are both tiny and very hot.

2022-11-14 Thread GitBox
uschindler commented on PR #11917: URL: https://github.com/apache/lucene/pull/11917#issuecomment-1313914337 Looks fine. I would add 3 predefined constants: ```java public static final BiPredicate ALL_FILES = (file, context) -> true; public static final BiPredicate NO_FILES = (fi

[GitHub] [lucene] uschindler commented on pull request #11917: Automatically preload index files that are both tiny and very hot.

2022-11-14 Thread GitBox
uschindler commented on PR #11917: URL: https://github.com/apache/lucene/pull/11917#issuecomment-1313922578 Unrelated problem with Github CI: > org.apache.lucene.index.TestIndexFileDeleter.testExcInDecRef this test failed for me yesterday in the same way. Seems to be a bug. --

[GitHub] [lucene] jpountz commented on pull request #11888: [Fix] Binary search the entries when all suffixes have the same length in a leaf block.

2022-11-14 Thread GitBox
jpountz commented on PR #11888: URL: https://github.com/apache/lucene/pull/11888#issuecomment-1313929513 Thanks for looking into a test for this. @mikemccand Would you have time to have a look at this? I knew it was complex, but even more than I had anticipated, so I would appreciate

[GitHub] [lucene] uschindler commented on pull request #912: MR-JAR rewrite of MMapDirectory with JDK-19 preview Panama APIs (>= JDK-19-ea+23)

2022-11-14 Thread GitBox
uschindler commented on PR #912: URL: https://github.com/apache/lucene/pull/912#issuecomment-1313949994 > @uschindler FYI it looks like this change made indexing ~12% faster with `BEST_SPEED` on the stored fields benchmark: http://people.apache.org/~mikemccand/lucenebench/stored_fields_benc

[GitHub] [lucene] jpountz commented on pull request #912: MR-JAR rewrite of MMapDirectory with JDK-19 preview Panama APIs (>= JDK-19-ea+23)

2022-11-14 Thread GitBox
jpountz commented on PR #912: URL: https://github.com/apache/lucene/pull/912#issuecomment-1313963721 I haven't dug, but your reasoning could explain why we are seeing a similar speedup across both `BEST_SPEED` and `BEST_COMPRESSION` since they benefit from the bulk merging optimization in a

[GitHub] [lucene] uschindler commented on pull request #11917: Automatically preload index files that are both tiny and very hot.

2022-11-14 Thread GitBox
uschindler commented on PR #11917: URL: https://github.com/apache/lucene/pull/11917#issuecomment-1313965347 looks great. I don't like the `PRELOAD_` prefix on the constants, because I plan to use the same for mlock2. I was already thinking about this, but came to no good names. So my propos

[GitHub] [lucene] jpountz commented on pull request #11917: Automatically preload index files that are both tiny and very hot.

2022-11-14 Thread GitBox
jpountz commented on PR #11917: URL: https://github.com/apache/lucene/pull/11917#issuecomment-1313968557 OK. I wanted to use names that give a sense of when these general `BiPredicate` objects are useful without looking up javadocs, but your point about reusing for mlock makes sense. --

[GitHub] [lucene] uschindler commented on a diff in pull request #11917: Automatically preload index files that are both tiny and very hot.

2022-11-14 Thread GitBox
uschindler commented on code in PR #11917: URL: https://github.com/apache/lucene/pull/11917#discussion_r1021728743 ## lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java: ## @@ -85,25 +85,23 @@ public class MMapDirectory extends FSDirectory { * Argument for {@lin

[GitHub] [lucene] uschindler commented on pull request #11917: Automatically preload index files that are both tiny and very hot.

2022-11-14 Thread GitBox
uschindler commented on PR #11917: URL: https://github.com/apache/lucene/pull/11917#issuecomment-1313978152 Looks good to me, i think maybe say for the constant that it uses LOAD context. -- This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [lucene] uschindler commented on pull request #11917: Automatically preload index files that are both tiny and very hot.

2022-11-14 Thread GitBox
uschindler commented on PR #11917: URL: https://github.com/apache/lucene/pull/11917#issuecomment-1313978716 ...namining is hard... -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific com

[GitHub] [lucene] uschindler commented on pull request #11917: Automatically preload index files that are both tiny and very hot.

2022-11-14 Thread GitBox
uschindler commented on PR #11917: URL: https://github.com/apache/lucene/pull/11917#issuecomment-1313979769 I will open an issue in Solr to adapt their code. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL abov

[GitHub] [lucene] rmuir commented on pull request #11917: Automatically preload index files that are both tiny and very hot.

2022-11-14 Thread GitBox
rmuir commented on PR #11917: URL: https://github.com/apache/lucene/pull/11917#issuecomment-1313991795 > @uschindler I think I understand Robert's concerns against mlock, but less yours against loading. Do you think that it would be useless to preload if these files can then get paged out?

[GitHub] [lucene] uschindler commented on pull request #11917: Automatically preload index files that are both tiny and very hot.

2022-11-14 Thread GitBox
uschindler commented on PR #11917: URL: https://github.com/apache/lucene/pull/11917#issuecomment-1314009331 I don't care about the new behaviour, as I still ave the option to configure it. I agree with Robert, if the files should stay on heap load them to heap. What I like most on thi

[GitHub] [lucene] uschindler commented on pull request #912: MR-JAR rewrite of MMapDirectory with JDK-19 preview Panama APIs (>= JDK-19-ea+23)

2022-11-14 Thread GitBox
uschindler commented on PR #912: URL: https://github.com/apache/lucene/pull/912#issuecomment-1314033529 It does not fully explain it, because actually the copy of memory between file and heap is both using `Unsafe#copyMemory`. The only difference is that for ByteBuffer there is some hardcod

[GitHub] [lucene] jpountz commented on pull request #11917: Automatically preload index files that are both tiny and very hot.

2022-11-14 Thread GitBox
jpountz commented on PR #11917: URL: https://github.com/apache/lucene/pull/11917#issuecomment-1314053441 I reverted the bits that would load some files in physical memory by default. These files don't have to be in memory and I actually like that these files are not in the heap because it m

[GitHub] [lucene] rmuir commented on pull request #912: MR-JAR rewrite of MMapDirectory with JDK-19 preview Panama APIs (>= JDK-19-ea+23)

2022-11-14 Thread GitBox
rmuir commented on PR #912: URL: https://github.com/apache/lucene/pull/912#issuecomment-1314053561 one issue with "fancy" copies (e.g. for bulk merging) is that we can't make them super-fancy anyway (e.g. zero-copy from fd to fd) as we still have to calculate CRC32 checksum. -- T

[GitHub] [lucene] rmuir commented on pull request #11917: Automatically preload index files that are both tiny and very hot.

2022-11-14 Thread GitBox
rmuir commented on PR #11917: URL: https://github.com/apache/lucene/pull/11917#issuecomment-1314072987 And 1.75GB is the same amount if you want to mlock the pages. Which no operating system is gonna let you do by default, on linux it will require modifying system soft/hard limits. I

[GitHub] [lucene] rmuir commented on pull request #11917: Automatically preload index files that are both tiny and very hot.

2022-11-14 Thread GitBox
rmuir commented on PR #11917: URL: https://github.com/apache/lucene/pull/11917#issuecomment-1314077245 also wanna note my concern around any use of mlock, that, users may not know how to do the right thing and be lazy and just run as root. this is sadly the stuff that users do, i'd hate to

[GitHub] [lucene] gsmiller opened a new pull request, #11928: GH#11922: Allow DisjunctionDISIApproximation to short-circuit

2022-11-14 Thread GitBox
gsmiller opened a new pull request, #11928: URL: https://github.com/apache/lucene/pull/11928 ### Description This PR explores allowing `DisjunctionDISIApproximation` to short-circuit after "finding" a doc, allowing sub-iterators to only be exhaustively advanced when necessary. This c

[GitHub] [lucene] gsmiller commented on a diff in pull request #11928: GH#11922: Allow DisjunctionDISIApproximation to short-circuit

2022-11-14 Thread GitBox
gsmiller commented on code in PR #11928: URL: https://github.com/apache/lucene/pull/11928#discussion_r1021831743 ## lucene/MIGRATE.md: ## @@ -102,6 +102,12 @@ Lucene 9.2 or stay with 9.0. See LUCENE-10558 for more details and workarounds. +### DisjunctionDISIApproximation b

[GitHub] [lucene] risdenk closed pull request #11921: errorprone WIP

2022-11-14 Thread GitBox
risdenk closed pull request #11921: errorprone WIP URL: https://github.com/apache/lucene/pull/11921 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issu

[GitHub] [lucene] gsmiller commented on issue #11922: DisjunctionDISIApproximation / DisjunctionScorer can short-circuit

2022-11-14 Thread GitBox
gsmiller commented on issue #11922: URL: https://github.com/apache/lucene/issues/11922#issuecomment-1314092165 I put up an initial PR of this idea here #11928. It can likely be further optimized for the two-phase case, but I wanted to start simple and see if there are any fundamental flaws

[GitHub] [lucene] uschindler commented on a diff in pull request #11917: Automatically preload index files that are both tiny and very hot.

2022-11-14 Thread GitBox
uschindler commented on code in PR #11917: URL: https://github.com/apache/lucene/pull/11917#discussion_r1021839467 ## lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java: ## @@ -93,15 +93,8 @@ public class MMapDirectory extends FSDirectory { */ public static f

[GitHub] [lucene] uschindler commented on pull request #11917: Automatically preload index files that are both tiny and very hot.

2022-11-14 Thread GitBox
uschindler commented on PR #11917: URL: https://github.com/apache/lucene/pull/11917#issuecomment-1314111220 BTW, sorry that I mentioned mlock2... oh man oh man. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL a

[GitHub] [lucene] uschindler commented on a diff in pull request #11917: Automatically preload index files that are both tiny and very hot.

2022-11-14 Thread GitBox
uschindler commented on code in PR #11917: URL: https://github.com/apache/lucene/pull/11917#discussion_r1021849653 ## lucene/core/src/java/org/apache/lucene/store/IOContext.java: ## @@ -45,8 +45,8 @@ public enum Context { public final boolean readOnce; /** - * This fla

[GitHub] [lucene] uschindler commented on a diff in pull request #11917: Automatically preload index files that are both tiny and very hot.

2022-11-14 Thread GitBox
uschindler commented on code in PR #11917: URL: https://github.com/apache/lucene/pull/11917#discussion_r1021850600 ## lucene/core/src/java/org/apache/lucene/store/IOContext.java: ## @@ -45,8 +45,8 @@ public enum Context { public final boolean readOnce; /** - * This fla

[GitHub] [lucene] uschindler commented on pull request #11917: Automatically preload index files that are both tiny and very hot.

2022-11-14 Thread GitBox
uschindler commented on PR #11917: URL: https://github.com/apache/lucene/pull/11917#issuecomment-1314119766 Maybe we should start again here: - sorry for mentioning mlock2, I will think of a way in the future to hook into the MemorySegment and configure it in native ways (e.g. set madvise

[GitHub] [lucene] uschindler commented on pull request #912: MR-JAR rewrite of MMapDirectory with JDK-19 preview Panama APIs (>= JDK-19-ea+23)

2022-11-14 Thread GitBox
uschindler commented on PR #912: URL: https://github.com/apache/lucene/pull/912#issuecomment-1314122540 Yes you're right! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

[GitHub] [lucene] rmuir commented on pull request #11927: pass jvm args to javac #11925

2022-11-14 Thread GitBox
rmuir commented on PR #11927: URL: https://github.com/apache/lucene/pull/11927#issuecomment-1314122689 @dweiss I'm seeing the same behavior when i run: `/gradlew assemble -Dvalidation.errorprone=true`. I fear error-prone is somehow "stomping" on the jvm arguments. The patch does work in the

[GitHub] [lucene] uschindler commented on pull request #912: MR-JAR rewrite of MMapDirectory with JDK-19 preview Panama APIs (>= JDK-19-ea+23)

2022-11-14 Thread GitBox
uschindler commented on PR #912: URL: https://github.com/apache/lucene/pull/912#issuecomment-1314125432 Maybe at some point JDK allows to call `Checksum#update(MemorySegment,...)`. Then we don't need to copy multiple times, but the checksum will be calculated using the off-heap slice. --

[GitHub] [lucene] rmuir commented on pull request #11917: Automatically preload index files that are both tiny and very hot.

2022-11-14 Thread GitBox
rmuir commented on PR #11917: URL: https://github.com/apache/lucene/pull/11917#issuecomment-1314126519 Well it was good to mention mlock as that "makes more sense" than just "preload" (which has a confusing name IMO). "preload" just faults it in but if there is memory pressure it can then j

[GitHub] [lucene] uschindler commented on pull request #11917: Automatically preload index files that are both tiny and very hot.

2022-11-14 Thread GitBox
uschindler commented on PR #11917: URL: https://github.com/apache/lucene/pull/11917#issuecomment-1314131627 I think to get a "real sticky preload to heap", we would need to add a NRTCachingDirectory-like wrapper on top of any Directory that copies the file into ByteBuffersIndexInput. --

[GitHub] [lucene] jpountz commented on pull request #11917: Automatically preload index files that are both tiny and very hot.

2022-11-14 Thread GitBox
jpountz commented on PR #11917: URL: https://github.com/apache/lucene/pull/11917#issuecomment-1314152039 Agreed with your suggestion to reset and split into two PRs @uschindler. I'll close this one and open two new ones. -- This is an automated message from the Apache Git Service. To resp

[GitHub] [lucene] jpountz closed pull request #11917: Automatically preload index files that are both tiny and very hot.

2022-11-14 Thread GitBox
jpountz closed pull request #11917: Automatically preload index files that are both tiny and very hot. URL: https://github.com/apache/lucene/pull/11917 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go t

[GitHub] [lucene] dweiss commented on pull request #11927: pass jvm args to javac #11925

2022-11-14 Thread GitBox
dweiss commented on PR #11927: URL: https://github.com/apache/lucene/pull/11927#issuecomment-1314153938 Damn. Let me look into it. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific com

[GitHub] [lucene] jpountz opened a new pull request, #11929: More granular control of preloading on MMapDirectory.

2022-11-14 Thread GitBox
jpountz opened a new pull request, #11929: URL: https://github.com/apache/lucene/pull/11929 This enables configuring preloading on `MMapDirectory` based on the file name as well as the `IOContext` that is used to open the file. -- This is an automated message from the Apache Git Servi

[GitHub] [lucene] rmuir commented on pull request #11927: pass jvm args to javac #11925

2022-11-14 Thread GitBox
rmuir commented on PR #11927: URL: https://github.com/apache/lucene/pull/11927#issuecomment-1314176062 i tried looking at the kotlin code but didnt see any obvious way to sneak args in there. Also not really fluent in kotlin lol -- This is an automated message from the Apache Git Service.

[GitHub] [lucene] jpountz opened a new pull request, #11930: Introduce IOContext.LOAD

2022-11-14 Thread GitBox
jpountz opened a new pull request, #11930: URL: https://github.com/apache/lucene/pull/11930 This PR introduces a new `LOAD` `IOContext` for files that are a small fraction of the total index size and are expected to be accessed in a random-access fashion. This may be leveraged by some `Dire

[GitHub] [lucene] jpountz commented on pull request #11917: Automatically preload index files that are both tiny and very hot.

2022-11-14 Thread GitBox
jpountz commented on PR #11917: URL: https://github.com/apache/lucene/pull/11917#issuecomment-1314178816 This got split into #11929 and #11930. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [lucene] dweiss commented on pull request #11927: pass jvm args to javac #11925

2022-11-14 Thread GitBox
dweiss commented on PR #11927: URL: https://github.com/apache/lucene/pull/11927#issuecomment-1314200220 I didn't look at kotlin code, since I'm not fluent there at all... :) But I had a hunch to just upgrade the plugin thinking it's probably a bug there... and it seems it has been fixed. I

[GitHub] [lucene] rmuir commented on pull request #11927: pass jvm args to javac #11925

2022-11-14 Thread GitBox
rmuir commented on PR #11927: URL: https://github.com/apache/lucene/pull/11927#issuecomment-1314202127 definitely seems to be working as it is spamming my console :) Thank you! I'll test do some simple benchmarks vs main with the `task.times`, error-prone on and off, and see where we'

[GitHub] [lucene] dweiss opened a new pull request, #11931: Update github actions to v3.

2022-11-14 Thread GitBox
dweiss opened a new pull request, #11931: URL: https://github.com/apache/lucene/pull/11931 Updates github actions to v3 to avoid warnings when running actions on PRs. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[GitHub] [lucene] dweiss commented on pull request #11927: pass jvm args to javac #11925

2022-11-14 Thread GitBox
dweiss commented on PR #11927: URL: https://github.com/apache/lucene/pull/11927#issuecomment-1314209793 Please feel free to hack anything on my branch or cherry pick what you need - this was just a poc experiment. -- This is an automated message from the Apache Git Service. To respond to

[GitHub] [lucene] rmuir commented on pull request #11927: pass jvm args to javac #11925

2022-11-14 Thread GitBox
rmuir commented on PR #11927: URL: https://github.com/apache/lucene/pull/11927#issuecomment-1314220207 Thanks, this definitely got me past my stumbling block with error-prone. I will do more tests and try to get it ready. as a first benchmark i tried './gradlew clean && ./gradlew chec

[GitHub] [lucene] dweiss merged pull request #11931: Update github actions to v3.

2022-11-14 Thread GitBox
dweiss merged PR #11931: URL: https://github.com/apache/lucene/pull/11931 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apac

[GitHub] [lucene] uschindler commented on a diff in pull request #11930: Introduce IOContext.LOAD

2022-11-14 Thread GitBox
uschindler commented on code in PR #11930: URL: https://github.com/apache/lucene/pull/11930#discussion_r1021939686 ## lucene/core/src/java/org/apache/lucene/store/IOContext.java: ## @@ -41,34 +41,47 @@ public enum Context { public final FlushInfo flushInfo; + /** This fl

[GitHub] [lucene] benwtrent commented on pull request #11860: GITHUB-11830 Better optimize storage for vector connections

2022-11-14 Thread GitBox
benwtrent commented on PR #11860: URL: https://github.com/apache/lucene/pull/11860#issuecomment-1314379458 I am digging back into this. - Vectors that are similar do not have similar doc_ids. To ensure this, some clustering would have to occur. This implies a different graph structur

[GitHub] [lucene] rmuir commented on pull request #11927: pass jvm args to javac #11925

2022-11-14 Thread GitBox
rmuir commented on PR #11927: URL: https://github.com/apache/lucene/pull/11927#issuecomment-1314381799 OK i sped up the non-errorprone case too. It is best not to fork at all, forking is not wanted or helpful (but when it happens, we want reasonable options set). Instead, tuning options sho

[GitHub] [lucene] mmatela commented on issue #11864: ArrayIndexOutOfBoundException

2022-11-14 Thread GitBox
mmatela commented on issue #11864: URL: https://github.com/apache/lucene/issues/11864#issuecomment-1314382093 Really doesn't look like a problem in SOLR. `QueryBuilder.analyzeGraphBoolean()` creates a `GraphTokenStreamFiniteStrings` (also Lucene class), retrieves a terms list from it and do

[GitHub] [lucene] rmuir commented on pull request #11927: pass jvm args to javac #11925

2022-11-14 Thread GitBox
rmuir commented on PR #11927: URL: https://github.com/apache/lucene/pull/11927#issuecomment-1314387568 note, if you want to try this branch out yourself, you need to be sure to `clear/regenerate` gradle.properties after switching branches to see the impact. -- This is an automated message

[GitHub] [lucene] rmuir commented on pull request #11927: pass jvm args to javac #11925

2022-11-14 Thread GitBox
rmuir commented on PR #11927: URL: https://github.com/apache/lucene/pull/11927#issuecomment-1314397587 I added what was missing to fix #11924, too. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [lucene] rmuir commented on pull request #11927: pass jvm args to javac #11925

2022-11-14 Thread GitBox
rmuir commented on PR #11927: URL: https://github.com/apache/lucene/pull/11927#issuecomment-1314400528 I'd like to reduce the 3GB heap of gradle we set by default too. But let's just make that another issue/discussion as maybe it is more controversial. I've been overriding the 3GB gradle he

[GitHub] [lucene] mmatela commented on issue #11864: ArrayIndexOutOfBoundException

2022-11-14 Thread GitBox
mmatela commented on issue #11864: URL: https://github.com/apache/lucene/issues/11864#issuecomment-1314436677 In my case the exception occurred with filters `WordDelimiterGraphFilter > FlattenGraphFilter > StopFilter`, but it started to work when `StopFilter` was moved before `FlattenGraph`

[GitHub] [lucene] rmuir commented on pull request #11860: GITHUB-11830 Better optimize storage for vector connections

2022-11-14 Thread GitBox
rmuir commented on PR #11860: URL: https://github.com/apache/lucene/pull/11860#issuecomment-1314464728 > The neighbors for a given vector have effectively random doc_ids from [0, MAX_DOC) > We read ALL neighbors at once when exploring So let's sort and delta-encode this list before

[GitHub] [lucene] rmuir commented on pull request #11860: GITHUB-11830 Better optimize storage for vector connections

2022-11-14 Thread GitBox
rmuir commented on PR #11860: URL: https://github.com/apache/lucene/pull/11860#issuecomment-1314470208 @benwtrent I see what you are saying, but if you add delta-encoding, I feel like PFOR exception mechanism could work. Then you can patch a couple big gaps for docids but keep overall bpv l

[GitHub] [lucene] rmuir commented on pull request #11923: enable error-prone "narrow calculation" check

2022-11-14 Thread GitBox
rmuir commented on PR #11923: URL: https://github.com/apache/lucene/pull/11923#issuecomment-1314486480 > just some question about cases like `merge.estimatedMergeBytes / 1024 / 1024.` changed to `merge.estimatedMergeBytes / 1024. / 1024.`: > Were they also found by this check or did you j

[GitHub] [lucene] dweiss commented on pull request #11927: speed up javac and error-prone tasks by using less resources

2022-11-14 Thread GitBox
dweiss commented on PR #11927: URL: https://github.com/apache/lucene/pull/11927#issuecomment-1314869284 > It is best not to fork at all, forking is not wanted or helpful Forking javac is needed under a few circumstances - when you want to use a different version javac (which we supp