Re: [I] surpriseMePolygon and createRegularPolygon in test util class returns invalid polygon [lucene]

2023-11-03 Thread via GitHub
stefanvodita commented on issue #12596: URL: https://github.com/apache/lucene/issues/12596#issuecomment-1793273774 What happens with `createRegularPolygon` is very interesting. The smaller the polygon's radius is and the more vertices it has, the smaller the sides are. At some point they ge
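
The relationship described in the comment above can be made concrete. For a regular polygon with `n` vertices inscribed in a circle of radius `r`, each side has length `2 * r * sin(pi / n)`, so the sides shrink as the radius decreases or the vertex count grows. A minimal sketch, assuming nothing about the actual test utility (the class and method names here are illustrative only):

```java
public final class RegularPolygonSides {

  /** Side length of a regular polygon with the given vertex count, inscribed in a circle. */
  public static double sideLength(double radius, int vertices) {
    if (vertices < 3) {
      throw new IllegalArgumentException("a polygon needs at least 3 vertices");
    }
    return 2.0 * radius * Math.sin(Math.PI / vertices);
  }

  public static void main(String[] args) {
    // Same radius, more vertices: sides get shorter.
    System.out.println(sideLength(1.0, 6));
    System.out.println(sideLength(1.0, 600));
    // Same vertex count, smaller radius: sides get shorter too.
    System.out.println(sideLength(1e-3, 600));
  }
}
```
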

Re: [PR] Use value-based LRU cache in NodeHash [lucene]

2023-11-03 Thread via GitHub
dungba88 commented on code in PR #12738: URL: https://github.com/apache/lucene/pull/12738#discussion_r1382304599 ## lucene/core/src/java/org/apache/lucene/util/fst/NodeHash.java: ## @@ -186,149 +218,228 @@ private long hash(FSTCompiler.UnCompiledNode node) { return h; }

Re: [I] Should we not enlarge PagedGrowableWriter initial bitPerValue on NodeHash.rehash()? [lucene]

2023-11-03 Thread via GitHub
dungba88 commented on issue #12744: URL: https://github.com/apache/lucene/issues/12744#issuecomment-1793276663 I think this should be enhancement instead of bug, but I can't edit it. @mikemccand can you help to change the label? -- This is an automated message from the Apache Git Service.

Re: [PR] Speed up vectorutil float scalar methods, unroll properly, use fma where possible [lucene]

2023-11-03 Thread via GitHub
rmuir commented on PR #12737: URL: https://github.com/apache/lucene/pull/12737#issuecomment-1793362865 I tweaked the FMA logic for AMD cpus, to only avoid the high-latency scalar FMA where necessary. Should appease germans to get that extra ulp or whatever. sysprops default to "auto"
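
For readers unfamiliar with the trade-off discussed in this thread: a fused multiply-add computes `a * b + c` with a single rounding, which is where the extra precision comes from, but its scalar form can be slow on some CPUs, hence a setting that defaults to "auto". The following is only a sketch under stated assumptions. The property name `example.useScalarFMA` and both method names are hypothetical and are not Lucene's actual sysprops or code; only `Math.fma` is a real JDK method (Java 9+).

```java
public final class FmaDotProduct {

  // Hypothetical property name, for illustration only.
  static final String PROP = "example.useScalarFMA";

  /** Resolves a tri-state setting: "auto" (or unset) defers to detection, otherwise true/false. */
  public static boolean useFma(String value, boolean detectedFast) {
    if (value == null || value.equals("auto")) {
      return detectedFast;
    }
    return Boolean.parseBoolean(value);
  }

  /** Scalar dot product, with or without fused multiply-add. */
  public static float dot(float[] a, float[] b, boolean fma) {
    if (a.length != b.length) {
      throw new IllegalArgumentException("vector dimensions differ");
    }
    float acc = 0f;
    for (int i = 0; i < a.length; i++) {
      // Math.fma rounds once; the plain form rounds after the multiply and after the add.
      acc = fma ? Math.fma(a[i], b[i], acc) : a[i] * b[i] + acc;
    }
    return acc;
  }

  public static void main(String[] args) {
    boolean fma = useFma(System.getProperty(PROP), false);
    System.out.println(dot(new float[] {1, 2, 3}, new float[] {4, 5, 6}, fma));
  }
}
```
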

Re: [PR] Make FST fully read-only and streamline FST constructor [lucene]

2023-11-04 Thread via GitHub
dungba88 closed pull request #12691: Make FST fully read-only and streamline FST constructor URL: https://github.com/apache/lucene/pull/12691

[PR] Streamline FST constructors and make it fully read-only [lucene]

2023-11-04 Thread via GitHub
dungba88 opened a new pull request, #12758: URL: https://github.com/apache/lucene/pull/12758 ### Description - Streamline FST constructors by grouping the metadata attributes into FSTMetadata - Make it fully read-only by moving `finish()` and `setEmptyOutput` to FSTCompiler E

Re: [PR] Streamline FST constructors and make it fully read-only [lucene]

2023-11-04 Thread via GitHub
dungba88 commented on code in PR #12758: URL: https://github.com/apache/lucene/pull/12758#discussion_r1382359394 ## lucene/core/src/java/org/apache/lucene/util/fst/FST.java: ## @@ -411,17 +401,42 @@ public FST(DataInput metaIn, DataInput in, Outputs outputs) throws IOExceptio

Re: [PR] Speed up vectorutil float scalar methods, unroll properly, use fma where possible [lucene]

2023-11-04 Thread via GitHub
mikemccand commented on PR #12737: URL: https://github.com/apache/lucene/pull/12737#issuecomment-1793399807 Thank you @rmuir for doing all the crazy hard work to decode the actual capabilities of the bare metal hiding underneath the layers of abstraction under Panama Vector API @rmuir! I l

Re: [PR] Speed up vectorutil float scalar methods, unroll properly, use fma where possible [lucene]

2023-11-04 Thread via GitHub
uschindler commented on PR #12737: URL: https://github.com/apache/lucene/pull/12737#issuecomment-1793421197 @rmuir: It would be nice if you could merge this long PR with Github UI and squash it - thanks. I can do it for you if you like.

Re: [PR] Speed up vectorutil float scalar methods, unroll properly, use fma where possible [lucene]

2023-11-04 Thread via GitHub
mikemccand commented on PR #12737: URL: https://github.com/apache/lucene/pull/12737#issuecomment-1793421290 I tested on my now-ancient Zen2 beast3 (nightly benchmark) box (`AMD Ryzen Threadripper 3990X 64-Core Processor`), using JDK 21 (`openjdk full version "21+35"`), with command-line `./

Re: [PR] Use value-based LRU cache in NodeHash [lucene]

2023-11-04 Thread via GitHub
mikemccand commented on code in PR #12738: URL: https://github.com/apache/lucene/pull/12738#discussion_r1382378735 ## lucene/core/src/java/org/apache/lucene/util/fst/ByteBlockPoolReverseBytesReader.java: ## @@ -0,0 +1,69 @@ +/* + * Licensed to the Apache Software Foundation (ASF

[I] Remove `FST.BytesReader#reversed` method? [lucene]

2023-11-04 Thread via GitHub
mikemccand opened a new issue, #12759: URL: https://github.com/apache/lucene/issues/12759 ### Description Spinoff from #12738: this method seems to be dead/pointless code now?

Re: [PR] Use value-based LRU cache in NodeHash [lucene]

2023-11-04 Thread via GitHub
mikemccand commented on code in PR #12738: URL: https://github.com/apache/lucene/pull/12738#discussion_r1382379017 ## lucene/core/src/java/org/apache/lucene/util/ByteBlockPool.java: ## @@ -38,6 +38,8 @@ public final class ByteBlockPool implements Accountable { /** Abstract

Re: [PR] Use value-based LRU cache in NodeHash [lucene]

2023-11-04 Thread via GitHub
mikemccand commented on code in PR #12738: URL: https://github.com/apache/lucene/pull/12738#discussion_r1382379237 ## lucene/core/src/test/org/apache/lucene/util/TestByteBlockPool.java: ## @@ -91,6 +92,10 @@ public void testLargeRandomBlocks() throws IOException { random(

Re: [PR] Speed up vectorutil float scalar methods, unroll properly, use fma where possible [lucene]

2023-11-04 Thread via GitHub
uschindler commented on PR #12737: URL: https://github.com/apache/lucene/pull/12737#issuecomment-1793426843 > I tested on my now-ancient Zen2 beast3 (nightly benchmark) box (`AMD Ryzen Threadripper 3990X 64-Core Processor`), using JDK 21 (`openjdk full version "21+35"`), with command-line `

Re: [PR] Speed up vectorutil float scalar methods, unroll properly, use fma where possible [lucene]

2023-11-04 Thread via GitHub
rmuir commented on PR #12737: URL: https://github.com/apache/lucene/pull/12737#issuecomment-1793426911 > @rmuir: It would be nice if you could follow the community standard and merge this long PR with Github UI and squash it - thanks. I can do it for you if you like. I am not done he

Re: [PR] Streamline FST constructors and make it fully read-only [lucene]

2023-11-04 Thread via GitHub
dungba88 commented on code in PR #12758: URL: https://github.com/apache/lucene/pull/12758#discussion_r1382383218 ## lucene/core/src/java/org/apache/lucene/util/fst/FST.java: ## @@ -411,17 +401,42 @@ public FST(DataInput metaIn, DataInput in, Outputs outputs) throws IOExceptio

[I] Improve bytes copy in NodeHash [lucene]

2023-11-04 Thread via GitHub
dungba88 opened a new issue, #12760: URL: https://github.com/apache/lucene/issues/12760 ### Description Spinoff from https://github.com/apache/lucene/pull/12738: there are 2 TODOs about reducing byte copies when copying from FST and when promoting from the fallback table.

Re: [PR] Use value-based LRU cache in NodeHash [lucene]

2023-11-04 Thread via GitHub
dungba88 commented on code in PR #12738: URL: https://github.com/apache/lucene/pull/12738#discussion_r1382384038 ## lucene/core/src/java/org/apache/lucene/util/ByteBlockPool.java: ## @@ -38,6 +38,8 @@ public final class ByteBlockPool implements Accountable { /** Abstract cl

Re: [PR] TestIndexWriterOnVMError.testUnknownError times out (fixes potential IW deadlock on tragic exceptions). [lucene]

2023-11-04 Thread via GitHub
s1monw commented on code in PR #12751: URL: https://github.com/apache/lucene/pull/12751#discussion_r1382391250 ## lucene/core/src/java/org/apache/lucene/index/IndexWriter.java: ## @@ -2560,10 +2560,15 @@ private void rollbackInternalNoCommit() throws IOException {

Re: [I] Add Scalar Quantization codec for Vectors [lucene]

2023-11-04 Thread via GitHub
benwtrent commented on issue #12497: URL: https://github.com/apache/lucene/issues/12497#issuecomment-1793446854 I have done a poor job of linking against the related work for bringing vector lossy-compression. The initial implementation of adding int8 (really, it's int7 because of sig

Re: [PR] Speed up vectorutil float scalar methods, unroll properly, use fma where possible [lucene]

2023-11-04 Thread via GitHub
rmuir commented on PR #12737: URL: https://github.com/apache/lucene/pull/12737#issuecomment-1793456342 Benchmarks for the intel cpus. There is one place i'd fix, if we could detect sapphire rapids and avoid scalar FMA. But i have no way to detect it based on what new features it has / what

[PR] Remove usage of deprecated java.util.Locale constructor [lucene]

2023-11-04 Thread via GitHub
ChrisHegarty opened a new pull request, #12761: URL: https://github.com/apache/lucene/pull/12761 This commit replaces usages of the deprecated `java.util.Locale` constructor with `Locale.Builder`. The motivation for this change is to address tech debt identified when experimenting wit
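
As a rough illustration of the kind of replacement this PR describes (a sketch, not the PR's actual diff): the `Locale` constructors are deprecated in recent JDKs, and `Locale.Builder` is one of the supported alternatives. Unlike the constructors, the builder validates its input and throws `IllformedLocaleException` for ill-formed values. The class and helper names below are illustrative only.

```java
import java.util.Locale;

public final class LocaleBuilderExample {

  /** Builds a Locale from a language and region without the deprecated constructor. */
  public static Locale of(String language, String region) {
    // Before: new Locale(language, region)
    return new Locale.Builder().setLanguage(language).setRegion(region).build();
  }

  public static void main(String[] args) {
    Locale locale = of("en", "US");
    System.out.println(locale.toLanguageTag()); // prints "en-US"
  }
}
```
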

Re: [PR] Use value-based LRU cache in NodeHash [lucene]

2023-11-04 Thread via GitHub
mikemccand merged PR #12738: URL: https://github.com/apache/lucene/pull/12738

Re: [I] FSTCompiler's NodeHash should fully duplicate `byte[]` slices from the growing FST [lucene]

2023-11-04 Thread via GitHub
mikemccand closed issue #12714: FSTCompiler's NodeHash should fully duplicate `byte[]` slices from the growing FST URL: https://github.com/apache/lucene/issues/12714

Re: [PR] Use value-based LRU cache in NodeHash [lucene]

2023-11-04 Thread via GitHub
mikemccand commented on PR #12738: URL: https://github.com/apache/lucene/pull/12738#issuecomment-1793476420 I merged to main, thank you @dungba88 for the fast iterations! I could barely keep up just reviewing :) After all this FST dust settles let's remember to add your CHANGES.txt e

Re: [PR] Specialize arc store for continuous label in FST [lucene]

2023-11-04 Thread via GitHub
mikemccand commented on PR #12748: URL: https://github.com/apache/lucene/pull/12748#issuecomment-1793477236 Hello @easyice, I'm sorry but I just merged #12738 which caused conflicts here ... could you please rebase and resolve conflicts? I think this change is ready except for that. Thank

Re: [PR] Specialize arc store for continuous label in FST [lucene]

2023-11-04 Thread via GitHub
mikemccand commented on code in PR #12748: URL: https://github.com/apache/lucene/pull/12748#discussion_r1382419312 ## lucene/core/src/java/org/apache/lucene/util/fst/FST.java: ## @@ -96,6 +96,13 @@ public enum INPUT_TYPE { */ static final byte ARCS_FOR_DIRECT_ADDRESSING =

Re: [PR] Streamline FST constructors and make it fully read-only [lucene]

2023-11-04 Thread via GitHub
mikemccand commented on PR #12758: URL: https://github.com/apache/lucene/pull/12758#issuecomment-1793483247 > Note: We also might want to remove the constructor with FSTStore completely, and users need to call `init()` themselves? +1

Re: [PR] Streamline FST constructors and make it fully read-only [lucene]

2023-11-04 Thread via GitHub
mikemccand commented on code in PR #12758: URL: https://github.com/apache/lucene/pull/12758#discussion_r1382420527 ## lucene/core/src/java/org/apache/lucene/util/fst/FST.java: ## @@ -411,17 +401,42 @@ public FST(DataInput metaIn, DataInput in, Outputs outputs) throws IOExceptio

Re: [PR] LUCENE-10195: Improve Gradle build speed [lucene]

2023-11-04 Thread via GitHub
mikemccand commented on PR #414: URL: https://github.com/apache/lucene/pull/414#issuecomment-1793484726 > @mikemccand - If you are interested, [ge.apache.org](https://ge.apache.org/scans?search.rootProjectNames=lucene-root&search.timeZoneId=America%2FChicago) is available to the Lucene proj

Re: [PR] LUCENE-10195: Improve Gradle build speed [lucene]

2023-11-04 Thread via GitHub
mikemccand commented on PR #414: URL: https://github.com/apache/lucene/pull/414#issuecomment-1793485933 > > I don't like how slow our gradle builds are, so if we can make it faster, that'd be awesome. > > Are they? What in particular is slow for you, Mike? There's tons of stuff that

Re: [PR] Make FST fully read-only and streamline FST constructor [lucene]

2023-11-04 Thread via GitHub
mikemccand commented on PR #12691: URL: https://github.com/apache/lucene/pull/12691#issuecomment-1793486552 We closed this PR in favor of #12758?

Re: [I] Should we not enlarge PagedGrowableWriter initial bitPerValue on NodeHash.rehash()? [lucene]

2023-11-04 Thread via GitHub
mikemccand commented on issue #12744: URL: https://github.com/apache/lucene/issues/12744#issuecomment-1793486887 > I think this should be enhancement instead of bug, but I can't edit it. @mikemccand can you help to change the label? Done. Annoying that GH won't let you do that, espec

Re: [I] Should we not enlarge PagedGrowableWriter initial bitPerValue on NodeHash.rehash()? [lucene]

2023-11-04 Thread via GitHub
mikemccand commented on issue #12744: URL: https://github.com/apache/lucene/issues/12744#issuecomment-1793487558 > Does that mean every values, including the ones with low-address will use the same bpv as the high-address nodes? PagedGrowableWriter already enlarge the bpv [automatically](h

Re: [PR] Speed up vectorutil float scalar methods, unroll properly, use fma where possible [lucene]

2023-11-04 Thread via GitHub
rmuir commented on PR #12737: URL: https://github.com/apache/lucene/pull/12737#issuecomment-1793488056 Here are the ARMs. I had to tweak ARM to use FMA more aggressively to fully utilize the gravitons. The problem there is just apple silicon, it is good we did not move forwards with benchma

Re: [PR] Speed up vectorutil float scalar methods, unroll properly, use fma where possible [lucene]

2023-11-04 Thread via GitHub
uschindler commented on PR #12737: URL: https://github.com/apache/lucene/pull/12737#issuecomment-1793489975 > You may not like my detector, but I think it is quite practical and prevents slow execution. The detector is funny, but it won't detect slow apple silicon if you run Linux on

Re: [PR] Random access term dictionary [lucene]

2023-11-04 Thread via GitHub
nknize commented on code in PR #12688: URL: https://github.com/apache/lucene/pull/12688#discussion_r1382429884 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/lucene90/randomaccess/bitpacking/BitPacker.java: ## Review Comment: > Since they are of the same size..

Re: [PR] LUCENE-10195: Improve Gradle build speed [lucene]

2023-11-04 Thread via GitHub
Dawid commented on PR #414: URL: https://github.com/apache/lucene/pull/414#issuecomment-1793499490 TLDR; No problemo, Mickey. You can always count on me, just like last Friday when we all get wasted and I had to Uber you home. Take care!

Re: [PR] Speed up vectorutil float scalar methods, unroll properly, use fma where possible [lucene]

2023-11-04 Thread via GitHub
rmuir commented on PR #12737: URL: https://github.com/apache/lucene/pull/12737#issuecomment-1793512513 > It is good that we have the sysprops to enforce FMA or disable it, overriding default detection if needed. So on apple chips with Linux you can disable it. 👻 exactly. we can't det

Re: [PR] Fix a bug in ShapeTestUtil [lucene]

2023-11-04 Thread via GitHub
stefanvodita commented on PR #12287: URL: https://github.com/apache/lucene/pull/12287#issuecomment-1793516615 The test could call the modified methods with a random `box` and assert that all vertices of the given polygon are different. We can reuse `hasIdenticalVertices` from #12757.
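
The assertion proposed above boils down to a duplicate check over the polygon's vertices. The sketch below is a hypothetical stand-in, not the `hasIdenticalVertices` helper from #12757, whose signature and behavior are not shown in this thread. It assumes vertices arrive as parallel `float` arrays and that the caller has already dropped the closing vertex of a closed ring.

```java
import java.util.HashSet;
import java.util.Set;

public final class VertexCheck {

  /** Returns true if any two vertices share exactly the same (x, y) float coordinates. */
  public static boolean hasIdenticalVertices(float[] xs, float[] ys) {
    if (xs.length != ys.length) {
      throw new IllegalArgumentException("coordinate arrays differ in length");
    }
    Set<Long> seen = new HashSet<>();
    for (int i = 0; i < xs.length; i++) {
      // Pack both coordinates into one long key; note that 0.0f and -0.0f count as different.
      long key = ((long) Float.floatToIntBits(xs[i]) << 32)
          | (Float.floatToIntBits(ys[i]) & 0xFFFFFFFFL);
      if (!seen.add(key)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    float[] xs = {0f, 1f, 1f};
    float[] ys = {0f, 0f, 1f};
    System.out.println(hasIdenticalVertices(xs, ys)); // prints "false"
  }
}
```

A test in the spirit of the comment would then assert that this returns `false` for every generated polygon.
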

Re: [PR] Speed up vectorutil float scalar methods, unroll properly, use fma where possible [lucene]

2023-11-04 Thread via GitHub
rmuir commented on PR #12737: URL: https://github.com/apache/lucene/pull/12737#issuecomment-1793549914 for transparency, this was my testing procedure. I did lots of other things such as poking around and running experiments too, but for the basics of "running benchmark across different ins

Re: [PR] Speed up vectorutil float scalar methods, unroll properly, use fma where possible [lucene]

2023-11-04 Thread via GitHub
rmuir commented on PR #12737: URL: https://github.com/apache/lucene/pull/12737#issuecomment-1793551757 and yeah, the `avx-turbo` is measuring double precision when it "benches" FMA and we do float precision, i know. but its code already written and a nice non-java way to get the wanted info

Re: [PR] LUCENE-10195: Improve Gradle build speed [lucene]

2023-11-04 Thread via GitHub
dweiss commented on PR #414: URL: https://github.com/apache/lucene/pull/414#issuecomment-1793560062 This is the problem with github at mentions, @mikemccand - whoever this is that had to drive you home, it wasn't me...

Re: [PR] LUCENE-10195: Improve Gradle build speed [lucene]

2023-11-04 Thread via GitHub
dweiss commented on PR #414: URL: https://github.com/apache/lucene/pull/414#issuecomment-1793561529 Anyway - it's going to be difficult to saturate your CPU with tests alone, especially on a beefy machine. We limit the number of forked test JVMs - this you could tweak - but you'll soon hit

Re: [PR] LUCENE-10195: Improve Gradle build speed [lucene]

2023-11-04 Thread via GitHub
Dawid commented on PR #414: URL: https://github.com/apache/lucene/pull/414#issuecomment-1793567156 > This is the problem with github at mentions, @mikemccand - whoever this is that had to drive you home, it wasn't me... Dawid, please don't treat it as problem, but as a miracle/opportu

Re: [PR] Streamline FST constructors and make it fully read-only [lucene]

2023-11-04 Thread via GitHub
dungba88 commented on code in PR #12758: URL: https://github.com/apache/lucene/pull/12758#discussion_r1382472928 ## lucene/core/src/java/org/apache/lucene/util/fst/FST.java: ## @@ -1132,4 +1137,28 @@ public abstract static class BytesReader extends DataInput { /** Returns

Re: [PR] Speed up vectorutil float scalar methods, unroll properly, use fma where possible [lucene]

2023-11-04 Thread via GitHub
uschindler closed pull request #12737: Speed up vectorutil float scalar methods, unroll properly, use fma where possible URL: https://github.com/apache/lucene/pull/12737

Re: [PR] Speed up vectorutil float scalar methods, unroll properly, use fma where possible [lucene]

2023-11-04 Thread via GitHub
uschindler commented on PR #12737: URL: https://github.com/apache/lucene/pull/12737#issuecomment-1793568091 Sorry, pressed wrong button. Reopened.

Re: [PR] Make FST fully read-only and streamline FST constructor [lucene]

2023-11-04 Thread via GitHub
dungba88 commented on PR #12691: URL: https://github.com/apache/lucene/pull/12691#issuecomment-1793570275 Yeah, this PR was originally opened for another purpose: consolidate the FSTStore and BytesStore, and that was already done.

Re: [PR] Streamline FST constructors and make it fully read-only [lucene]

2023-11-04 Thread via GitHub
dungba88 commented on code in PR #12758: URL: https://github.com/apache/lucene/pull/12758#discussion_r1382474566 ## lucene/core/src/java/org/apache/lucene/util/fst/FSTCompiler.java: ## @@ -828,6 +829,26 @@ public void add(IntsRef input, T output) throws IOException { lastI

Re: [PR] Speed up vectorutil float scalar methods, unroll properly, use fma where possible [lucene]

2023-11-04 Thread via GitHub
rmuir merged PR #12737: URL: https://github.com/apache/lucene/pull/12737

Re: [PR] Use value-based LRU cache in NodeHash [lucene]

2023-11-04 Thread via GitHub
dungba88 commented on PR #12738: URL: https://github.com/apache/lucene/pull/12738#issuecomment-1793580112 Thank you @mikemccand ! Agree we should have a single changes entry summarizing all different PR

Re: [PR] Specialize arc store for continuous label in FST [lucene]

2023-11-04 Thread via GitHub
easyice commented on code in PR #12748: URL: https://github.com/apache/lucene/pull/12748#discussion_r1382516071 ## lucene/core/src/java/org/apache/lucene/util/fst/FST.java: ## @@ -96,6 +96,13 @@ public enum INPUT_TYPE { */ static final byte ARCS_FOR_DIRECT_ADDRESSING = 1

[PR] Hide the internal data structure of HeapPointWriter [lucene]

2023-11-05 Thread via GitHub
iverase opened a new pull request, #12762: URL: https://github.com/apache/lucene/pull/12762 HeapPointWriter uses a byte array to hold points on heap. This array is accessed directly from the BKD radix selector to sort points in place. In addition the HeapPointReader uses the array to read the p

Re: [PR] Speed up vectorutil float scalar methods, unroll properly, use fma where possible [lucene]

2023-11-05 Thread via GitHub
uschindler commented on PR #12737: URL: https://github.com/apache/lucene/pull/12737#issuecomment-1793681356 Thanks for the hard benchmarking work! 🍻

Re: [PR] Refactor access to VM options and move some VM options to oal.util.Constants [lucene]

2023-11-05 Thread via GitHub
uschindler commented on PR #12754: URL: https://github.com/apache/lucene/pull/12754#issuecomment-1793691267 I fixed a regression with OpenJ9 in https://github.com/apache/lucene/commit/5358b7251ec264f78a8b3c5250f5082b4756f6ca.

[I] Reproducible failure in TestIndexWriter.testHasUncommittedChanges [lucene]

2023-11-05 Thread via GitHub
easyice opened a new issue, #12763: URL: https://github.com/apache/lucene/issues/12763 ### Description The failure looks related to PR: https://github.com/apache/lucene/pull/12549 ``` > java.lang.AssertionError > at __randomizedtesting.SeedInfo.seed([35A9341

Re: [PR] Specialize arc store for continuous label in FST [lucene]

2023-11-05 Thread via GitHub
easyice commented on code in PR #12748: URL: https://github.com/apache/lucene/pull/12748#discussion_r1382560887 ## lucene/core/src/java/org/apache/lucene/util/fst/FST.java: ## @@ -96,6 +96,13 @@ public enum INPUT_TYPE { */ static final byte ARCS_FOR_DIRECT_ADDRESSING = 1

Re: [I] Use max BPV encoding in postings if doc buffer size less than ForUtil.BLOCK_SIZE [lucene]

2023-11-05 Thread via GitHub
easyice commented on issue #12717: URL: https://github.com/apache/lucene/issues/12717#issuecomment-1793763101 I reproduced this using low cardinality fields, for instance we let the posting size be 100, write 10 million docs then force merge to single segment, use `TermInSetQuery` with 512 t

[PR] Detect J9 and disable vectorization completely (it is not supported there) [lucene]

2023-11-05 Thread via GitHub
rmuir opened a new pull request, #12764: URL: https://github.com/apache/lucene/pull/12764 J9 VM doesn't seem to actually implement the vector api, so it falls back to hundreds-of-times-slower-pure-java impl. Currently: ``` VectorUtilBenchmark.floatCosineScalar 1024 thrpt 15

Re: [PR] LUCENE-10195: Improve Gradle build speed [lucene]

2023-11-05 Thread via GitHub
dweiss commented on PR #414: URL: https://github.com/apache/lucene/pull/414#issuecomment-1793803122 > we are both named Dawid, we are both from Poznan, we both bike, we are both in our 40-ties. You are located at Bóżnicza Street (or at least your wife is), What are the odds, eh? Fai

Re: [PR] Detect J9 and disable vectorization completely (it is not supported there) [lucene]

2023-11-05 Thread via GitHub
uschindler commented on PR #12764: URL: https://github.com/apache/lucene/pull/12764#issuecomment-1793804533 I have another PR already developed: #12765 It does not use sysprops and detects hotspot by the algorithms we already use. Should I close this one?

Re: [PR] TestIndexWriterOnVMError.testUnknownError times out (fixes potential IW deadlock on tragic exceptions). [lucene]

2023-11-05 Thread via GitHub
dweiss commented on code in PR #12751: URL: https://github.com/apache/lucene/pull/12751#discussion_r1382620003 ## lucene/core/src/java/org/apache/lucene/index/IndexWriter.java: ## @@ -2560,10 +2560,15 @@ private void rollbackInternalNoCommit() throws IOException {

Re: [PR] Limit vectorization API to Hotspot VMs (and rename some constants and fix Javadocs) [lucene]

2023-11-05 Thread via GitHub
uschindler commented on code in PR #12765: URL: https://github.com/apache/lucene/pull/12765#discussion_r1382621462 ## lucene/core/src/java/org/apache/lucene/internal/vectorization/VectorizationProvider.java: ## @@ -111,6 +111,12 @@ static VectorizationProvider lookup(boolean tes

Re: [PR] Detect J9 and disable vectorization completely (it is not supported there) [lucene]

2023-11-05 Thread via GitHub
rmuir closed pull request #12764: Detect J9 and disable vectorization completely (it is not supported there) URL: https://github.com/apache/lucene/pull/12764

Re: [PR] Detect J9 and disable vectorization completely (it is not supported there) [lucene]

2023-11-05 Thread via GitHub
rmuir commented on PR #12764: URL: https://github.com/apache/lucene/pull/12764#issuecomment-1793808077 I like Uwe's PR better here, closing this one

Re: [PR] Limit vectorization API to Hotspot VMs (and rename some constants and fix Javadocs) [lucene]

2023-11-05 Thread via GitHub
uschindler merged PR #12765: URL: https://github.com/apache/lucene/pull/12765

Re: [I] Reproducible failure in TestIndexWriter.testHasUncommittedChanges [lucene]

2023-11-05 Thread via GitHub
dweiss commented on issue #12763: URL: https://github.com/apache/lucene/issues/12763#issuecomment-1793810448 Here's another seed where it's failing (from a PR): ``` gradlew :lucene:core:test --tests "org.apache.lucene.index.TestIndexWriter.testHasUncommittedChanges" -Ptests.jvms=2 -Pt

[PR] disable vectors (and don't warn to add incubator module) for jvmci/graal [lucene]

2023-11-05 Thread via GitHub
rmuir opened a new pull request, #12766: URL: https://github.com/apache/lucene/pull/12766 Another performance trap. I see use of this stuff a lot in the wild, lots of users/apps doing native image stuff, but we don't want to use vector api here, we should definitely not be encouraging the u

Re: [PR] disable vectors (and don't warn to add incubator module) for jvmci/graal [lucene]

2023-11-05 Thread via GitHub
uschindler commented on code in PR #12766: URL: https://github.com/apache/lucene/pull/12766#discussion_r1382646072 ## lucene/core/src/java/org/apache/lucene/util/Constants.java: ## @@ -66,6 +66,10 @@ private Constants() {} // can't construct /** True iff the Java VM is based

Re: [PR] disable vectors (and don't warn to add incubator module) for jvmci/graal [lucene]

2023-11-05 Thread via GitHub
rmuir commented on PR #12766: URL: https://github.com/apache/lucene/pull/12766#issuecomment-1793846009 I don't want to detect every possible option that could slow thing down, instead ordinary configurations. Look at all the stuff being "sold" on graalvm.org: lower resource usage, faster st

Re: [PR] disable vectors (and don't warn to add incubator module) for jvmci/graal [lucene]

2023-11-05 Thread via GitHub
uschindler commented on code in PR #12766: URL: https://github.com/apache/lucene/pull/12766#discussion_r1382647154 ## lucene/core/src/java/org/apache/lucene/util/Constants.java: ## @@ -66,6 +66,10 @@ private Constants() {} // can't construct /** True iff the Java VM is based

Re: [PR] Remove usage of deprecated java.util.Locale constructor [lucene]

2023-11-05 Thread via GitHub
rmuir commented on PR #12761: URL: https://github.com/apache/lucene/pull/12761#issuecomment-1793871445 > Looks ok to me. > > Did you check all possible benchmark config/ALG files (not all of them are tested) that the locales in them are correctly for usage as language tag? ```

Re: [PR] Skip docs with Docvalues in NumericLeafComparator [lucene]

2023-11-05 Thread via GitHub
LuXugang commented on PR #12405: URL: https://github.com/apache/lucene/pull/12405#issuecomment-1793958700 Sure thing @jpountz , I would work on this in the next few days.

Re: [PR] speedup arm int functions? [lucene]

2023-11-05 Thread via GitHub
rmuir commented on PR #12743: URL: https://github.com/apache/lucene/pull/12743#issuecomment-1794112273 When looking at `SDOT` to do this, i was able to accomplish it with another vector API, just as basis for comparison: https://godbolt.org/z/9cv5WaGaT Even if you lower `-marc

[PR] Enable executing using NFA in RegexpQuery [lucene]

2023-11-05 Thread via GitHub
zhaih opened a new pull request, #12767: URL: https://github.com/apache/lucene/pull/12767 ### Description As title, added a new flag in `RegexpQuery`'s ctor, not sure whether there's a better way, maybe using a static method instead? Like `RegexpQuery.createWithDFA`/`RegexpQu

Re: [PR] Use growNoCopy for SortingStoredFieldsConsumer#NO_COMPRESSION [lucene]

2023-11-05 Thread via GitHub
gf2121 merged PR #12733: URL: https://github.com/apache/lucene/pull/12733

Re: [PR] Remove usage of deprecated java.util.Locale constructor [lucene]

2023-11-06 Thread via GitHub
ChrisHegarty merged PR #12761: URL: https://github.com/apache/lucene/pull/12761

Re: [PR] Clean up ByteBlockPool [lucene]

2023-11-06 Thread via GitHub
stefanvodita commented on PR #12506: URL: https://github.com/apache/lucene/pull/12506#issuecomment-1794405194 @mikemccand @iverase - what do you think, is this PR ready?

[PR] Add TaxonomyReader#getBulkOrdinals method (#12180) [lucene]

2023-11-06 Thread via GitHub
epotyom opened a new pull request, #12769: URL: https://github.com/apache/lucene/pull/12769 Add TaxonomyReader#getBulkOrdinals method (#12180) This is the first step for #12180 , next step will be to implement `Facets#getSpecificValues` (bulk) that calls `getBulkOrdinals`, will do it

Re: [PR] Remove patching for doc blocks. [lucene]

2023-11-06 Thread via GitHub
mikemccand commented on code in PR #12741: URL: https://github.com/apache/lucene/pull/12741#discussion_r1383233272 ## lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99PostingsFormat.java: ## @@ -0,0 +1,518 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] Enable executing using NFA in RegexpQuery [lucene]

2023-11-06 Thread via GitHub
rmuir commented on PR #12767: URL: https://github.com/apache/lucene/pull/12767#issuecomment-1794689085 There should be tests exercising this new boolean option. How do we know it gives correct results to do this?

Re: [PR] Add TaxonomyReader#getBulkOrdinals method (#12180) [lucene]

2023-11-06 Thread via GitHub
mikemccand commented on code in PR #12769: URL: https://github.com/apache/lucene/pull/12769#discussion_r1383241368 ## lucene/facet/src/java/org/apache/lucene/facet/taxonomy/directory/DirectoryTaxonomyReader.java: ## @@ -312,6 +316,111 @@ public int getOrdinal(FacetLabel cp) thro

Re: [PR] LUCENE-10195: Improve Gradle build speed [lucene]

2023-11-06 Thread via GitHub
mikemccand commented on PR #414: URL: https://github.com/apache/lucene/pull/414#issuecomment-1794726597 > That'a crazy Mike what you did here. ;) LOL!! This is too crazy. I admit to being a little too confident in GitHub's autosuggest when I `@` someone. I'm not sure why on t

[PR] Only explore neighbor-of-neighbor if similarity is better [lucene]

2023-11-06 Thread via GitHub
benwtrent opened a new pull request, #12770: URL: https://github.com/apache/lucene/pull/12770 I noticed while testing lower dimensionality and quantization, we would explore the HNSW graph way too much. I was stuck figuring out why until I noticed the searcher checks for distance equality (

Re: [PR] Enable executing using NFA in RegexpQuery [lucene]

2023-11-06 Thread via GitHub
rmuir commented on PR #12767: URL: https://github.com/apache/lucene/pull/12767#issuecomment-1794787497 I was thinking just to exercise it in TestRegexpRandom2 or similar test. Maybe add TestRegexpRandom3.java? Test is conceptually very simple but powerful. It has a lot of lines of cod

Re: [PR] Only explore neighbor-of-neighbor if similarity is better [lucene]

2023-11-06 Thread via GitHub
benwtrent commented on code in PR #12770: URL: https://github.com/apache/lucene/pull/12770#discussion_r1383301063 ## lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphSearcher.java: ## @@ -174,8 +174,7 @@ private int[] findBestEntryPoint(RandomVectorScorer scorer, HnswG

Re: [PR] speedup arm int functions? [lucene]

2023-11-06 Thread via GitHub
rmuir commented on PR #12743: URL: https://github.com/apache/lucene/pull/12743#issuecomment-1794797146 we should test more. I can test on more machines later too. Maybe good motivation to get full automation going for doing that.

Re: [PR] Clean up ByteBlockPool [lucene]

2023-11-06 Thread via GitHub
mikemccand merged PR #12506: URL: https://github.com/apache/lucene/pull/12506

Re: [I] ByteBlockPool's documentation is completely useless [LUCENE-5613] [lucene]

2023-11-06 Thread via GitHub
mikemccand closed issue #6675: ByteBlockPool's documentation is completely useless [LUCENE-5613] URL: https://github.com/apache/lucene/issues/6675

[I] Should we ban Random#nextInt(int, int)? [lucene]

2023-11-06 Thread via GitHub
mikemccand opened a new issue, #12771: URL: https://github.com/apache/lucene/issues/12771 ### Description Spinoff from backporting #12506 which was using the [Random#nextInt(int, int) method](https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/util/random/RandomGenerato

Re: [PR] move CSVUtil to common from analyzer nori and kuromoji [lucene]

2023-11-06 Thread via GitHub
twosom commented on PR #12390: URL: https://github.com/apache/lucene/pull/12390#issuecomment-1794844758 > I merged to main but there are quite a few conflicts on backport to 9.x -- any chance you could open a backport PR @twosom? Thanks! OK! I'll open backport PR ASAP! Thanks!

Re: [I] Should we ban Random#nextInt(int, int)? [lucene]

2023-11-06 Thread via GitHub
mikemccand commented on issue #12771: URL: https://github.com/apache/lucene/issues/12771#issuecomment-1794849299 @rmuir pointed out `nextLong(long, long)` has the same warning. I'll open a PR to ban both.

[PR] #12271: ban possibly slow Random methods [lucene]

2023-11-06 Thread via GitHub
mikemccand opened a new pull request, #12772: URL: https://github.com/apache/lucene/pull/12772 Just ban these possibly slow Random APIs.

Re: [PR] #12271: ban possibly slow Random methods [lucene]

2023-11-06 Thread via GitHub
rmuir commented on code in PR #12772: URL: https://github.com/apache/lucene/pull/12772#discussion_r1383365289 ## gradle/validation/forbidden-apis/defaults.tests.txt: ## @@ -21,3 +21,7 @@ java.lang.System#currentTimeMillis() @ Don't depend on wall clock times #java.lang.System#

[I] Resolve Duplicate CSVUtil Classes in Nori and Kuromoji Analyzers (Backport) [lucene]

2023-11-06 Thread via GitHub
twosom opened a new issue, #12773: URL: https://github.com/apache/lucene/issues/12773 ### Description As seen in the previous issue https://github.com/apache/lucene/issues/12389 and https://github.com/apache/lucene/issues/12389, I have worked on moving the `CSV

Re: [PR] #12271: ban possibly slow Random methods [lucene]

2023-11-06 Thread via GitHub
rmuir commented on code in PR #12772: URL: https://github.com/apache/lucene/pull/12772#discussion_r1383388421 ## gradle/validation/forbidden-apis/defaults.tests.txt: ## @@ -21,3 +21,7 @@ java.lang.System#currentTimeMillis() @ Don't depend on wall clock times #java.lang.System#

Re: [PR] #12271: ban possibly slow Random methods [lucene]

2023-11-06 Thread via GitHub
mikemccand commented on PR #12772: URL: https://github.com/apache/lucene/pull/12772#issuecomment-1794914865 Woops -- I messed up the actual forbidden APIs -- working on correct fix (we do use this API in many places!).

[PR] [Backport] Refactor CSVUtil to common analysis package [lucene]

2023-11-06 Thread via GitHub
twosom opened a new pull request, #12774: URL: https://github.com/apache/lucene/pull/12774 ### Description This PR moves the CSVUtil class that existed separately in the Analyzer Nori and Kuromoji to the module :analysis:common. By consolidating the CSVUtil class, we can prevent dupl
