Re: [PR] Rewrite JavaScriptCompiler to use modern JVM features (Java 17) [lucene]

2023-12-04 Thread via GitHub
rjernst commented on PR #12873: URL: https://github.com/apache/lucene/pull/12873#issuecomment-1840023723 Thanks for the ping @uschindler , great idea! I opened https://github.com/elastic/elasticsearch/issues/102955 to track the same work in Elasticsearch. -- This is an automated message

Re: [PR] Allow FST builder to use different writer (#12543) [lucene]

2023-12-04 Thread via GitHub
dungba88 commented on PR #12624: URL: https://github.com/apache/lucene/pull/12624#issuecomment-1839997544 > I tested just how much slower the ByteBuffer based store is than the FST's BytesStore: I assume this is before the last iteration that does the freeze, is that right? What do

Re: [I] Jvm Crashes occassionaly with Lucene 8.10.0, JDK 11.0.15+10 [lucene]

2023-12-04 Thread via GitHub
sosohu commented on issue #12863: URL: https://github.com/apache/lucene/issues/12863#issuecomment-1839890552 Thank you very much @rmuir, @uschindler. Yes, there are many threads running at the same time and we have a managed reference count to close the reader, maybe there are some issues

Re: [PR] Rewrite JavaScriptCompiler to use modern JVM features (Java 17) [lucene]

2023-12-04 Thread via GitHub
rmuir commented on PR #12873: URL: https://github.com/apache/lucene/pull/12873#issuecomment-1839841171 some of the permissions involved are probably only safe to remove in 10.x due to #12038 -- This is an automated message from the Apache Git Service. To respond to the message, please lo

Re: [PR] Rewrite JavaScriptCompiler to use modern JVM features (Java 17) [lucene]

2023-12-04 Thread via GitHub
uschindler commented on code in PR #12873: URL: https://github.com/apache/lucene/pull/12873#discussion_r1414718350 ## lucene/expressions/src/java/org/apache/lucene/expressions/js/JavascriptCompiler.java: ## @@ -201,48 +226,57 @@ private static void unusedTestCompile() throws IO

Re: [PR] Rewrite JavaScriptCompiler to use modern JVM features (Java 17) [lucene]

2023-12-04 Thread via GitHub
uschindler commented on code in PR #12873: URL: https://github.com/apache/lucene/pull/12873#discussion_r1414718350 ## lucene/expressions/src/java/org/apache/lucene/expressions/js/JavascriptCompiler.java: ## @@ -201,48 +226,57 @@ private static void unusedTestCompile() throws IO

Re: [PR] Rewrite JavaScriptCompiler to use modern JVM features (Java 17) [lucene]

2023-12-04 Thread via GitHub
uschindler commented on code in PR #12873: URL: https://github.com/apache/lucene/pull/12873#discussion_r1414718350 ## lucene/expressions/src/java/org/apache/lucene/expressions/js/JavascriptCompiler.java: ## @@ -201,48 +226,57 @@ private static void unusedTestCompile() throws IO

Re: [PR] Rewrite JavaScriptCompiler to use modern JVM features (Java 17) [lucene]

2023-12-04 Thread via GitHub
uschindler commented on PR #12873: URL: https://github.com/apache/lucene/pull/12873#issuecomment-1839826543 > I like the change! > > Can be a followup ticket, but i think we should remove `ClassLoader` usage from `TestCustomFunctions.java`. It doesn't need to use classloader to test

Re: [PR] Rewrite JavaScriptCompiler to use modern JVM features (Java 17) [lucene]

2023-12-04 Thread via GitHub
rmuir commented on code in PR #12873: URL: https://github.com/apache/lucene/pull/12873#discussion_r1414716973 ## lucene/expressions/src/java/org/apache/lucene/expressions/js/JavascriptCompiler.java: ## @@ -201,48 +226,57 @@ private static void unusedTestCompile() throws IOExcep

Re: [PR] Introduce growInRange to reduce array overallocation [lucene]

2023-12-04 Thread via GitHub
stefanvodita commented on PR #12844: URL: https://github.com/apache/lucene/pull/12844#issuecomment-1839689672 Unless anyone has other ideas, I’ll revert the changes to `NeighborArray` and only keep `growInRange` for `DirectoryTaxonomyReader`. Separately, I will go through other uses of the

Re: [PR] Try using Murmurhash 3 for bloom filters [lucene]

2023-12-04 Thread via GitHub
shubhamvishu commented on code in PR #12868: URL: https://github.com/apache/lucene/pull/12868#discussion_r1414510733 ## lucene/codecs/src/java/org/apache/lucene/codecs/bloom/FuzzySet.java: ## @@ -150,9 +150,10 @@ private FuzzySet(FixedBitSet filter, int bloomSize, int hashCount

Re: [PR] Try using Murmurhash 3 for bloom filters [lucene]

2023-12-04 Thread via GitHub
shubhamvishu commented on code in PR #12868: URL: https://github.com/apache/lucene/pull/12868#discussion_r1414510733 ## lucene/codecs/src/java/org/apache/lucene/codecs/bloom/FuzzySet.java: ## @@ -150,9 +150,10 @@ private FuzzySet(FixedBitSet filter, int bloomSize, int hashCount

Re: [PR] Try using Murmurhash 3 for bloom filters [lucene]

2023-12-04 Thread via GitHub
jpountz commented on code in PR #12868: URL: https://github.com/apache/lucene/pull/12868#discussion_r1414499427 ## lucene/codecs/src/java/org/apache/lucene/codecs/bloom/FuzzySet.java: ## @@ -150,9 +150,10 @@ private FuzzySet(FixedBitSet filter, int bloomSize, int hashCount) {

Re: [PR] Try using Murmurhash 3 for bloom filters [lucene]

2023-12-04 Thread via GitHub
jpountz commented on PR #12868: URL: https://github.com/apache/lucene/pull/12868#issuecomment-1839471249 > Do you mean to change StringHelper class to add support for 128 bit hash because currently it creates 32-bit hash with Murmur 3? Sorry, I had overlooked that StringHelper only us

Re: [PR] Rewrite JavaScriptCompiler to use modern JVM features (Java 17) [lucene]

2023-12-04 Thread via GitHub
uschindler commented on PR #12873: URL: https://github.com/apache/lucene/pull/12873#issuecomment-1839428422 I added documentation and migration guide. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

Re: [PR] LUCENE-10002: Deprecate IndexSearch#search(Query, Collector) in favor of IndexSearcher#search(Query, CollectorManager) - TopFieldCollectorManager & TopScoreDocCollectorManager [lucene]

2023-12-04 Thread via GitHub
zacharymorn commented on PR #240: URL: https://github.com/apache/lucene/pull/240#issuecomment-1839403374 > I think we would do the removal in 10, as that's the type of change that we would make for a major release. Or can decide to delay the removal if need be, but delaying the deprecation

Re: [PR] Rewrite JavaScriptCompiler to use modern JVM features (Java 17) [lucene]

2023-12-04 Thread via GitHub
dweiss commented on PR #12873: URL: https://github.com/apache/lucene/pull/12873#issuecomment-1839399108 I glanced through the code and it looks good to me. This said, it's really low level and I'm not super intimate with method handles so don't trust my +1. :) -- This is an automated me

Re: [PR] Try using Murmurhash 3 for bloom filters [lucene]

2023-12-04 Thread via GitHub
shubhamvishu commented on code in PR #12868: URL: https://github.com/apache/lucene/pull/12868#discussion_r1414130332 ## lucene/codecs/src/java/org/apache/lucene/codecs/bloom/FuzzySet.java: ## @@ -150,9 +150,10 @@ private FuzzySet(FixedBitSet filter, int bloomSize, int hashCount

Re: [PR] Try using Murmurhash 3 for bloom filters [lucene]

2023-12-04 Thread via GitHub
shubhamvishu commented on code in PR #12868: URL: https://github.com/apache/lucene/pull/12868#discussion_r1414130332 ## lucene/codecs/src/java/org/apache/lucene/codecs/bloom/FuzzySet.java: ## @@ -150,9 +150,10 @@ private FuzzySet(FixedBitSet filter, int bloomSize, int hashCount

Re: [PR] Try using Murmurhash 3 for bloom filters [lucene]

2023-12-04 Thread via GitHub
shubhamvishu commented on code in PR #12868: URL: https://github.com/apache/lucene/pull/12868#discussion_r1414130332 ## lucene/codecs/src/java/org/apache/lucene/codecs/bloom/FuzzySet.java: ## @@ -150,9 +150,10 @@ private FuzzySet(FixedBitSet filter, int bloomSize, int hashCount

Re: [PR] Try using Murmurhash 3 for bloom filters [lucene]

2023-12-04 Thread via GitHub
shubhamvishu commented on code in PR #12868: URL: https://github.com/apache/lucene/pull/12868#discussion_r1414130332 ## lucene/codecs/src/java/org/apache/lucene/codecs/bloom/FuzzySet.java: ## @@ -150,9 +150,10 @@ private FuzzySet(FixedBitSet filter, int bloomSize, int hashCount

Re: [PR] Try using Murmurhash 3 for bloom filters [lucene]

2023-12-04 Thread via GitHub
shubhamvishu commented on code in PR #12868: URL: https://github.com/apache/lucene/pull/12868#discussion_r1414130332 ## lucene/codecs/src/java/org/apache/lucene/codecs/bloom/FuzzySet.java: ## @@ -150,9 +150,10 @@ private FuzzySet(FixedBitSet filter, int bloomSize, int hashCount

Re: [PR] Add static function in TaskExecutor to retrieve the results for a collection of Future [lucene]

2023-12-04 Thread via GitHub
shubhamvishu commented on PR #12798: URL: https://github.com/apache/lucene/pull/12798#issuecomment-1839348896 Makes sense to me! I'll mark this as close then. Thanks @javanna ! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub a

Re: [PR] Add static function in TaskExecutor to retrieve the results for a collection of Future [lucene]

2023-12-04 Thread via GitHub
shubhamvishu closed pull request #12798: Add static function in TaskExecutor to retrieve the results for a collection of Future URL: https://github.com/apache/lucene/pull/12798 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

Re: [PR] Add static function in TaskExecutor to retrieve the results for a collection of Future [lucene]

2023-12-04 Thread via GitHub
javanna commented on PR #12798: URL: https://github.com/apache/lucene/pull/12798#issuecomment-1839288615 I don't see a lot of value in sharing this code to be honest, as there are three places that simply call future.get and catch execution exception and unwrap the actual exception. That se

Re: [PR] LUCENE-10002: Deprecate IndexSearch#search(Query, Collector) in favor of IndexSearcher#search(Query, CollectorManager) - TopFieldCollectorManager & TopScoreDocCollectorManager [lucene]

2023-12-04 Thread via GitHub
javanna commented on PR #240: URL: https://github.com/apache/lucene/pull/240#issuecomment-1839284708 I think we would do the removal in 10, as that's the type of change that we would make for a major release. Or can decide to delay the removal if need be, but delaying the deprecation kind o

Re: [PR] Rewrite JavaScriptCompiler to use modern JVM features (Java 17) [lucene]

2023-12-04 Thread via GitHub
uschindler commented on PR #12873: URL: https://github.com/apache/lucene/pull/12873#issuecomment-1839275419 While writing the benchmark I figured out that the Expression base class has some problems with a missing `IOException` on the `evaluate(DoubleValues[])` method. As bytecode knows no

Re: [PR] Rewrite JavaScriptCompiler to use modern JVM features (Java 17) [lucene]

2023-12-04 Thread via GitHub
uschindler commented on PR #12873: URL: https://github.com/apache/lucene/pull/12873#issuecomment-1839265253 Benchmark was added and classloader code removed. I will add a helper function to support some compatibility with legacy code still using `java.lang.reflect.Method`, but after t

Re: [PR] Rewrite JavaScriptCompiler to use modern JVM features (Java 17) [lucene]

2023-12-04 Thread via GitHub
uschindler commented on PR #12873: URL: https://github.com/apache/lucene/pull/12873#issuecomment-1839222928 I will also post my benchmark, as it allows to test the performance of expressions. -- This is an automated message from the Apache Git Service. To respond to the message, please lo

Re: [PR] Rewrite JavaScriptCompiler to use modern JVM features (Java 17) [lucene]

2023-12-04 Thread via GitHub
uschindler commented on PR #12873: URL: https://github.com/apache/lucene/pull/12873#issuecomment-1839221772 Hi Robert, yes we can remove that test. We may also remove other classloader related permissions (unless they are used at other places). -- This is an automated message from the Apa

Re: [PR] Rewrite JavaScriptCompiler to use modern JVM features (Java 17) [lucene]

2023-12-04 Thread via GitHub
rmuir commented on PR #12873: URL: https://github.com/apache/lucene/pull/12873#issuecomment-1839192204 I like the change! Can be a followup ticket, but i think we should remove `ClassLoader` usage from `TestCustomFunctions.java`. It doesn't need to use classloader to test anymore, si

Re: [PR] BaseTokenStreamTestCase.assertAnalyzesTo fails when Analyzer contains… [lucene]

2023-12-04 Thread via GitHub
lukas-vlcek closed pull request #12750: BaseTokenStreamTestCase.assertAnalyzesTo fails when Analyzer contains… URL: https://github.com/apache/lucene/pull/12750 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[PR] CheckIndex - Removal of some dead code [lucene]

2023-12-04 Thread via GitHub
slow-J opened a new pull request, #12876: URL: https://github.com/apache/lucene/pull/12876 When I was working on `CheckIndex` in PR: https://github.com/apache/lucene/pull/12797, Issue: https://github.com/apache/lucene/issues/11023, I came across some dead code that I wanted to clean up as

Re: [PR] Rewrite JavaScriptCompiler to use modern JVM features (Java 17) [lucene]

2023-12-04 Thread via GitHub
uschindler commented on PR #12873: URL: https://github.com/apache/lucene/pull/12873#issuecomment-1839092945 The latest commit changed the stack trace patching to be implemented inside the compiled expression. The original benchmarks had shown that the extra `evaluate()` method in the base c

Re: [PR] Try using Murmurhash 3 for bloom filters [lucene]

2023-12-04 Thread via GitHub
shubhamvishu commented on code in PR #12868: URL: https://github.com/apache/lucene/pull/12868#discussion_r1414130332 ## lucene/codecs/src/java/org/apache/lucene/codecs/bloom/FuzzySet.java: ## @@ -150,9 +150,10 @@ private FuzzySet(FixedBitSet filter, int bloomSize, int hashCount

Re: [PR] Try using Murmurhash 3 for bloom filters [lucene]

2023-12-04 Thread via GitHub
shubhamvishu commented on code in PR #12868: URL: https://github.com/apache/lucene/pull/12868#discussion_r1414130332 ## lucene/codecs/src/java/org/apache/lucene/codecs/bloom/FuzzySet.java: ## @@ -150,9 +150,10 @@ private FuzzySet(FixedBitSet filter, int bloomSize, int hashCount

Re: [PR] BaseTokenStreamTestCase.assertAnalyzesTo fails when Analyzer contains… [lucene]

2023-12-04 Thread via GitHub
lukas-vlcek commented on PR #12750: URL: https://github.com/apache/lucene/pull/12750#issuecomment-1838893872 I am going to close this PR. I opened a new PR that has fix for `ReversePathHierarchyTokenizers` and `PathHierarchyTokenizers`: https://github.com/apache/lucene/pull/12875 -- Th

[PR] Fix position increment in (Reverse)PathHierarchyTokenizer [lucene]

2023-12-04 Thread via GitHub
lukas-vlcek opened a new pull request, #12875: URL: https://github.com/apache/lucene/pull/12875 ### Description Incrementing position attribute for each token in both `PathHieararchyTokenizer` and `ReversePathHieararchyTokenizer`. This change makes it possible to use both tokenizer

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-04 Thread via GitHub
easyice commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1838867648 I ran the benchmark using `wikimediumall`, we can got a minor speed up on java21 with `MMapDirectory`, but no significant improvement with `NIOFSDirectory`(java17), because the `NIOFSD

Re: [PR] Rewrite JavaScriptCompiler to use modern JVM features (Java 17) [lucene]

2023-12-04 Thread via GitHub
uschindler commented on PR #12873: URL: https://github.com/apache/lucene/pull/12873#issuecomment-1838770783 I removed the `invokestatic` special case and now every function call is handled by a `MethodHandle` provided in a dynamic constant. P.S.: Java 17' LambdaBootsrap now does the s

Re: [PR] Rewrite JavaScriptCompiler to use modern JVM features (Java 17) [lucene]

2023-12-04 Thread via GitHub
uschindler commented on PR #12873: URL: https://github.com/apache/lucene/pull/12873#issuecomment-1838345039 Updated longer running benchmark, no changes: ``` Benchmark(js) (useMH) Mode Cnt Score Error Units ExpressionsBenchmark.

[PR] Remove stale BWC tests [lucene]

2023-12-04 Thread via GitHub
s1monw opened a new pull request, #12874: URL: https://github.com/apache/lucene/pull/12874 Both of these tests have been disabled for quiet a long time. While `TestManyPointsInOldIndex` looks indeed stale, `TestIndexWriterOnOldIndex` is not a more general test. -- This is an automated me

Re: [PR] Try using Murmurhash 3 for bloom filters [lucene]

2023-12-04 Thread via GitHub
shubhamvishu commented on PR #12868: URL: https://github.com/apache/lucene/pull/12868#issuecomment-1838079506 I also ran the benchmarks for **`wikibigall`**. Here also we see similarly consistent **7%** improvement for `PKLookup` with `0.000` p-value. Below are the benchmark results :

Re: [PR] Try using Murmurhash 3 for bloom filters [lucene]

2023-12-04 Thread via GitHub
shubhamvishu commented on PR #12868: URL: https://github.com/apache/lucene/pull/12868#issuecomment-1838068402 So I ran the `luceneutil` benchmarks with `-idFieldPostingsFormat BloomFilter` but it was failing as there was no delegate posting format and it wasn't able to find the right postin

Re: [PR] Try using Murmurhash 3 for bloom filters [lucene]

2023-12-04 Thread via GitHub
shubhamvishu commented on PR #12868: URL: https://github.com/apache/lucene/pull/12868#issuecomment-1838021938 > Maybe we should take advantage of this change to simplify this postings format, by no longer making the hash function configurable, removing the abstraction for hash functions, an