[GitHub] [lucene] ChrisHegarty commented on pull request #12311: Integrate the Incubating Panama Vector API

2023-05-20 Thread via GitHub
ChrisHegarty commented on PR #12311: URL: https://github.com/apache/lucene/pull/12311#issuecomment-1555874325 Just dumping an initial round of benchmark results, etc, based on what is currently in this PR. Benchmark source (derived from Robert's) ``` davekim$ cat sr

[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

2023-05-20 Thread via GitHub
uschindler commented on PR #12311: URL: https://github.com/apache/lucene/pull/12311#issuecomment-1555877992 Would also be interesting to run Mike's benchmark (the vector part). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

2023-05-20 Thread via GitHub
uschindler commented on PR #12311: URL: https://github.com/apache/lucene/pull/12311#issuecomment-1555892033 Hi, I rewrote the API stub extractor to use 2 passes. I also made the modules to be exported configurable. It now only extracts vectors for Java 20. The resulting Java 20 apijar

[GitHub] [lucene] uschindler commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

2023-05-20 Thread via GitHub
uschindler commented on code in PR #12311: URL: https://github.com/apache/lucene/pull/12311#discussion_r1199602161 ## gradle/testing/defaults-tests.gradle: ## @@ -119,10 +119,10 @@ allprojects { if (rootProject.runtimeJavaVersion < JavaVersion.VERSION_16) { jvmAr

[GitHub] [lucene] uschindler commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

2023-05-20 Thread via GitHub
uschindler commented on code in PR #12311: URL: https://github.com/apache/lucene/pull/12311#discussion_r1199602226 ## gradle/testing/defaults-tests.gradle: ## @@ -119,10 +119,10 @@ allprojects { if (rootProject.runtimeJavaVersion < JavaVersion.VERSION_16) { jvmAr

[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

2023-05-20 Thread via GitHub
uschindler commented on PR #12311: URL: https://github.com/apache/lucene/pull/12311#issuecomment-1555895724 There's something strange on my computer when running core tests with Java 20: :lucene:core:test (SUCCESS): 5730 test(s), 193 skipped The slowest tests (exceeding 500 ms) dur

[GitHub] [lucene] uschindler commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

2023-05-20 Thread via GitHub
uschindler commented on code in PR #12311: URL: https://github.com/apache/lucene/pull/12311#discussion_r1199608565 ## gradle/testing/defaults-tests.gradle: ## @@ -119,10 +119,10 @@ allprojects { if (rootProject.runtimeJavaVersion < JavaVersion.VERSION_16) { jvmAr

[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

2023-05-20 Thread via GitHub
uschindler commented on PR #12311: URL: https://github.com/apache/lucene/pull/12311#issuecomment-1555906079 > There's something strange on my computer when running core tests with Java 20: > > ``` > :lucene:core:test (SUCCESS): 5730 test(s), 193 skipped > The slowest tests (exce

[GitHub] [lucene] almogtavor opened a new issue, #12318: Async Usage of Lucene Monitor through a Reactive Programming based application

2023-05-20 Thread via GitHub
almogtavor opened a new issue, #12318: URL: https://github.com/apache/lucene/issues/12318 ### Description I'd like to use Lucene Monitor in a non-blocking IO application, and I'd like to know what is the recommended way of doing that. Currently, I match queries with the `ParallelMatc

[GitHub] [lucene] rmuir commented on pull request #12311: Integrate the Incubating Panama Vector API

2023-05-20 Thread via GitHub
rmuir commented on PR #12311: URL: https://github.com/apache/lucene/pull/12311#issuecomment-1555915665 Thanks for re-benchmarking @ChrisHegarty ! It has been a few years and an older JDK version since this stuff was developed. I had in mind to do a couple more runs eventually: * try it o

[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

2023-05-20 Thread via GitHub
uschindler commented on PR #12311: URL: https://github.com/apache/lucene/pull/12311#issuecomment-1555926627 I improved the extractor even more (it had a small bug). It now uses only one pass; it just delays writing out the class file to apijar until all visible classes were collected into a

[GitHub] [lucene] gsmiller commented on pull request #12310: #12276: rename DaciukMihovAutomatonBuilder to StringsToAutomaton

2023-05-20 Thread via GitHub
gsmiller commented on PR #12310: URL: https://github.com/apache/lucene/pull/12310#issuecomment-1555927824 @mikemccand where's the build method you're referencing? I took a pass at creating a "direct to binary" version of the Daciuk-Mihov algorithm in #12312. We could fold that into this wor

[GitHub] [lucene] gsmiller opened a new issue, #12319: DaciukMihovAutomatonBuilder#build should probably take a List instead of a Collection

2023-05-20 Thread via GitHub
gsmiller opened a new issue, #12319: URL: https://github.com/apache/lucene/issues/12319 ### Description `DaciukMihovAutomatonBuilder#build` requires sorted input but accepts a `Collection` argument. Since `Collection` generally doesn't guarantee any ordering as part of its contract,

[GitHub] [lucene] uschindler commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

2023-05-20 Thread via GitHub
uschindler commented on code in PR #12311: URL: https://github.com/apache/lucene/pull/12311#discussion_r1199625873 ## lucene/core/src/java/org/apache/lucene/util/VectorUtilProvider.java: ## @@ -0,0 +1,85 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or mo

[GitHub] [lucene] uschindler commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

2023-05-20 Thread via GitHub
uschindler commented on code in PR #12311: URL: https://github.com/apache/lucene/pull/12311#discussion_r1199625964 ## lucene/core/src/java/org/apache/lucene/util/VectorUtilProvider.java: ## @@ -0,0 +1,85 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or mo

[GitHub] [lucene] uschindler commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

2023-05-20 Thread via GitHub
uschindler commented on code in PR #12311: URL: https://github.com/apache/lucene/pull/12311#discussion_r1199626630 ## lucene/core/src/java/org/apache/lucene/util/VectorUtilProvider.java: ## @@ -0,0 +1,85 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or mo

[GitHub] [lucene] uschindler commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

2023-05-20 Thread via GitHub
uschindler commented on code in PR #12311: URL: https://github.com/apache/lucene/pull/12311#discussion_r1199630748 ## lucene/core/src/java/org/apache/lucene/util/VectorUtilProvider.java: ## @@ -0,0 +1,85 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or mo

[GitHub] [lucene] uschindler commented on a diff in pull request #12311: Integrate the Incubating Panama Vector API

2023-05-20 Thread via GitHub
uschindler commented on code in PR #12311: URL: https://github.com/apache/lucene/pull/12311#discussion_r1199630782 ## lucene/core/src/java/org/apache/lucene/util/VectorUtilProvider.java: ## @@ -0,0 +1,85 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or mo

[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

2023-05-20 Thread via GitHub
uschindler commented on PR #12311: URL: https://github.com/apache/lucene/pull/12311#issuecomment-1555945898 I fixed the warnings to give correct instructions if incubator module is missing or the default locale f*cks up.

[GitHub] [lucene] ChrisHegarty commented on pull request #12311: Integrate the Incubating Panama Vector API

2023-05-20 Thread via GitHub
ChrisHegarty commented on PR #12311: URL: https://github.com/apache/lucene/pull/12311#issuecomment-1555947416 @uschindler - Thanks for all the work, cleanup, log messages, etc. Looks great. Here's what I'm hoping to get to, might be tomorrow at this stage. 1. I'm in the process

[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

2023-05-20 Thread via GitHub
uschindler commented on PR #12311: URL: https://github.com/apache/lucene/pull/12311#issuecomment-1555962105 Hi, > 1. I'm in the process of preparing a luceneutil run - still downloading on my Linux box. Then I'll try to get some comparative numbers from the benchmark - I'm not quite

[GitHub] [lucene] kwatters commented on issue #12302: vector API integration, plan B

2023-05-20 Thread via GitHub
kwatters commented on issue #12302: URL: https://github.com/apache/lucene/issues/12302#issuecomment-1556050926 For what it's worth, I did an alternative impl of vector utils using nd4j to compute cosine similarity... And shockingly, it was not any faster. I am very much in favor of having

[GitHub] [lucene] gsmiller opened a new pull request, #12320: Add "direct to binary" option for DaciukMihovAutomatonBuilder and use it in TermInSetQuery#visit

2023-05-20 Thread via GitHub
gsmiller opened a new pull request, #12320: URL: https://github.com/apache/lucene/pull/12320 ### Description Adds the ability to directly build a binary automaton for a string union using the Daciuk-Mihov algorithm, and uses it to make the `TermInSetQuery#visit` implementation a litt

[GitHub] [lucene] gsmiller commented on pull request #12310: #12276: rename DaciukMihovAutomatonBuilder to StringsToAutomaton

2023-05-20 Thread via GitHub
gsmiller commented on PR #12310: URL: https://github.com/apache/lucene/pull/12310#issuecomment-1556069606 @mikemccand the `build` method you reference above in `DaciukMihovAutomatonBuilder` builds an automaton with code points as transition labels, but I think we need a "compiled" binary aut

[GitHub] [lucene] gsmiller opened a new issue, #12321: Can we make `DaciukMihovAutomatonBuilder` pkg-private?

2023-05-20 Thread via GitHub
gsmiller opened a new issue, #12321: URL: https://github.com/apache/lucene/issues/12321 ### Description There are some good suggestions/discussion around renaming this class in #12310, but I wonder if we should consider making it pkg-private and exposing the `build` functionality throu

[GitHub] [lucene] gsmiller commented on pull request #12310: #12276: rename DaciukMihovAutomatonBuilder to StringsToAutomaton

2023-05-20 Thread via GitHub
gsmiller commented on PR #12310: URL: https://github.com/apache/lucene/pull/12310#issuecomment-1556069972 Separately, in terms of renaming this class, I'm still in favor of it but I also wonder if we should just consider making it pkg-private and proxying the build functionality through `Au