[GitHub] [lucene] msokolov commented on a diff in pull request #11860: GITHUB-11830 Better optimize storage for vector connections

2022-11-20 Thread GitBox
msokolov commented on code in PR #11860: URL: https://github.com/apache/lucene/pull/11860#discussion_r1027282652 ## lucene/backward-codecs/src/java/org/apache/lucene/backward_codecs/lucene94/ExpandingVectorValues.java: ## @@ -0,0 +1,49 @@ +/* + * Licensed to the Apache Software

[GitHub] [lucene] kotman12 commented on pull request #11955: Remove synchronization from OpenNLP integration and add thread-safety tests(checkRandomData)

2022-11-20 Thread GitBox
kotman12 commented on PR #11955: URL: https://github.com/apache/lucene/pull/11955#issuecomment-1321155887 Interesting .. this synchronization might not actually be doing anything after all. But I still think there are some thread-unsafe calls here regarding the integration as a whole which

[GitHub] [lucene] kotman12 commented on pull request #11955: Remove synchronization from OpenNLP integration and add thread-safety tests(checkRandomData)

2022-11-20 Thread GitBox
kotman12 commented on PR #11955: URL: https://github.com/apache/lucene/pull/11955#issuecomment-1321159282 > See background on `java-user@`: https://lists.apache.org/thread/okm7c49r7rv53z7v392v2v4pbv6m1pnw > > The `BaseTokenStreamTestCase.checkRandomData` has proven itself effective a

[GitHub] [lucene] rmuir commented on pull request #11955: Remove synchronization from OpenNLP integration and add thread-safety tests(checkRandomData)

2022-11-20 Thread GitBox
rmuir commented on PR #11955: URL: https://github.com/apache/lucene/pull/11955#issuecomment-1321166306 I think there is no contention today due to the way the factories work. There is only one thread/tokenstream for *Op. But I am happy to add the random tests and remove the code smell? It m

[GitHub] [lucene] rmuir commented on pull request #11955: Remove synchronization from OpenNLP integration and add thread-safety tests(checkRandomData)

2022-11-20 Thread GitBox
rmuir commented on PR #11955: URL: https://github.com/apache/lucene/pull/11955#issuecomment-1321167684 > Does this library also check for race conditions that can arise between ResourceLoaderAware::inform vs TokenStream creation and processing? I know it may be out of the scope of this chan

[GitHub] [lucene] rmuir commented on pull request #11955: Remove synchronization from OpenNLP integration and add thread-safety tests(checkRandomData)

2022-11-20 Thread GitBox
rmuir commented on PR #11955: URL: https://github.com/apache/lucene/pull/11955#issuecomment-1321170185 > Does this library also check for race conditions that can arise between ResourceLoaderAware::inform vs TokenStream creation and processing? I know it may be out of the scope of this chan

[GitHub] [lucene] rmuir commented on pull request #11955: Remove synchronization from OpenNLP integration and add thread-safety tests(checkRandomData)

2022-11-20 Thread GitBox
rmuir commented on PR #11955: URL: https://github.com/apache/lucene/pull/11955#issuecomment-1321171664 and looking more, I think the answer is "no there isn't any race". Calling inform() is basically like part of the initialization of the thing. The factory isn't "done" being created until

[GitHub] [lucene] rmuir commented on pull request #11955: Remove synchronization from OpenNLP integration and add thread-safety tests(checkRandomData)

2022-11-20 Thread GitBox
rmuir commented on PR #11955: URL: https://github.com/apache/lucene/pull/11955#issuecomment-1321172760 As far as better stress-testing factories in its base class, we already stress-test using multiple tokenstreams (threads) across a single factory here. We could add stuff to the Bas

[GitHub] [lucene] kotman12 commented on pull request #11955: Remove synchronization from OpenNLP integration and add thread-safety tests(checkRandomData)

2022-11-20 Thread GitBox
kotman12 commented on PR #11955: URL: https://github.com/apache/lucene/pull/11955#issuecomment-1321174402 I will take a look at the test locally, thanks. > There is only one thread/tokenstream for *Op. I think we are talking about different things. The mutable members within *

[GitHub] [lucene] kotman12 commented on pull request #11955: Remove synchronization from OpenNLP integration and add thread-safety tests(checkRandomData)

2022-11-20 Thread GitBox
kotman12 commented on PR #11955: URL: https://github.com/apache/lucene/pull/11955#issuecomment-1321176867 > I will take a look at the test locally, thanks. > > > There is only one thread/tokenstream for *Op. > > I think we are talking about different things. The mutable members

[GitHub] [lucene] kotman12 commented on pull request #11955: Remove synchronization from OpenNLP integration and add thread-safety tests(checkRandomData)

2022-11-20 Thread GitBox
kotman12 commented on PR #11955: URL: https://github.com/apache/lucene/pull/11955#issuecomment-1321181537 > As far as better stress-testing factories in its base class, we already stress-test using multiple tokenstreams (threads) across a single factory here. > > We could add stuff to

[GitHub] [lucene] rmuir commented on pull request #11955: Remove synchronization from OpenNLP integration and add thread-safety tests(checkRandomData)

2022-11-20 Thread GitBox
rmuir commented on PR #11955: URL: https://github.com/apache/lucene/pull/11955#issuecomment-1321189244 ok, good, I think we are on the same page. sorry, i dont know anything about opennlp and I'm not familiar with their codebase at all. -- This is an automated message from the Apache Git

[GitHub] [lucene] rmuir commented on pull request #11955: Remove synchronization from OpenNLP integration and add thread-safety tests(checkRandomData)

2022-11-20 Thread GitBox
rmuir commented on PR #11955: URL: https://github.com/apache/lucene/pull/11955#issuecomment-1321193321 and i don't mind doing the work here on the testing honestly. or we can do it in another PR. basically I think we should engage the BaseAnalysisFactoryTestCase with simple basic tests l

[GitHub] [lucene] dweiss closed issue #10709: gradle precommit sometimes fails with "IOException: stream closed" from javadoc in nightly benchmarks [LUCENE-9670]

2022-11-20 Thread GitBox
dweiss closed issue #10709: gradle precommit sometimes fails with "IOException: stream closed" from javadoc in nightly benchmarks [LUCENE-9670] URL: https://github.com/apache/lucene/issues/10709 -- This is an automated message from the Apache Git Service. To respond to the message, please log

[GitHub] [lucene] dweiss commented on issue #10236: look into adding -XX:ActiveProcessorCount=1 to parallel build jvms [LUCENE-9196]

2022-11-20 Thread GitBox
dweiss commented on issue #10236: URL: https://github.com/apache/lucene/issues/10236#issuecomment-1321204164 Implemented by #11927 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific co

[GitHub] [lucene] dweiss closed issue #10236: look into adding -XX:ActiveProcessorCount=1 to parallel build jvms [LUCENE-9196]

2022-11-20 Thread GitBox
dweiss closed issue #10236: look into adding -XX:ActiveProcessorCount=1 to parallel build jvms [LUCENE-9196] URL: https://github.com/apache/lucene/issues/10236 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [lucene] rmuir opened a new issue, #11957: false warning around 'gitStatus' on newish macos X

2022-11-20 Thread GitBox
rmuir opened a new issue, #11957: URL: https://github.com/apache/lucene/issues/11957 ### Description When I run builds on mac, I see: ``` > Task :gitStatus Cannot run program "/usr/bin/git" (in directory "/usr/bin"): error=2, No such file or directory ``` But the er

[GitHub] [lucene] dweiss commented on issue #11957: false warning around 'gitStatus' on newish macos X

2022-11-20 Thread GitBox
dweiss commented on issue #11957: URL: https://github.com/apache/lucene/issues/11957#issuecomment-1321223132 Perhaps we should upgrade jgit first? scriptDepVersions.gradle, the latest one is 6.3.0.202209071007, https://repo1.maven.org/maven2/org/eclipse/jgit/org.eclipse.jgit/6.3.0.202209071

[GitHub] [lucene] dweiss commented on issue #11957: false warning around 'gitStatus' on newish macos X

2022-11-20 Thread GitBox
dweiss commented on issue #11957: URL: https://github.com/apache/lucene/issues/11957#issuecomment-1321226220 I looked at the source and the only reason it tries to locate system git is to find out where its config files are. It's actually configurable via: ``` /** * The e

[GitHub] [lucene] dweiss commented on issue #11957: false warning around 'gitStatus' on newish macos X

2022-11-20 Thread GitBox
dweiss commented on issue #11957: URL: https://github.com/apache/lucene/issues/11957#issuecomment-1321229903 Yeah... it's absolutely terrible that jgit blindly forks jgit off the path. I just made the Lucene build open a file editor by renaming it to git.exe... I think we should just

[GitHub] [lucene] rmuir commented on issue #11957: false warning around 'gitStatus' on newish macos X

2022-11-20 Thread GitBox
rmuir commented on issue #11957: URL: https://github.com/apache/lucene/issues/11957#issuecomment-1321248831 good find, thank you -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comme

[GitHub] [lucene] mdmarshmallow opened a new pull request, #11958: GITHUB-11868: Add FilterIndexInput and FilterIndexOutput wrapper classes

2022-11-20 Thread GitBox
mdmarshmallow opened a new pull request, #11958: URL: https://github.com/apache/lucene/pull/11958 ### Description Added a new `FilterIndexInput` and `FilterIndexOutput` class to handle delegate `IndexOutput` and `IndexInput` classes. I also changed some of the current subclasse

[GitHub] [lucene] rmuir commented on issue #11957: false warning around 'gitStatus' on newish macos X

2022-11-20 Thread GitBox
rmuir commented on issue #11957: URL: https://github.com/apache/lucene/issues/11957#issuecomment-1321256630 i tried it out, `GIT_CONFIG_NOSYSTEM=1 ./gradlew gitStatus` no longer prints the confusing error. -- This is an automated message from the Apache Git Service. To respond to the mess

[GitHub] [lucene] stefanvodita commented on pull request #11815: Support deletions in rearrange (#11814)

2022-11-20 Thread GitBox
stefanvodita commented on PR #11815: URL: https://github.com/apache/lucene/pull/11815#issuecomment-1321267832 Thank you for the ideas @zhaih , I’ve incorporated all of them. Your suggestion of using `ByteRef` in the keyset led to me finding a bug in the tests. The index creation methods did

[GitHub] [lucene] zhaih commented on a diff in pull request #11815: Support deletions in rearrange (#11814)

2022-11-20 Thread GitBox
zhaih commented on code in PR #11815: URL: https://github.com/apache/lucene/pull/11815#discussion_r1027612271 ## lucene/misc/src/java/org/apache/lucene/misc/index/BinaryDocValueSelector.java: ## @@ -30,52 +31,94 @@ import org.apache.lucene.store.Directory; import org.apache.lu