[GitHub] [lucene] matriv commented on pull request #11749: Remove usages of System.currentTimeMillis() from tests

2022-09-06 Thread GitBox
matriv commented on PR #11749: URL: https://github.com/apache/lucene/pull/11749#issuecomment-1238934086 Thx for reviewing and helping tuning the iterations @rmuir ! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the U

[GitHub] [lucene] jtibshirani commented on pull request #11756: LUCENE-10577: Remove LeafReader#searchNearestVectorsExhaustively

2022-09-06 Thread GitBox
jtibshirani commented on PR #11756: URL: https://github.com/apache/lucene/pull/11756#issuecomment-1238843697 This cuts down on the API surface area for LeafReader results in less logic. I think what we tried to do initially was to push the exhaustive search down to the codec level, so that

[GitHub] [lucene] jtibshirani opened a new pull request, #11756: LUCENE-10577: Remove LeafReader#searchNearestVectorsExhaustively

2022-09-06 Thread GitBox
jtibshirani opened a new pull request, #11756: URL: https://github.com/apache/lucene/pull/11756 This PR removes the recently added function on LeafReader to exhaustively search through vectors, plus the helper function KnnVectorsReader#searchExhaustively. Instead it performs the exact s

[GitHub] [lucene-solr] risdenk opened a new pull request, #2669: SOLR-16324: Upgrade commons-configuration2 to 2.8.0 and commons-text to 1.9

2022-09-06 Thread GitBox
risdenk opened a new pull request, #2669: URL: https://github.com/apache/lucene-solr/pull/2669 Backport of https://issues.apache.org/jira/browse/SOLR-16324 to branch_8_11 * Upgrade commons-configuration2 to 2.8.0 due to CVE-2022-33980 * Upgrade commons-text to 1.9 since it gets upgr

[GitHub] [lucene] mayya-sharipova commented on pull request #11743: LUCENE-10592 Better estimate memory for HNSW graph

2022-09-06 Thread GitBox
mayya-sharipova commented on PR #11743: URL: https://github.com/apache/lucene/pull/11743#issuecomment-1238777470 @msokolov Thanks for your feedback, and good ideas, we can experiment with them. We've discussed within our team (including @jpountz and @jtibshirani) and decided that w

[GitHub] [lucene] rmuir opened a new issue, #11755: TestIndexWriterOnDiskFull.testAddDocumentOnDiskFull failure

2022-09-06 Thread GitBox
rmuir opened a new issue, #11755: URL: https://github.com/apache/lucene/issues/11755 ### Description From jenkins: ``` > Task :lucene:core:test org.apache.lucene.index.TestIndexWriterOnDiskFull > testAddDocumentOnDiskFull FAILED java.lang.IllegalStateException: thi

[GitHub] [lucene] rmuir commented on pull request #11749: Remove usages of System.currentTimeMillis() from tests

2022-09-06 Thread GitBox
rmuir commented on PR #11749: URL: https://github.com/apache/lucene/pull/11749#issuecomment-1238699562 Thank you for the work here on these tests @matriv ! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [lucene] rmuir closed issue #11459: Remove uses of wall-clock time in codebase [LUCENE-10423]

2022-09-06 Thread GitBox
rmuir closed issue #11459: Remove uses of wall-clock time in codebase [LUCENE-10423] URL: https://github.com/apache/lucene/issues/11459 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific c

[GitHub] [lucene] rmuir merged pull request #11749: Remove usages of System.currentTimeMillis() from tests

2022-09-06 Thread GitBox
rmuir merged PR #11749: URL: https://github.com/apache/lucene/pull/11749 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apach

[GitHub] [lucene] rmuir commented on pull request #11749: Remove usages of System.currentTimeMillis() from tests

2022-09-06 Thread GitBox
rmuir commented on PR #11749: URL: https://github.com/apache/lucene/pull/11749#issuecomment-1238691660 Here's my numbers for comparison. I think we are ok here for nightly: ``` > Task :lucene:core:wipeTaskTemp The slowest tests (exceeding 500 ms) during this run: 2560.31s Test2B

[GitHub] [lucene] zhaih commented on pull request #1068: LUCENE-10674: Update subiterators when BitSetConjDISI exhausts

2022-09-06 Thread GitBox
zhaih commented on PR #1068: URL: https://github.com/apache/lucene/pull/1068#issuecomment-1238647657 Thanks, looks reasonable to me, please add an entry to CHANGES.txt! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use t

[GitHub] [lucene] matriv commented on pull request #11749: Remove usages of System.currentTimeMillis() from tests

2022-09-06 Thread GitBox
matriv commented on PR #11749: URL: https://github.com/apache/lucene/pull/11749#issuecomment-1238611322 With the last commit, locally I get: ``` :lucene:core:test (SUCCESS): 5584 test(s), 60 skipped The slowest tests (exceeding 500 ms) during this run: 865.31s Test2BPostings.tes

[GitHub] [lucene] matriv commented on pull request #11749: Remove usages of System.currentTimeMillis() from tests

2022-09-06 Thread GitBox
matriv commented on PR #11749: URL: https://github.com/apache/lucene/pull/11749#issuecomment-1238583388 Thank you, I'm also running it locally with `-Dtests.nightly=true` and I'm going to significantly decrease those limits. i.e. 3 -> 1000 or even less. Once I have something that l

[GitHub] [lucene] rmuir commented on pull request #11749: Remove usages of System.currentTimeMillis() from tests

2022-09-06 Thread GitBox
rmuir commented on PR #11749: URL: https://github.com/apache/lucene/pull/11749#issuecomment-1238540078 The `-Dtests.nightly` run has some tests that timeout (theres a 7200 second hard limit to a test class) or take hours. At a glance, most of the trouble seems to be subclasses of the `Threa

[GitHub] [lucene] gsmiller commented on issue #11547: IntersectIterators is not necessary under matchAll case in Facet [LUCENE-10511]

2022-09-06 Thread GitBox
gsmiller commented on issue #11547: URL: https://github.com/apache/lucene/issues/11547#issuecomment-1238537126 Interesting observation @LuXugang. We could only "know" ahead of time that multiple iterators are "the same" though when they're effectively "all" docs in a segment right? I can't

[GitHub] [lucene] jpountz commented on issue #11696: precompute the max level in LogMergePolicy [LUCENE-10660]

2022-09-06 Thread GitBox
jpountz commented on issue #11696: URL: https://github.com/apache/lucene/issues/11696#issuecomment-1238341620 I just checked, it's here: https://github.com/apache/lucene/commit/3f6dbe8b55eb48066a800f706c371fd6f56afab0 -- This is an automated message from the Apache Git Service. To respond

[GitHub] [lucene] rmuir commented on pull request #11749: Remove usages of System.currentTimeMillis() from tests

2022-09-06 Thread GitBox
rmuir commented on PR #11749: URL: https://github.com/apache/lucene/pull/11749#issuecomment-1238331021 This build looks good: ``` The slowest tests (exceeding 500 ms) during this run: 6.93s TestStressIndexing.testStressIndexAndSearching (:lucene:core) 6.84s TestTessellator.t

[GitHub] [lucene] msokolov commented on pull request #11743: LUCENE-10592 Better estimate memory for HNSW graph

2022-09-06 Thread GitBox
msokolov commented on PR #11743: URL: https://github.com/apache/lucene/pull/11743#issuecomment-1238330802 OK, thanks for the reminder of the arguments for moving the graph creation to index time. > May be, we can come up with some sophisticated solution, writing vector values in batc

[GitHub] [lucene] rmuir commented on pull request #11749: Remove usages of System.currentTimeMillis() from tests

2022-09-06 Thread GitBox
rmuir commented on PR #11749: URL: https://github.com/apache/lucene/pull/11749#issuecomment-1238304029 @thanks matriv ! We can inspect the log again for the next build. Some of these tests use threads, so they will run a lot slower in CI or small computers like my 2-core machi

[GitHub] [lucene] matriv commented on pull request #11749: Remove usages of System.currentTimeMillis() from tests

2022-09-06 Thread GitBox
matriv commented on PR #11749: URL: https://github.com/apache/lucene/pull/11749#issuecomment-1238296959 thx! pushed a toned down version, let's see. (my machine is quite poweful and they didn't show up in the slow tests) -- This is an automated message from the Apache Git Service. To resp

[GitHub] [lucene] rmuir commented on pull request #11749: Remove usages of System.currentTimeMillis() from tests

2022-09-06 Thread GitBox
rmuir commented on PR #11749: URL: https://github.com/apache/lucene/pull/11749#issuecomment-1238285805 > Mentioned, also, so that we can keep an eye on those tests, in case, some turn out to be rather slow, maybe some further tuning of the max iterations is needed. I see two suspicio

[GitHub] [lucene] rmuir opened a new issue, #11754: TestBoolean2.testRandomQueries fails in CI due to eating up heap space

2022-09-06 Thread GitBox
rmuir opened a new issue, #11754: URL: https://github.com/apache/lucene/issues/11754 ### Description Jenkins failure (sorry I don't have full build log, but the failure is reproducible, see below): ``` [JENKINS] Lucene-main-Linux (64bit/jdk-17.0.3) - Build # 36812 - Unstable

[GitHub] [lucene] matriv commented on pull request #11749: Remove usages of System.currentTimeMillis() from tests

2022-09-06 Thread GitBox
matriv commented on PR #11749: URL: https://github.com/apache/lucene/pull/11749#issuecomment-1238258581 Mentioned, also, so that we can keep an eye on those tests, in case some turn out to be rather slow, maybe some further tuning of the max iterations is needed. -- This is an automated

[GitHub] [lucene] rmuir commented on pull request #11749: Remove usages of System.currentTimeMillis() from tests

2022-09-06 Thread GitBox
rmuir commented on PR #11749: URL: https://github.com/apache/lucene/pull/11749#issuecomment-1238256051 thanks @matriv for tuning the iterations. Much better than a random guess, which could lead to timeouts in CI servers, etc. I restarted the tests here. -- This is an automated message fr

[GitHub] [lucene] mayya-sharipova commented on pull request #11743: LUCENE-10592 Better estimate memory for HNSW graph

2022-09-06 Thread GitBox
mayya-sharipova commented on PR #11743: URL: https://github.com/apache/lucene/pull/11743#issuecomment-1238251543 @jpountz > could we buffer vectors on disk with the approach of building the graph during indexing? We explored this, but could not find any way to do that. To build

[GitHub] [lucene] matriv commented on pull request #11749: Remove usages of System.currentTimeMillis() from tests

2022-09-06 Thread GitBox
matriv commented on PR #11749: URL: https://github.com/apache/lucene/pull/11749#issuecomment-1238246924 Thx @rmuir! Pushed the fix. Also I forgot to mention that for every replacement of time based by counter based, I added a counter and run the test on my machine multiple times and

[GitHub] [lucene] matriv commented on a diff in pull request #11749: Remove usages of System.currentTimeMillis() from tests

2022-09-06 Thread GitBox
matriv commented on code in PR #11749: URL: https://github.com/apache/lucene/pull/11749#discussion_r963785882 ## lucene/classification/src/test/org/apache/lucene/classification/Test20NewsgroupsClassification.java: ## @@ -123,13 +124,13 @@ public void test20Newsgroups() throws Ex

[GitHub] [lucene] rmuir commented on pull request #11749: Remove usages of System.currentTimeMillis() from tests

2022-09-06 Thread GitBox
rmuir commented on PR #11749: URL: https://github.com/apache/lucene/pull/11749#issuecomment-1238236097 Thanks for doing this work, this change looks great! avoiding wall-clock time should make tests more reproducible when they fail. also, this change fixes tests that were configured to

[GitHub] [lucene] mayya-sharipova commented on pull request #11743: LUCENE-10592 Better estimate memory for HNSW graph

2022-09-06 Thread GitBox
mayya-sharipova commented on PR #11743: URL: https://github.com/apache/lucene/pull/11743#issuecomment-1238235494 @msokolov > Although ... the RAM needed for the graph was always required, even when building the graph during flush, it just wasn't accounted for I think. I suppose a possib

[GitHub] [lucene] rmuir commented on a diff in pull request #11749: Remove usages of System.currentTimeMillis() from tests

2022-09-06 Thread GitBox
rmuir commented on code in PR #11749: URL: https://github.com/apache/lucene/pull/11749#discussion_r963782724 ## lucene/classification/src/test/org/apache/lucene/classification/Test20NewsgroupsClassification.java: ## @@ -123,13 +124,13 @@ public void test20Newsgroups() throws Exc

[GitHub] [lucene] matriv commented on issue #11459: Remove uses of wall-clock time in codebase [LUCENE-10423]

2022-09-06 Thread GitBox
matriv commented on issue #11459: URL: https://github.com/apache/lucene/issues/11459#issuecomment-1237784068 @rmuir When you have time, please take a look at the PR and let me know, Thank you! -- This is an automated message from the Apache Git Service. To respond to the message, please