[jira] [Commented] (LUCENE-9817) pathological test fixes

Simon Willnauer (Jira) Tue, 16 Mar 2021 03:33:06 -0700


    [ 
https://issues.apache.org/jira/browse/LUCENE-9817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17302410#comment-17302410
 ]


Simon Willnauer commented on LUCENE-9817:
-----------------------------------------

Thanks, Rob, for taking the time to do all this analysis. I do wonder if some 
tests should be @Nightly only for N-2 indices, or if each of these tests could 
pick a random list of versions to test, so that run times stay reliable even 
as more versions are released?
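
To make the second idea concrete, here is a minimal sketch of picking a bounded, seeded random subset of versions, so the cost of such a test stays flat as the list of released versions grows. The class name, method name, and version strings are hypothetical placeholders, not anything from the Lucene test framework; a real test would presumably draw its seed from the framework's randomness instead of a literal.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class VersionSampler {

  /**
   * Returns at most maxVersions entries picked at random from allVersions.
   * The same seed always yields the same selection, so a failing run can be
   * reproduced.
   */
  static List<String> sample(List<String> allVersions, int maxVersions, long seed) {
    // Shuffle a copy so the caller's list is left untouched.
    List<String> shuffled = new ArrayList<>(allVersions);
    Collections.shuffle(shuffled, new Random(seed));
    int count = Math.min(maxVersions, shuffled.size());
    return new ArrayList<>(shuffled.subList(0, count));
  }

  public static void main(String[] args) {
    // Placeholder version names, for illustration only.
    List<String> all = List.of("v1", "v2", "v3", "v4", "v5", "v6");
    // Only three versions are exercised per run, however long the list gets.
    System.out.println(sample(all, 3, 42L));
  }
}
```

The trade-off is coverage per run: any single run checks only a few versions, and the full list is only covered across many runs with different seeds.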

> pathological test fixes
> -----------------------
>
>                 Key: LUCENE-9817
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9817
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Robert Muir
>            Priority: Major
>         Attachments: LUCENE-9817.patch, LUCENE-9817.patch, LUCENE-9817.patch
>
>
> There are now 13,000+ tests in Lucene, and if you don't have dozens of cores 
> a full run is slow (around 7 minutes here, with everything tuned as fast 
> as I can get it, running on tmpfs). 
> It is tricky to keep the situation sustainable: so many tests usually 
> take just a few seconds each, but they all add up. To put it in perspective, 
> if all 13,000 tests took only 1s each, that's about 3.6 hours of CPU time.
> From my inspection, there are a few cases of inefficiency:
> * tests with bad random parameters: they might normally be semi-well-behaved, 
> but "rarely" take 30 seconds. That's maybe like a 1% chance but keep in mind 
> 1% equates to 130 wild-west tests every run.
> * tests spinning up too many threads and indexing too many docs 
> unnecessarily: there might literally be thousands of these, so that's a hard 
> problem to fix... and developers love to use lots of threads and docs in 
> tests.
> * tests just being inefficient: stuff like creating indexes in setup/teardown 
> when they have many methods that may not even use them (hey, why did 
> testEqualsHashcode take 30 seconds, what is it doing?)
> I only worked on the first case here. If I fixed anything involving the other 
> two, it was just because I noticed it while I was there. I temporarily 
> overrode methods like LuceneTestCase.rarely(), atLeast(), and so on to 
> present more pathological/worst-case conditions and tried to address them all.
> So here's a patch to give ~80 seconds of CPU time in tests back. YMMV, maybe 
> it helps you more if you are actually using hard disks and stuff!
> Fixing the other issues here will require some more creativity/work; I will 
> follow up.
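
The arithmetic and the worst-case technique in the description above can be sketched roughly as follows. The `rarely` and `atLeast` methods here are hypothetical stand-ins written for illustration, not LuceneTestCase's implementations; the 1% probability and the "up to twice the minimum" bound are assumptions, as is the `worstCase` flag.

```java
import java.util.Random;

public class TestBudget {

  /** Total CPU time in hours if every test takes secondsPerTest. */
  static double cpuHours(int tests, double secondsPerTest) {
    return tests * secondsPerTest / 3600.0;
  }

  /** Expected number of tests that hit a slow path in a single run. */
  static double expectedSlowTests(int tests, double probability) {
    return tests * probability;
  }

  /**
   * Hypothetical stand-in for a "rarely" helper (assumed 1% chance).
   * With worstCase set it always fires, forcing the slow path.
   */
  static boolean rarely(Random random, boolean worstCase) {
    return worstCase || random.nextInt(100) == 0;
  }

  /**
   * Hypothetical stand-in for an "atLeast" helper: returns a value in
   * [min, 2 * min] (assumed bound). With worstCase set it always returns
   * the upper bound.
   */
  static int atLeast(Random random, int min, boolean worstCase) {
    int max = min * 2;
    return worstCase ? max : min + random.nextInt(max - min + 1);
  }

  public static void main(String[] args) {
    System.out.printf("cpu hours: %.2f%n", cpuHours(13000, 1.0));
    System.out.printf("expected slow tests: %.0f%n", expectedSlowTests(13000, 0.01));
    Random random = new Random(0L);
    System.out.println("worst-case atLeast(10): " + atLeast(random, 10, true));
  }
}
```

Pinning the random helpers to their extremes makes every "rare" branch run on every invocation, so tests that are only occasionally slow show up in a single timing run.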



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
