Re: JIT Shard leader design/proposal

2022-10-17 Thread Mark Miller
Determining the leader is extremely cheap in the general case. It’s when
you have to exchange data (generally when that exchange involves
replication) that’s expensive. Or when you spin up 500 threads for 500
cheap operations. For the common use case, a very basic and long-needed
feature in that regard is simple management. Rather than flood the system
at once with 500 replications, there needs to be a gate on how many
expensive operations like that can occur at once. Same with spinning up 500
threads. Maybe basic improvements like that won’t be the ideal end game for
a system that wants 100,000 lazy cores where most of them are rarely
active, but there is always going to be lots of tension trying to solve for
the typical use and a system like that.
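
A minimal sketch of the kind of gate described above, assuming a plain
counting semaphore is enough. ExpensiveOpGate, run and demo are hypothetical
names for illustration, not Solr classes, and the fake "replication" is just
a short sleep.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; none of these names exist in Solr.
public class ExpensiveOpGate {
  private final Semaphore permits;

  public ExpensiveOpGate(int maxConcurrent) {
    this.permits = new Semaphore(maxConcurrent, true); // fair: first come, first served
  }

  // Blocks until a permit is free, runs the operation, then releases the permit.
  public <T> T run(Callable<T> op) throws Exception {
    permits.acquire();
    try {
      return op.call();
    } finally {
      permits.release();
    }
  }

  // Pushes `tasks` fake "replications" through a gate of size `limit`
  // and returns the highest number that were ever running at once.
  public static int demo(int tasks, int limit) {
    ExpensiveOpGate gate = new ExpensiveOpGate(limit);
    AtomicInteger running = new AtomicInteger();
    AtomicInteger peak = new AtomicInteger();
    Callable<Integer> fakeReplication = () -> {
      int now = running.incrementAndGet();
      peak.accumulateAndGet(now, Math::max);
      Thread.sleep(2); // stand-in for the expensive data exchange
      running.decrementAndGet();
      return now;
    };
    Callable<Integer> gated = () -> gate.run(fakeReplication);
    ExecutorService pool = Executors.newFixedThreadPool(16);
    try {
      List<Future<Integer>> futures = new ArrayList<>();
      for (int i = 0; i < tasks; i++) {
        futures.add(pool.submit(gated));
      }
      for (Future<Integer> f : futures) {
        f.get(); // wait for completion and surface any failure
      }
    } catch (Exception e) {
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
    return peak.get();
  }

  public static void main(String[] args) {
    System.out.println("peak concurrent operations: " + demo(500, 4));
  }
}
```

The gate only limits how many expensive operations run at once; the fixed
pool of 16 threads is a separate, equally hypothetical choice standing in for
"don't spin up 500 threads".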


Re: JIT Shard leader design/proposal

2022-10-17 Thread David Smiley
I'm trying to understand the needs of a "typical case" better with regard
to this proposed design and how it would be negatively impacted.  Maybe not
at all for NRT as any up-to-date replica can be cheaply made the leader, so
it doesn't matter when.  A TLOG non-leader has to replay (uses a bunch of
threads on the node).  In the proposal, this is work that would be
completely avoided if the node is unavailable for a duration of time short
enough such that there is no indexing.  In the so-called "typical case", I
suppose this could be seen as doing work to prepare ourselves to be able to
index docs right away if one comes in during this period so that we can
optimize for indexing availability / performance instead?  I think this
could easily be a configurable option such that a TLOG replica would
observe the non-availability in its leader so that it might take charge and
be leader eagerly.

> Maybe basic improvements like that

There are already basic node limits for replaying the update log, from what
I see.  replayUpdateThreads mainly.  It defaults to the number of CPU
threads.  Perhaps in systems you see, it's configured to 500?  Based on my
recollection of some replay challenges with document versions & locks that
Dat & I worked on, I could see how increasing it would be helpful.  There
is no cap on the number of replays happening at once, which I could see us
wanting to add in order to speed up how soon a replica that is already
replaying could become ready.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Mon, Oct 17, 2022 at 3:46 AM Mark Miller  wrote:

> Determining the leader is extremely cheap in the general case. It’s when
> you have to exchange data (generally when that exchange involves
> replication) that’s expensive. Or when you spin up 500 threads for 500
> cheap operations. For the common use case, a very basic and long-needed
> feature in that regard is simple management. Rather than flood the system
> at once with 500 replications, there needs to be a gate on how many
> expensive operations like that can occur at once. Same with spinning up 500
> threads. Maybe basic improvements like that won’t be the ideal end game for
> a system that wants 100,000 lazy cores where most of them are rarely
> active, but there is always going to be lots of tension trying to solve for
> the typical use and a system like that.
>


Re: Outreachy Internships with Apache

2022-10-17 Thread Joshua Ouma
I have been unwell the past few days, but now I am feeling much better and
ready to continue with my contribution.

On Sun, Oct 9, 2022 at 7:43 AM David Smiley  wrote:

> Welcome Joshua!  I'm glad you are interested.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Sat, Oct 8, 2022 at 1:41 PM Joshua Ouma  wrote:
>
> > Hello All,
> > I am Joshua Ouma, an Outreachy internship applicant. It's a pleasure to be
> > part of this community, and I am looking forward to great interactions as I
> > contribute to Solr and hopefully work on "SOLR-11872 Refactor test infra to
> > work with a managed SolrClient; ditch TestHarness".
> >
> > Regards,
> >
> > Joshua Ouma
> >
>


Re: [jira] [Created] (SOLR-16455) Migrate Jira to Github Issues and Github Projects, and migrate mailing lists to Github Discussions

2022-10-17 Thread Houston Putman
I'm a big +1 on this idea, just like I was for Lucene's migration.

Also I think that we could very much mooch off of the monumental amounts of
hard work that Tomoko and Mike did for Lucene.

There would certainly still be manual work, and changes to their script
needed, but I don't think it would be as back-breaking of a task.

- Houston

On Fri, Oct 14, 2022 at 1:07 AM Noble Paul  wrote:

> I agree that JIRA is one extra step that is not adding a lot of value.
> Github issues are definitely better
>
> On Fri, Oct 14, 2022 at 3:04 PM David Smiley  wrote:
>
> > Sharing for visibility.
> >
> > ~ David Smiley
> > Apache Lucene/Solr Search Developer
> > http://www.linkedin.com/in/davidwsmiley
> >
> >
> > -- Forwarded message -
> > From: Jeb Nix (Jira) 
> > Date: Mon, Oct 10, 2022 at 7:11 PM
> > Subject: [jira] [Created] (SOLR-16455) Migrate Jira to Github Issues and
> > Github Projects, and migrate mailing lists to Github Discussions
> > To: 
> >
> >
> > Jeb Nix created SOLR-16455:
> > --
> >
> >  Summary: Migrate Jira to Github Issues and Github Projects,
> > and migrate mailing lists to Github Discussions
> >  Key: SOLR-16455
> >  URL: https://issues.apache.org/jira/browse/SOLR-16455
> >  Project: Solr
> >   Issue Type: Wish
> >   Security Level: Public (Default Security Level. Issues are Public)
> >   Components: github
> > Reporter: Jeb Nix
> >
> >
> > GitHub is where people are when they look for Solr (or basically any
> > project). Most of the modern projects that started with Jira and
> > mailing lists have migrated to GitHub in the last few years. Lucene did
> > that just now for its issues, which has allowed me to explore many more of
> > them. GitHub works great, and many think that it works even better
> > (I think there is no doubt that Discussions work better than
> > mailing lists).
> >
> > I suggest here a pretty heavy move, one that personally will allow me to
> > start participating in Solr's community (since I really don't like the
> > mailing lists or Jira), and I think that there are many more like me out
> > there. In my opinion, when the issues are managed on GitHub, it is much
> > simpler to collaborate, and they will get wider exposure since developers
> > are spending time on GitHub anyway (whether for their own projects or for
> > looking at the actual source code). It is also important to mention that
> > it is pretty cumbersome for a new contributor who wants to add something
> > to Solr to talk about it via mail, then translate that into a Jira issue,
> > and only after that submit a PR on GitHub, i.e. 3 different systems for
> > each process.
> >
> > Actually, I thought such a great move (for me at least) would never happen
> > in Solr in the next few years, since I didn't think that the community saw
> > and understood the many advantages yet. But now that the Lucene guys did
> > this, I believe that it is possible for Solr too.
> >
> >
> >
> > --
> > This message was sent by Atlassian Jira
> > (v8.20.10#820010)
> >
> > -
> > To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
> > For additional commands, e-mail: issues-h...@solr.apache.org
> >
>
>
> --
> -
> Noble Paul
>


Re: [jira] [Created] (SOLR-16455) Migrate Jira to Github Issues and Github Projects, and migrate mailing lists to Github Discussions

2022-10-17 Thread David Smiley
+1 to migrate.

Yeah.  Maybe Tomoko could validate the steps required?  (CC'ed)  Jeb listed
them in JIRA; the steps/mechanics can be discussed there while we leave
this thread as voting on the major decision.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Mon, Oct 17, 2022 at 10:12 AM Houston Putman  wrote:

> I'm a big +1 on this idea, just like I was for Lucene's migration.
>
> Also I think that we could very much mooch off of the monumental amounts of
> hard work that Tomoko and Mike did for Lucene.
>
> There would certainly still be manual work, and changes to their script
> needed, but I don't think it would be as back-breaking of a task.
>
> - Houston


Re: [JENKINS] Solr » Solr-Check-9.x - Build # 2401 - Unstable!

2022-10-17 Thread Kevin Risden
Tracking here: https://issues.apache.org/jira/browse/SOLR-16467
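
A minimal sketch of the locale problem described in the quoted message
below, assuming the failure comes from locale-sensitive digit formatting
(the thread does not spell out the exact cause). LocaleFormatDemo, formatWith
and formatVersion are hypothetical names, not semver4j or Solr API, and the
"ar-SA-u-nu-arab" tag is just an example of a locale that may request
non-ASCII digits.

```java
import java.util.Locale;

// Hypothetical illustration only; not code from semver4j or Solr.
public class LocaleFormatDemo {

  // Locale-sensitive: %d renders digits according to the given locale.
  public static String formatWith(Locale locale, int major, int minor, int patch) {
    return String.format(locale, "%d.%d.%d", major, minor, patch);
  }

  // Locale-independent: Locale.ROOT always renders ASCII digits.
  public static String formatVersion(int major, int minor, int patch) {
    return String.format(Locale.ROOT, "%d.%d.%d", major, minor, patch);
  }

  public static void main(String[] args) {
    Locale example = Locale.forLanguageTag("ar-SA-u-nu-arab");
    // Depending on the JDK's locale data, this may not be plain ASCII.
    System.out.println(formatWith(example, 9, 1, 0));
    // This one is stable regardless of the JVM's default locale.
    System.out.println(formatVersion(9, 1, 0));
  }
}
```

String.format(String, Object...) without a Locale argument behaves like
formatWith called with the JVM's default formatting locale, which is why a
randomized test locale can change the result.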

Kevin Risden


On Sun, Oct 16, 2022 at 7:45 PM Kevin Risden  wrote:

> This one reproduces. The issue is that the new semver4j library doesn't
> handle strings well, i.e. no forbiddenapis, so String.format uses the
> default locale. I'm looking at ways to fix it.
>
> Kevin Risden
>
>
> On Sun, Oct 16, 2022 at 3:28 PM Apache Jenkins Server <
> jenk...@builds.apache.org> wrote:
>
>> Build: https://ci-builds.apache.org/job/Solr/job/Solr-Check-9.x/2401/
>>
>> 2 tests failed.
>> FAILED:  org.apache.solr.cloud.PackageManagerCLITest.testPackageManager
>>
>> Error Message:
>> java.lang.AssertionError: Non-zero status returned for: [-solrUrl,
>> http://127.0.0.1:45117/solr, deploy, question-answer, -y, -collections,
>> abc, -p, RH-HANDLER-PATH=/mypath2] expected:<0> but was:<1>
>>
>> Stack Trace:
>> java.lang.AssertionError: Non-zero status returned for: [-solrUrl,
>> http://127.0.0.1:45117/solr, deploy, question-answer, -y, -collections,
>> abc, -p, RH-HANDLER-PATH=/mypath2] expected:<0> but was:<1>
>> at __randomizedtesting.SeedInfo.seed([363283DE41FB7EB9:3598C8B04315A112]:0)
>> at org.junit.Assert.fail(Assert.java:89)
>> at org.junit.Assert.failNotEquals(Assert.java:835)
>> at org.junit.Assert.assertEquals(Assert.java:647)
>> at org.apache.solr.cloud.PackageManagerCLITest.run(PackageManagerCLITest.java:217)
>> at org.apache.solr.cloud.PackageManagerCLITest.testPackageManager(PackageManagerCLITest.java:107)
>> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>> at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>> at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
>> at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
>> at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
>> at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
>> at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:80)
>> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>> at org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:44)
>> at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
>> at org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
>> at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
>> at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
>> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>> at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
>> at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
>> at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
>> at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
>> at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
>> at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
>> at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
>> at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
>> at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
>> at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:80)
>> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>> at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
>> at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
>> at

Re: [JENKINS] Solr » Solr-Check-9.x - Build # 2395 - Unstable!

2022-10-17 Thread Houston Putman
This reproduces for me on main, even though gradle says the build
succeeded. How strange...

org.apache.solr.util.TestSolrVersion > test suite's output saved to
> /Users/houstonputman/dev/oss/solr/solr/main/solr/core/build/test-results/test/outputs/OUTPUT-org.apache.solr.util.TestSolrVersion.txt,
> copied below:
>   2> 1154 INFO  (SUITE-TestSolrVersion-seed#[F0CEB68071A9D9B4]-worker) []
> o.a.s.SolrTestCase Setting 'solr.default.confdir' system property to
> test-framework derived value of
> '/Users/houstonputman/dev/oss/solr/solr/main/solr/server/solr/configsets/_default/conf'
>> java.lang.AssertionError
>> at __randomizedtesting.SeedInfo.seed([F0CEB68071A9D9B4:6E2B240BF3B1E717]:0)
>> at org.junit.Assert.fail(Assert.java:87)
>> at org.junit.Assert.assertTrue(Assert.java:42)
>> at org.junit.Assert.assertTrue(Assert.java:53)
>> at org.apache.solr.util.TestSolrVersion.testSatisfies(TestSolrVersion.java:52)
>> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>> at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>> at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
>> at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
>> at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
>> at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
>> at org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:44)
>> at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
>> at org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
>> at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
>> at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
>> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>> at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
>> at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
>> at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
>> at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
>> at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
>> at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
>> at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
>> at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
>> at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
>> at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:80)
>> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>> at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
>> at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
>> at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
>> at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
>> at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
>>

Re: [JENKINS] Solr » Solr-Check-9.x - Build # 2395 - Unstable!

2022-10-17 Thread Kevin Risden
Thanks Houston - based on another seed I found - I think this is
https://issues.apache.org/jira/browse/SOLR-16467

Kevin Risden


On Mon, Oct 17, 2022 at 11:01 AM Houston Putman  wrote:

> This reproduces for me on main, even though gradle says the build
> succeeded. How strange...

Re: Recent PRS test flakiness

2022-10-17 Thread Chris Hostetter


https://issues.apache.org/jira/browse/SOLR-16425

: Date: Sun, 2 Oct 2022 15:46:25 -0500
: From: Jason Gerlowski 
: Reply-To: dev@solr.apache.org
: To: dev@solr.apache.org
: Subject: Recent PRS test flakiness
: 
: Hey all,
: 
: I noticed this week (after running into a handful of test failures locally)
: that 3 of our 5 flakiest tests (according to fucit) are trying to test
: Solr's "per-replica state" code.  The tests in question are:
: PerReplicaStatesIntegrationTest.testRestart,
: PerReplicaStatesIntegrationTest.testPerReplicaStateCollection, and
: CloudSolrClientTest.testPerReplicaStateCollection.
: 
: All three of these saw a big jump in flakiness between Sept 12 and Sept
: 19.  I spent a bit of time debugging, but didn't get all too far.  In most
: failures, the test times out waiting for a particular number of replicas to
: be reported in ZooKeeper.  I suspect there's a race condition in how we're
: updating our ZK state, but that's as far as I was able to get for now.
: 
: So, a few questions:
: 
: 1. Does anyone know what the root cause of these failures might be, or at
: least what might've caused their flakiness to spike in mid-Sept?
: 2. Should we temporarily @Ignore or @BadApple them to help the builds out a
: bit until someone with context can attend to them?
: 
: Best,
: 
: Jason
: 

-Hoss
http://www.lucidworks.com/

-
To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
For additional commands, e-mail: dev-h...@solr.apache.org