Note that the test multiplexer also searches for tests ending with
"Test", so it will also miss new or modified tests with nonstandard
names. The automatically detected tests are listed when running
generate.sh, and they are added to the repeated run jobs. That gives
us another opportunity to detect missed tests when we introduce
something like "MyTestSplit1.java" and see that it's not been included
in the mandatory repeated runs.
On Tue, 25 Oct 2022 at 10:31, Miklosovic, Stefan
<stefan.mikloso...@netapp.com> wrote:
Hi Berenguer,
I am glad you asked. I was expecting this question.
I think there is a fundamental difference in how we approach this
problem. I do not say mine is better, I just find it important to
describe it clearly.
Let's say we are on a ship which leaks. When I detect that, my
course of action is to fix the leakage with the biggest patches I
can find around with minimal effort necessary so we are not taking
water anymore. There might still be occasional leakages which are
very rare but I have a crew who is checking the ship constantly
anyway (review) and the risk that big leakages like we just fixed
happen again are relatively very low.
You spot a leakage and instead of fixing it with big patches and
calling it a day, you are trying to remodel the cabins so they are
completely waterproof but while doing so, you renamed them so
everyone needs to get used to how these cabins are called and
where they are located, because there is this minimal chance that
some cadet comes around and starts to live in a cabin he is not
supposed to and we need to explain it to him. Thousands of cadets
found their cabins just fine. Occasionally, there are few people
who just miss the right board completely (Test at start, Tests at
the end) and they are automatically navigated around (checkstyle)
but once in five years there is this guy who just completely
missed it.
I believe we can just navigate him when that happens. You want to
cover that guy too.
I just find my approach easier.
You can remodel it all, for sure but I am afraid I will not be a
part of that. I just do not find it necessary to do that.
________________________________________
From: Berenguer Blasi <berenguerbl...@gmail.com>
Sent: Tuesday, October 25, 2022 11:08
To: dev@cassandra.apache.org
Subject: Re: Some tests are never executed in CI due to their name
NetApp Security WARNING: This is an external email. Do not click
links or open attachments unless you recognize the sender and know
the content is safe.
IIUC we're relying on catching the word 'Split' in the file name
for option 1. If somebody named his test i.e. 'MyTestGroup1',
'MyTestGroup2', 'TTLTestPre2038', 'TTLTestPost2038',... I think we
would leak tests again? or any other word that is not specifically
accounted for. Unless I am missing sthg ofc! :-)
On 25/10/22 10:39, Miklosovic, Stefan wrote:
> I think that what you wrote is not entirely correct. It will
prevent it from happening again when there are tests ending on
"Tests" or starting on "Test". The only case it will not cover is
"SplitN" issue we plan to cover with relaxed test.name
<http://test.name> property.
>
> It seems like what you wrote means that we will fix it and tests
will leak again. That is not true.
>
> ________________________________________
> From: Berenguer Blasi <berenguerbl...@gmail.com>
> Sent: Tuesday, October 25, 2022 7:26
> To: dev@cassandra.apache.org
> Subject: Re: Some tests are never executed in CI due to their name
>
> NetApp Security WARNING: This is an external email. Do not click
links or open attachments unless you recognize the sender and know
the content is safe.
>
>
>
>
> The problem with using the first approach is that it fixes the
current
> situation but it doesn't prevent it from happening again. The second
> proposal prevents it from happening again but at the cost of a
bigger
> rename I'd volunteer to if needed.
>
> Regards
>
> On 24/10/22 20:38, Miklosovic, Stefan wrote:
>> Yeah, that is what the branch in my original email actually
already solved. I mean ...
>>
>> CassandraAuthorizerTruncatingTests
>> BatchTests
>> UUIDTests
>>
>> these are ending on Tests which is illegal and they are fixed
there.
>>
>> Other cases are either "TestBase" or "Tester" which should be
legal.
>>
>> I think using the first approach and fixing all "SplitN" PLUS
adding Split* to test.name <http://test.name> regexp should do it.
>>
>> I think we can "automate" at most like you suggest and scan it
manually, fix the stuff and from then incorporate checkstyle to
take care of that.
>>
>> There are also some classes which do include @Test methods but
I think they are abstract as they are meant to be extended as the
real test is just wrapping that. This might happen when there are
slight variations across test classes. This is fine as well.
>>
>> ________________________________________
>> From: Derek Chen-Becker <de...@chen-becker.org>
>> Sent: Monday, October 24, 2022 19:18
>> To: dev@cassandra.apache.org
>> Subject: Re: Some tests are never executed in CI due to their name
>>
>> NetApp Security WARNING: This is an external email. Do not
click links or open attachments unless you recognize the sender
and know the content is safe.
>>
>>
>>
>> OK, I think the approach you're proposing is fine. As an
orthogonal idea, is there any way to do a one-time audit to narrow
down files that we need to inspect, or any way to automate
confirmation of a file containing tests? For example:
>>
>> * Start with all of the files under the `test` directory (is it
safe to assume that this is the only location for test classes?)
=> 1457 files
>> * Remove any files that already end in "Test.java" => down to
396 files
>> * Is it safe to assume that only files having "Test" somewhere
in their name are test classes? That reduces the list down to 60 files
>>
>> Alternatively, are we only talking about unit tests? Would it
be sufficient to look for files that don't end in "Test.java" but
contain "@Test" annotations? That's a significantly smaller set,
so maybe we could fix unit tests first:
>>
>> ❯❯❯ for fl in test/unit/**/*.java; do if [[ $fl != *Test.java
]]; then if grep -qF '@Test' $fl; then print $fl; fi; fi; done
circleci-parity ✱ ◼
>>
test/unit/org/apache/cassandra/auth/CassandraAuthorizerTruncatingTests.java
>> test/unit/org/apache/cassandra/cql3/BatchTests.java
>> test/unit/org/apache/cassandra/cql3/KeywordTestBase.java
>>
test/unit/org/apache/cassandra/db/guardrails/GuardrailAllowUncompressedTables.java
>>
test/unit/org/apache/cassandra/db/guardrails/GuardrailConsistencyLevelsTester.java
>>
test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexTester.java
>>
test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexesPerTable.java
>> test/unit/org/apache/cassandra/db/guardrails/ThresholdTester.java
>> test/unit/org/apache/cassandra/dht/PartitionerTestCase.java
>> test/unit/org/apache/cassandra/utils/UUIDTests.java
>>
>> Cheers,
>>
>> Derek
>>
>>
>> On Mon, Oct 24, 2022 at 9:00 AM Miklosovic, Stefan
<stefan.mikloso...@netapp.com<mailto:stefan.mikloso...@netapp.com>>
wrote:
>> One more point as I was reading your email in a hurry. The
ultimate goal is to run all tests. So even if we apply the
proposed regexp in checkstyle and there is an unfortunate case
that MyTestSplit1 would slip through on a review and it would not
be executed in CI, if we do not figure out some better regexp, we
might relax "test.name <http://test.name><http://test.name>"
property to include Split*.java as well. Yes, this might be also done.
>>
>> Enforcing "all tests to end on "Test" " means that we would
need to know which .java files contain tests and which not without
looking into them. I do not think that is possible under
circumstances we have where (checkstyle module).
>>
>> ________________________________________
>> From: Derek Chen-Becker
<de...@chen-becker.org<mailto:de...@chen-becker.org>>
>> Sent: Monday, October 24, 2022 15:53
>> To: dev@cassandra.apache.org<mailto:dev@cassandra.apache.org>
>> Subject: Re: Some tests are never executed in CI due to their name
>>
>> NetApp Security WARNING: This is an external email. Do not
click links or open attachments unless you recognize the sender
and know the content is safe.
>>
>>
>>
>> OK, that makes sense, and I missed the rename of "TestBaseImpl"
(which I agree is in need of fixing). When you say:
>>
>>> What we are trying to achieve by that is to have a test which
ends on "Test" and nothing else and it may contain "Test" only in
case it does not start with it.
>> Should the regex enforce the "which ends on Test" part?
Otherwise, if we want to allow tests like MyTestSplit1, do we need
to update the "test.name
<http://test.name><http://test.name><http://test.name>" parameter
to allow that in the test runner?
>>
>> Thanks,
>>
>> Derek
>>
>>
>> On Mon, Oct 24, 2022 at 7:27 AM Miklosovic, Stefan
<stefan.mikloso...@netapp.com<mailto:stefan.mikloso...@netapp.com><mailto:stefan.mikloso...@netapp.com<mailto:stefan.mikloso...@netapp.com>>>
wrote:
>> Hi Derek,
>>
>> yes, the regexp is ^(?!(?:Test))(?!.*?(?:Tests)).*$
>>
>> What we are trying to achieve by that is to have a test which
ends on "Test" and nothing else and it may contain "Test" only in
case it does not start with it.
>>
>> To your second paragraph, I do not think we have any *Base
files which are tests (that would be a bug) nor they are
standalone (not inherited). By first approach, TestBaseImpl (which
we do have) would start to be invalid, we would have to rename it
to "DistributedTestBaseImpl" because it was starting on "Test"
which is no-no. I agree that "TestBaseImpl" is rather unfortunate
name. Currently, TestBaseImpl itself extends "DistributedTestBase"
so in the end we would have "DistributedTestBaseImpl ->
TestBaseImpl -> DistributedTestBase". Please keep in mind that
"DistributedTestBase" is located in dtest jar (separate project)
and that is rather inconvenient to rename.
>>
>> Regards
>>
>> ________________________________________
>> From: Derek Chen-Becker
<de...@chen-becker.org<mailto:de...@chen-becker.org><mailto:de...@chen-becker.org<mailto:de...@chen-becker.org>>>
>> Sent: Monday, October 24, 2022 15:09
>> To:
dev@cassandra.apache.org<mailto:dev@cassandra.apache.org><mailto:dev@cassandra.apache.org<mailto:dev@cassandra.apache.org>>
>> Subject: Re: Some tests are never executed in CI due to their name
>>
>> NetApp Security WARNING: This is an external email. Do not
click links or open attachments unless you recognize the sender
and know the content is safe.
>>
>>
>>
>> For clarity, what is the final regex you've landed on for the
first approach? Is it "^(?!(?:Test))(?!.*?(?:Tests)).*$" ? Are we
just trying to reject anything that *starts* with "Test" or
contains "Tests" somewhere in the name? It might be more clear to
state what we think a valid test name should be, not in regex form.
>>
>> When it comes to "TestBase" files, are there any of these that
are actually intended to be run as a test? The "Base" would
indicate to me that it's intended to be inherited, not run
directly. Do we have some "TestBase" classes somewhere that aren't
inherited and are meant to be run directly, because that
definitely seems like a misleading name that should be fixed,
invasive or not. I would lean toward the principle of least
surprise here, even if it means an involved one-time cleanup.
>>
>> Cheers,
>>
>> Derek
>>
>> On Mon, Oct 24, 2022 at 3:02 AM Miklosovic, Stefan
<stefan.mikloso...@netapp.com<mailto:stefan.mikloso...@netapp.com><mailto:stefan.mikloso...@netapp.com<mailto:stefan.mikloso...@netapp.com>><mailto:stefan.mikloso...@netapp.com<mailto:stefan.mikloso...@netapp.com><mailto:stefan.mikloso...@netapp.com<mailto:stefan.mikloso...@netapp.com>>>>
wrote:
>> Hi list,
>>
>> it was reported that some tests are not executed as part of CI
because of their name (1). There is (2) and (3) so right now it
runs only "*Test.java" files.
>>
>> Except of us renaming the affected classes, we should fix this
in a more robust way - via checkstyle - so we enforce how test
files have to be called.
>>
>> We came up with two approaches and we do not know which one to
choose so we want to gather more feedback / opinions.
>>
>> The first approach would make this happen (implemented here (4))
>>
>> class name - passes / fails the checkstyle
>>
>> TestIt - fails
>> TestsThatFeature - fails
>> ThisTestsMyFeature - fails
>> SomeTests - fails
>>
>> SomeHelperClassForTesting - passes
>> SomeTest - passes
>> DistributedTestBase - passes
>>
>> MyTestSplit1 - passes
>>
>> We would fix "SplitN" test files by hand (I think they are not
executed right now in CI either) to something like MySplit1Test
and MySplit2Test but the regexp we would eventually use would not
prevent this from happening in the future. I do not know how to
construct such a complex regexp yet which would cover all cases
above plus this splitting one. Feel free to look into the ticket
where details are discussed.
>>
>> The second approach would forbid to have any occurrence of
"test" in the file name but at the end. For example, it would fail
on classes like these
>>
>> MyTestSplit1
>> TestHelper
>> BurnTestUtil
>> FuzzTestBase
>> DistributedTestSnitch
>> CASCommonTestCases
>> ServerTestUtils
>> AuthTestUtils
>> TestMemtable
>>
>> While this would also fix "MyTestSplitN" cases, I consider this
to be way more invasive to the current codebase because we would
have to rename a lot of files (like, hundreds, plus a lot of
places they are referenced / used at) and I do not even think this
is necessary and desirable. Like, how would we call "FuzzTestBase"
to not contain "Test" in it? I think having "test" string in the
middle is just fine, specifically for helper classes. One possible
fix would be to replace "Test" with "Tester" so we would have
"TesterMemtable", "AuthTesterUtils", "TesterHelper",
"DistributedTesterSnitch" ... Word "Tester" would be explicitly
allowed.
>>
>> I personally lean to the first approach even though cases like
"MyTestSplit1" will not be spotted in compile time, but hey, we
still do have a review process on top of this. The obvious
violations like "TestThisFeature" and "MySuperTests" would be
evaluated as violations which I would argue happens the most often.
>>
>> Please keep in mind that we have checkstyle only in 4.1 and
trunk. So, while we would fix names of classes in all branches we
support so they are picked up by CI, checkstyle for this would be
introduced in 4.1 and trunk only.
>>
>> (1) https://issues.apache.org/jira/browse/CASSANDRA-17964
>> (2) https://github.com/apache/cassandra/blob/trunk/build.xml#L61
>> (3) https://github.com/apache/cassandra/blob/trunk/build.xml#L1033
>> (4)
https://github.com/instaclustr/cassandra/commits/CASSANDRA-17964-4.1-2
>>
>>
>> --
>> +---------------------------------------------------------------+
>> | Derek Chen-Becker |
>> | GPG Key available at https://keybase.io/dchenbecker and |
>> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7 7F42 AFC5 AFEE 96E4 6ACC |
>> +---------------------------------------------------------------+
>>
>>
>>
>> --
>> +---------------------------------------------------------------+
>> | Derek Chen-Becker |
>> | GPG Key available at https://keybase.io/dchenbecker and |
>> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7 7F42 AFC5 AFEE 96E4 6ACC |
>> +---------------------------------------------------------------+
>>
>>
>>
>> --
>> +---------------------------------------------------------------+
>> | Derek Chen-Becker |
>> | GPG Key available at https://keybase.io/dchenbecker and |
>> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7 7F42 AFC5 AFEE 96E4 6ACC |
>> +---------------------------------------------------------------+
>>