I do not think there is any module in checkstyle which would do what you want.
________________________________________ From: Josh McKenzie <jmcken...@apache.org> Sent: Monday, October 31, 2022 12:49 To: dev Subject: Re: Some tests are never executed in CI due to their name NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe. To Derek's point, why not go with: Would it be sufficient to look for files that don't end in "Test.java" but contain "@Test" annotations? + exclude files that contain "abstract class" in them that matches the file name? Combine that w/checking specific subdirectories and we relax the required naming convention quite a bit; makes it easy to do the right thing and hard to do the wrong thing. It's also very simple which is a feature. On Fri, Oct 28, 2022, at 4:17 AM, Miklosovic, Stefan wrote: Hi Berenguer, I agree. We can always go heavy-weight anyway. I will dedicate more time to that the next week. It would be nice if you tagged along and help with the review. FYI the ticket is touching all branches we support (there are some skipped tests in 3.0 too), we would just add checkstyle to 4.1 and trunk as discussed, so I expect we fix all 5 branches at once. Regards Stefan ________________________________________ From: Berenguer Blasi <berenguerbl...@gmail.com<mailto:berenguerbl...@gmail.com>> Sent: Friday, October 28, 2022 9:43 To: dev@cassandra.apache.org<mailto:dev@cassandra.apache.org> Subject: Re: Some tests are never executed in CI due to their name NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe. Hi Stefan, seems none of the 2 proposals are strongly opposed to. If you feel more comfortable with option 1 I say we do that, which will always be a step in the right direction. If we see it bites us again in the future then we do 2. But at least we bring this benefit/ticket in. I would also be happy to drive option 2 to completion if needed. wdyt? On 25/10/22 14:35, Berenguer Blasi wrote: Well examples lend themselves to many things. I am sure I could come up with examples in the opposite direction. I'd rather talk in terms of the actual files and tests we have. What I was trying to clarify is what I was misunderstanding as I had the impression now we were saying option 1 would not leak again. So option 2, given ant and CI seem to rely on the *Test.java notation and also prevents future leaks doesn't seem to me a bad idea. Renaming the offending files seems a reasonable cost to me (Eclipse does it for me). My 2cts! But hey, I've been known to be wrong before :shrug: lol On 25/10/22 12:04, Andrés de la Peña wrote: Note that the test multiplexer also searches for tests ending with "Test", so it will also miss new or modified tests with nonstandard names. The automatically detected tests are listed when running generate.sh, and they are added to the repeated run jobs. That gives us another opportunity to detect missed tests when we introduce something like "MyTestSplit1.java" and see that it's not been included in the mandatory repeated runs. On Tue, 25 Oct 2022 at 10:31, Miklosovic, Stefan <stefan.mikloso...@netapp.com<mailto:stefan.mikloso...@netapp.com><mailto:stefan.mikloso...@netapp.com<mailto:stefan.mikloso...@netapp.com>>> wrote: Hi Berenguer, I am glad you asked. I was expecting this question. I think there is a fundamental difference in how we approach this problem. I do not say mine is better, I just find it important to describe it clearly. Let's say we are on a ship which leaks. When I detect that, my course of action is to fix the leakage with the biggest patches I can find around with minimal effort necessary so we are not taking water anymore. There might still be occasional leakages which are very rare but I have a crew who is checking the ship constantly anyway (review) and the risk that big leakages like we just fixed happen again are relatively very low. You spot a leakage and instead of fixing it with big patches and calling it a day, you are trying to remodel the cabins so they are completely waterproof but while doing so, you renamed them so everyone needs to get used to how these cabins are called and where they are located, because there is this minimal chance that some cadet comes around and starts to live in a cabin he is not supposed to and we need to explain it to him. Thousands of cadets found their cabins just fine. Occasionally, there are few people who just miss the right board completely (Test at start, Tests at the end) and they are automatically navigated around (checkstyle) but once in five years there is this guy who just completely missed it. I believe we can just navigate him when that happens. You want to cover that guy too. I just find my approach easier. You can remodel it all, for sure but I am afraid I will not be a part of that. I just do not find it necessary to do that. ________________________________________ From: Berenguer Blasi <berenguerbl...@gmail.com<mailto:berenguerbl...@gmail.com><mailto:berenguerbl...@gmail.com<mailto:berenguerbl...@gmail.com>>> Sent: Tuesday, October 25, 2022 11:08 To: dev@cassandra.apache.org<mailto:dev@cassandra.apache.org><mailto:dev@cassandra.apache.org<mailto:dev@cassandra.apache.org>> Subject: Re: Some tests are never executed in CI due to their name NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe. IIUC we're relying on catching the word 'Split' in the file name for option 1. If somebody named his test i.e. 'MyTestGroup1', 'MyTestGroup2', 'TTLTestPre2038', 'TTLTestPost2038',... I think we would leak tests again? or any other word that is not specifically accounted for. Unless I am missing sthg ofc! :-) On 25/10/22 10:39, Miklosovic, Stefan wrote: > I think that what you wrote is not entirely correct. It will prevent it from > happening again when there are tests ending on "Tests" or starting on "Test". > The only case it will not cover is "SplitN" issue we plan to cover with > relaxed test.name<http://test.name> property. > > It seems like what you wrote means that we will fix it and tests will leak > again. That is not true. > > ________________________________________ > From: Berenguer Blasi > <berenguerbl...@gmail.com<mailto:berenguerbl...@gmail.com><mailto:berenguerbl...@gmail.com<mailto:berenguerbl...@gmail.com>>> > Sent: Tuesday, October 25, 2022 7:26 > To: > dev@cassandra.apache.org<mailto:dev@cassandra.apache.org><mailto:dev@cassandra.apache.org<mailto:dev@cassandra.apache.org>> > Subject: Re: Some tests are never executed in CI due to their name > > NetApp Security WARNING: This is an external email. Do not click links or > open attachments unless you recognize the sender and know the content is safe. > > > > > The problem with using the first approach is that it fixes the current > situation but it doesn't prevent it from happening again. The second > proposal prevents it from happening again but at the cost of a bigger > rename I'd volunteer to if needed. > > Regards > > On 24/10/22 20:38, Miklosovic, Stefan wrote: >> Yeah, that is what the branch in my original email actually already solved. >> I mean ... >> >> CassandraAuthorizerTruncatingTests >> BatchTests >> UUIDTests >> >> these are ending on Tests which is illegal and they are fixed there. >> >> Other cases are either "TestBase" or "Tester" which should be legal. >> >> I think using the first approach and fixing all "SplitN" PLUS adding Split* >> to test.name<http://test.name> regexp should do it. >> >> I think we can "automate" at most like you suggest and scan it manually, fix >> the stuff and from then incorporate checkstyle to take care of that. >> >> There are also some classes which do include @Test methods but I think they >> are abstract as they are meant to be extended as the real test is just >> wrapping that. This might happen when there are slight variations across >> test classes. This is fine as well. >> >> ________________________________________ >> From: Derek Chen-Becker >> <de...@chen-becker.org<mailto:de...@chen-becker.org><mailto:de...@chen-becker.org<mailto:de...@chen-becker.org>>> >> Sent: Monday, October 24, 2022 19:18 >> To: >> dev@cassandra.apache.org<mailto:dev@cassandra.apache.org><mailto:dev@cassandra.apache.org<mailto:dev@cassandra.apache.org>> >> Subject: Re: Some tests are never executed in CI due to their name >> >> NetApp Security WARNING: This is an external email. Do not click links or >> open attachments unless you recognize the sender and know the content is >> safe. >> >> >> >> OK, I think the approach you're proposing is fine. As an orthogonal idea, is >> there any way to do a one-time audit to narrow down files that we need to >> inspect, or any way to automate confirmation of a file containing tests? For >> example: >> >> * Start with all of the files under the `test` directory (is it safe to >> assume that this is the only location for test classes?) => 1457 files >> * Remove any files that already end in "Test.java" => down to 396 files >> * Is it safe to assume that only files having "Test" somewhere in their name >> are test classes? That reduces the list down to 60 files >> >> Alternatively, are we only talking about unit tests? Would it be sufficient >> to look for files that don't end in "Test.java" but contain "@Test" >> annotations? That's a significantly smaller set, so maybe we could fix unit >> tests first: >> >> ❯❯❯ for fl in test/unit/**/*.java; do if [[ $fl != *Test.java ]]; then if >> grep -qF '@Test' $fl; then print $fl; fi; fi; done >> >> circleci-parity ✱ ◼ >> test/unit/org/apache/cassandra/auth/CassandraAuthorizerTruncatingTests.java >> test/unit/org/apache/cassandra/cql3/BatchTests.java >> test/unit/org/apache/cassandra/cql3/KeywordTestBase.java >> test/unit/org/apache/cassandra/db/guardrails/GuardrailAllowUncompressedTables.java >> test/unit/org/apache/cassandra/db/guardrails/GuardrailConsistencyLevelsTester.java >> test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexTester.java >> test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexesPerTable.java >> test/unit/org/apache/cassandra/db/guardrails/ThresholdTester.java >> test/unit/org/apache/cassandra/dht/PartitionerTestCase.java >> test/unit/org/apache/cassandra/utils/UUIDTests.java >> >> Cheers, >> >> Derek >> >> >> On Mon, Oct 24, 2022 at 9:00 AM Miklosovic, Stefan >> <stefan.mikloso...@netapp.com<mailto:stefan.mikloso...@netapp.com><mailto:stefan.mikloso...@netapp.com<mailto:stefan.mikloso...@netapp.com>><mailto:stefan.mikloso...@netapp.com<mailto:stefan.mikloso...@netapp.com><mailto:stefan.mikloso...@netapp.com<mailto:stefan.mikloso...@netapp.com>>>> >> wrote: >> One more point as I was reading your email in a hurry. The ultimate goal is >> to run all tests. So even if we apply the proposed regexp in checkstyle and >> there is an unfortunate case that MyTestSplit1 would slip through on a >> review and it would not be executed in CI, if we do not figure out some >> better regexp, we might relax >> "test.name<http://test.name><http://test.name>" property to include >> Split*.java as well. Yes, this might be also done. >> >> Enforcing "all tests to end on "Test" " means that we would need to know >> which .java files contain tests and which not without looking into them. I >> do not think that is possible under circumstances we have where (checkstyle >> module). >> >> ________________________________________ >> From: Derek Chen-Becker >> <de...@chen-becker.org<mailto:de...@chen-becker.org><mailto:de...@chen-becker.org<mailto:de...@chen-becker.org>><mailto:de...@chen-becker.org<mailto:de...@chen-becker.org><mailto:de...@chen-becker.org<mailto:de...@chen-becker.org>>>> >> Sent: Monday, October 24, 2022 15:53 >> To: >> dev@cassandra.apache.org<mailto:dev@cassandra.apache.org><mailto:dev@cassandra.apache.org<mailto:dev@cassandra.apache.org>><mailto:dev@cassandra.apache.org<mailto:dev@cassandra.apache.org><mailto:dev@cassandra.apache.org<mailto:dev@cassandra.apache.org>>> >> Subject: Re: Some tests are never executed in CI due to their name >> >> NetApp Security WARNING: This is an external email. Do not click links or >> open attachments unless you recognize the sender and know the content is >> safe. >> >> >> >> OK, that makes sense, and I missed the rename of "TestBaseImpl" (which I >> agree is in need of fixing). When you say: >> >>> What we are trying to achieve by that is to have a test which ends on >>> "Test" and nothing else and it may contain "Test" only in case it does not >>> start with it. >> Should the regex enforce the "which ends on Test" part? Otherwise, if we >> want to allow tests like MyTestSplit1, do we need to update the >> "test.name<http://test.name><http://test.name><http://test.name>" parameter >> to allow that in the test runner? >> >> Thanks, >> >> Derek >> >> >> On Mon, Oct 24, 2022 at 7:27 AM Miklosovic, Stefan >> <stefan.mikloso...@netapp.com<mailto:stefan.mikloso...@netapp.com><mailto:stefan.mikloso...@netapp.com<mailto:stefan.mikloso...@netapp.com>><mailto:stefan.mikloso...@netapp.com<mailto:stefan.mikloso...@netapp.com><mailto:stefan.mikloso...@netapp.com<mailto:stefan.mikloso...@netapp.com>>><mailto:stefan.mikloso...@netapp.com<mailto:stefan.mikloso...@netapp.com><mailto:stefan.mikloso...@netapp.com<mailto:stefan.mikloso...@netapp.com>><mailto:stefan.mikloso...@netapp.com<mailto:stefan.mikloso...@netapp.com><mailto:stefan.mikloso...@netapp.com<mailto:stefan.mikloso...@netapp.com>>>>> >> wrote: >> Hi Derek, >> >> yes, the regexp is ^(?!(?:Test))(?!.*?(?:Tests)).*$ >> >> What we are trying to achieve by that is to have a test which ends on "Test" >> and nothing else and it may contain "Test" only in case it does not start >> with it. >> >> To your second paragraph, I do not think we have any *Base files which are >> tests (that would be a bug) nor they are standalone (not inherited). By >> first approach, TestBaseImpl (which we do have) would start to be invalid, >> we would have to rename it to "DistributedTestBaseImpl" because it was >> starting on "Test" which is no-no. I agree that "TestBaseImpl" is rather >> unfortunate name. Currently, TestBaseImpl itself extends >> "DistributedTestBase" so in the end we would have "DistributedTestBaseImpl >> -> TestBaseImpl -> DistributedTestBase". Please keep in mind that >> "DistributedTestBase" is located in dtest jar (separate project) and that is >> rather inconvenient to rename. >> >> Regards >> >> ________________________________________ >> From: Derek Chen-Becker >> <de...@chen-becker.org<mailto:de...@chen-becker.org><mailto:de...@chen-becker.org<mailto:de...@chen-becker.org>><mailto:de...@chen-becker.org<mailto:de...@chen-becker.org><mailto:de...@chen-becker.org<mailto:de...@chen-becker.org>>><mailto:de...@chen-becker.org<mailto:de...@chen-becker.org><mailto:de...@chen-becker.org<mailto:de...@chen-becker.org>><mailto:de...@chen-becker.org<mailto:de...@chen-becker.org><mailto:de...@chen-becker.org<mailto:de...@chen-becker.org>>>>> >> Sent: Monday, October 24, 2022 15:09 >> To: >> dev@cassandra.apache.org<mailto:dev@cassandra.apache.org><mailto:dev@cassandra.apache.org<mailto:dev@cassandra.apache.org>><mailto:dev@cassandra.apache.org<mailto:dev@cassandra.apache.org><mailto:dev@cassandra.apache.org<mailto:dev@cassandra.apache.org>>><mailto:dev@cassandra.apache.org<mailto:dev@cassandra.apache.org><mailto:dev@cassandra.apache.org<mailto:dev@cassandra.apache.org>><mailto:dev@cassandra.apache.org<mailto:dev@cassandra.apache.org><mailto:dev@cassandra.apache.org<mailto:dev@cassandra.apache.org>>>> >> Subject: Re: Some tests are never executed in CI due to their name >> >> NetApp Security WARNING: This is an external email. Do not click links or >> open attachments unless you recognize the sender and know the content is >> safe. >> >> >> >> For clarity, what is the final regex you've landed on for the first >> approach? Is it "^(?!(?:Test))(?!.*?(?:Tests)).*$" ? Are we just trying to >> reject anything that *starts* with "Test" or contains "Tests" somewhere in >> the name? It might be more clear to state what we think a valid test name >> should be, not in regex form. >> >> When it comes to "TestBase" files, are there any of these that are actually >> intended to be run as a test? The "Base" would indicate to me that it's >> intended to be inherited, not run directly. Do we have some "TestBase" >> classes somewhere that aren't inherited and are meant to be run directly, >> because that definitely seems like a misleading name that should be fixed, >> invasive or not. I would lean toward the principle of least surprise here, >> even if it means an involved one-time cleanup. >> >> Cheers, >> >> Derek >> >> On Mon, Oct 24, 2022 at 3:02 AM Miklosovic, Stefan >> <stefan.mikloso...@netapp.com<mailto:stefan.mikloso...@netapp.com><mailto:stefan.mikloso...@netapp.com<mailto:stefan.mikloso...@netapp.com>><mailto:stefan.mikloso...@netapp.com<mailto:stefan.mikloso...@netapp.com><mailto:stefan.mikloso...@netapp.com<mailto:stefan.mikloso...@netapp.com>>><mailto:stefan.mikloso...@netapp.com<mailto:stefan.mikloso...@netapp.com><mailto:stefan.mikloso...@netapp.com<mailto:stefan.mikloso...@netapp.com>><mailto:stefan.mikloso...@netapp.com<mailto:stefan.mikloso...@netapp.com><mailto:stefan.mikloso...@netapp.com<mailto:stefan.mikloso...@netapp.com>>>><mailto:stefan.mikloso...@netapp.com<mailto:stefan.mikloso...@netapp.com><mailto:stefan.mikloso...@netapp.com<mailto:stefan.mikloso...@netapp.com>><mailto:stefan.mikloso...@netapp.com<mailto:stefan.mikloso...@netapp.com><mailto:stefan.mikloso...@netapp.com<mailto:stefan.mikloso...@netapp.com>>><mailto:stefan.mikloso...@netapp.com<mailto:stefan.mikloso...@netapp.com><mailto:stefan.mikloso...@netapp.com<mailto:stefan.mikloso...@netapp.com>><mailto:stefan.mikloso...@netapp.com<mailto:stefan.mikloso...@netapp.com><mailto:stefan.mikloso...@netapp.com<mailto:stefan.mikloso...@netapp.com>>>>>> >> wrote: >> Hi list, >> >> it was reported that some tests are not executed as part of CI because of >> their name (1). There is (2) and (3) so right now it runs only "*Test.java" >> files. >> >> Except of us renaming the affected classes, we should fix this in a more >> robust way - via checkstyle - so we enforce how test files have to be called. >> >> We came up with two approaches and we do not know which one to choose so we >> want to gather more feedback / opinions. >> >> The first approach would make this happen (implemented here (4)) >> >> class name - passes / fails the checkstyle >> >> TestIt - fails >> TestsThatFeature - fails >> ThisTestsMyFeature - fails >> SomeTests - fails >> >> SomeHelperClassForTesting - passes >> SomeTest - passes >> DistributedTestBase - passes >> >> MyTestSplit1 - passes >> >> We would fix "SplitN" test files by hand (I think they are not executed >> right now in CI either) to something like MySplit1Test and MySplit2Test but >> the regexp we would eventually use would not prevent this from happening in >> the future. I do not know how to construct such a complex regexp yet which >> would cover all cases above plus this splitting one. Feel free to look into >> the ticket where details are discussed. >> >> The second approach would forbid to have any occurrence of "test" in the >> file name but at the end. For example, it would fail on classes like these >> >> MyTestSplit1 >> TestHelper >> BurnTestUtil >> FuzzTestBase >> DistributedTestSnitch >> CASCommonTestCases >> ServerTestUtils >> AuthTestUtils >> TestMemtable >> >> While this would also fix "MyTestSplitN" cases, I consider this to be way >> more invasive to the current codebase because we would have to rename a lot >> of files (like, hundreds, plus a lot of places they are referenced / used >> at) and I do not even think this is necessary and desirable. Like, how would >> we call "FuzzTestBase" to not contain "Test" in it? I think having "test" >> string in the middle is just fine, specifically for helper classes. One >> possible fix would be to replace "Test" with "Tester" so we would have >> "TesterMemtable", "AuthTesterUtils", "TesterHelper", >> "DistributedTesterSnitch" ... Word "Tester" would be explicitly allowed. >> >> I personally lean to the first approach even though cases like >> "MyTestSplit1" will not be spotted in compile time, but hey, we still do >> have a review process on top of this. The obvious violations like >> "TestThisFeature" and "MySuperTests" would be evaluated as violations which >> I would argue happens the most often. >> >> Please keep in mind that we have checkstyle only in 4.1 and trunk. So, while >> we would fix names of classes in all branches we support so they are picked >> up by CI, checkstyle for this would be introduced in 4.1 and trunk only. >> >> (1) https://issues.apache.org/jira/browse/CASSANDRA-17964 >> (2) https://github.com/apache/cassandra/blob/trunk/build.xml#L61 >> (3) https://github.com/apache/cassandra/blob/trunk/build.xml#L1033 >> (4) https://github.com/instaclustr/cassandra/commits/CASSANDRA-17964-4.1-2 >> >> >> -- >> +---------------------------------------------------------------+ >> | Derek Chen-Becker | >> | GPG Key available at https://keybase.io/dchenbecker and | >> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org | >> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7 7F42 AFC5 AFEE 96E4 6ACC | >> +---------------------------------------------------------------+ >> >> >> >> -- >> +---------------------------------------------------------------+ >> | Derek Chen-Becker | >> | GPG Key available at https://keybase.io/dchenbecker and | >> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org | >> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7 7F42 AFC5 AFEE 96E4 6ACC | >> +---------------------------------------------------------------+ >> >> >> >> -- >> +---------------------------------------------------------------+ >> | Derek Chen-Becker | >> | GPG Key available at https://keybase.io/dchenbecker and | >> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org | >> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7 7F42 AFC5 AFEE 96E4 6ACC | >> +---------------------------------------------------------------+ >>