(bunching up references here)
> They use a separate implementation of instance initialization and thus they
> test the test server rather than the real node.
> What is the real gap between the in-JVM tests server instance and a server as
> run by python DTests?
> I also have this concern.
> I think we can get rid of this by extending CassandraDaemon, just need to add
> a few hooks to mock out gossip/internode/client
(/end bunching up references)
Wanted to mention that implementing "regular" instance initialisation should
not be complicated: I know because I've tried it once. I never submitted a
patch, mostly because it was not fleshed out enough to meet our quality
standards. I'll see if I can finish it up quickly.
That said, the vast majority of dtests will not require "regular" instance
initialisation. I checked the dtest suite back when we were working on the
in-jvm dtest API, and only a few tests (such as bootstrap, some streaming
tests, etc.) would have benefitted from this feature.
> So, if you do Cluster.build(num).withConfig(c ->
> c.with(Features.values())).start(), this will run the full Cassandra daemon,
> so the main difference will be with: startup (Instance vs CassandraDaemon),
> and JMX (direct method call rather than JMX).
Right, I think the main difference is more or less resolved in `BootstrapTest`
by stubbing out Gossip, but I can see some people being opposed to doing it
this way. Hence we should just mimic what CassandraDaemon#activate does and go
through the "normal" startup sequence that starts bootstrap/streaming/etc.
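
To make the Feature discussion above a bit more concrete, here is a rough,
self-contained toy model of it. It does not use the real in-jvm dtest classes:
`FeatureSketch`, its nested `Feature` enum and `mockedOrDisabled` are
hypothetical names made up for illustration, loosely following the
Feature.GOSSIP / Feature.NETWORK / Feature.NATIVE_PROTOCOL flags mentioned in
this thread. It only shows the idea that whatever is not explicitly enabled
stays mocked or disabled.

```java
import java.util.EnumSet;
import java.util.Set;

// Toy model only: these types are stand-ins, not the real in-jvm dtest classes.
class FeatureSketch
{
    // Hypothetical mirror of the three subsystems discussed in this thread.
    enum Feature { GOSSIP, NETWORK, NATIVE_PROTOCOL }

    // Subsystems that stay mocked or disabled for a given set of enabled features.
    static Set<Feature> mockedOrDisabled(Set<Feature> enabled)
    {
        Set<Feature> result = EnumSet.allOf(Feature.class);
        result.removeAll(enabled);
        return result;
    }

    public static void main(String[] args)
    {
        // No features: gossip and internode are mocked, the client protocol is off.
        System.out.println(mockedOrDisabled(EnumSet.noneOf(Feature.class)));
        // Only gossip enabled: internode messaging and the client protocol are still not real.
        System.out.println(mockedOrDisabled(EnumSet.of(Feature.GOSSIP)));
        // Every feature enabled: nothing is left mocked.
        System.out.println(mockedOrDisabled(EnumSet.allOf(Feature.class)));
    }
}
```

Even with every feature enabled, the startup path (Instance vs
CassandraDaemon) and JMX still differ, which is the gap discussed above.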
> Given each started instance uses a dedicated class loader there is some
> amount of trash left and when there are a couple of multi-node test cases in
> a single test class, it sometimes happens that the tests fail with an
> out-of-memory error in metaspace.
We have also had several discussions about this, and the consensus was to just
use a JVM per test method in the test suite. There will still be tests where we
have to be more considerate, but these would overlap with the ones where the
dtest suite would also hit the limits of the machine, so this is fine.
> I support deprecating python dtests, as long as in-jvm dtests have feature
> parity with python dtests,
We can start by deprecating those dtests where there is feature parity. I'm
sure we all agree we don't need full feature parity between frameworks _before_
we start porting _any_ tests.
I have to mention that migrating to in-jvm dtests will also be hugely
beneficial because we'll be able to use Harry more easily, and many tests that
currently use stress and have no validation can "simply" be migrated to use
Harry and get validation more or less by default. Besides, we'll be able to
avoid hand-written `for`-loop data generation and generate data using Harry
instead.
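
Very roughly, and without using Harry's actual API, the "validation more or
less by default" point can be sketched as follows. Everything here is a
hypothetical stand-in (`SeededDataSketch`, `valueFor`, `write` and `validate`
are made-up names, and a plain map plays the role of the table). The only
thing it illustrates is that when every value is a pure function of a seed and
a key, a checker can recompute what should have been written and compare it
with what is read back.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

// Toy illustration only: this is NOT Harry's API, just the general idea.
class SeededDataSketch
{
    // Derive a row value purely from (seed, key), so it can be recomputed at validation time.
    static long valueFor(long seed, int key)
    {
        return new Random(seed * 31 + key).nextLong();
    }

    // Stand-in for writes against a cluster: a plain map plays the role of the table.
    static Map<Integer, Long> write(long seed, int rows)
    {
        Map<Integer, Long> table = new HashMap<>();
        for (int key = 0; key < rows; key++)
            table.put(key, valueFor(seed, key));
        return table;
    }

    // Validation: recompute every expected value from the seed and compare with what was read back.
    static boolean validate(long seed, int rows, Map<Integer, Long> table)
    {
        if (table.size() != rows)
            return false;
        for (int key = 0; key < rows; key++)
            if (!Long.valueOf(valueFor(seed, key)).equals(table.get(key)))
                return false;
        return true;
    }

    public static void main(String[] args)
    {
        Map<Integer, Long> table = write(42L, 100);
        System.out.println(validate(42L, 100, table));
        table.put(7, table.get(7) + 1); // simulate a corrupted write
        System.out.println(validate(42L, 100, table));
    }
}
```

The loop here only drives the toy; the point is that a test written this way
needs nothing beyond the seed to check its own data, which is what a
stress-based test with no validation lacks.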
On Wed, Mar 30, 2022, at 6:22 PM, David Capwell wrote:
>
>> Outside of this area is there some other difference in the coverage of the
>> tests. Is serialization fully covered?
>> I would like to be sure that we will not miss anything by using in-jvm
>> dtests instead of python dtests.
>
> So, if you do Cluster.build(num).withConfig(c ->
> c.with(Features.values())).start(), this will run the full Cassandra daemon,
> so the main difference will be with: startup (Instance vs CassandraDaemon),
> and JMX (direct method call rather than JMX).
>
> The Features lets you enable different subsystems, so adding every feature
> will get you in-sync mostly.
>
> Now, if you don’t define any Features, then the following are mocked/disabled
> out: Gossip, Internode, Client (disabled)
>
> Gossip: gossiper isn’t running and the gossip state is defined by the
> Cluster, the state does not change for the life of the test (there are
> utility methods to mutate gossip state). Since most tests do not care about
> gossip state changes, this is fine for the majority of tests; when it is
> not, just enable gossip with Feature.GOSSIP
> NETWORK: messages do not go over network by default, but will be serialized;
> this logic does not match networking though the serializer is the same. If
> you are actually testing internode messaging, enable it with Feature.NETWORK
> NATIVE_PROTOCOL: by default the CQL/thrift protocols are disabled; to enable
> them use Feature.NATIVE_PROTOCOL (by default it acts as if
> -Dcassandra.start_native_transport=false)
>
>
>
>> On Mar 30, 2022, at 1:51 AM, Benjamin Lerer wrote:
>>
>>>
>>>
>>> I think we can get rid of this by extending CassandraDaemon, just need to
>>> add a few hooks to mock out gossip/internode/client (for cases where the
>>> mocks are desired), and when mocks are not desired just run the real logic.
>>>
>>> Too many times I have had to make the 2 more in-line, and this is hard to
>>> maintain… we should fix this and feel this is 100% fixable
>>
>> Thanks for the explanation David. Outside of this area is there some other
>> difference in the coverage of the tests. Is serializ