Jon

> We can probably skip cassandra-stress, since it looks like
> easy-cass-stress can be donated.  That does need a driver upgrade to
> support a vector workload, but imo there's no point in investing more in
> cassandra-stress when we have an alternative with more features available.
> Not a hill I'm going to die on, just an opportunity to do less work.

That sounds great!  I think it's likely that we can manage separate
classpaths for the various tools, so we could update the driver dependency
in fqltool and sstableloader in the meantime and leave cassandra-stress
as is if it is going to be superseded.

JD,

> For the tests, maybe we can have two test class paths for a while?  One
> for driver 3 and one for driver 4?  That way we don’t need to migrate them
> all in a giant big bang patch?  They could be moved over a few at a time
> making review much easier.

I'll explore whether this is possible, as I think it could work and be
more manageable.  I can also evaluate whether it's possible for the two
drivers to co-exist on the same classpath (I think that may be the case,
but I'm not certain).
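
As a first check on co-existence: to my knowledge the 3.x driver lives
under com.datastax.driver.core and the 4.x driver under
com.datastax.oss.driver.api.core, so the class names themselves should not
collide (shared transitive dependencies are a separate question).  A minimal
sketch, with a hypothetical class name DriverClasspathCheck, that reports
which driver a given test classpath exposes:

```java
public class DriverClasspathCheck {
    // Entry-point types of each driver line; the two use different root packages.
    public static final String DRIVER_3_CLASS = "com.datastax.driver.core.Cluster";
    public static final String DRIVER_4_CLASS = "com.datastax.oss.driver.api.core.CqlSession";

    /** Returns true if the named class can be located on the current classpath. */
    public static boolean isPresent(String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers
            Class.forName(className, false, DriverClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("driver 3.x present: " + isPresent(DRIVER_3_CLASS));
        System.out.println("driver 4.x present: " + isPresent(DRIVER_4_CLASS));
    }
}
```

Running this against each candidate test classpath would only show which
driver jars are visible; it says nothing about whether their dependencies
are compatible at runtime.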

Thanks,
Andy


On Wed, Feb 12, 2025 at 6:59 PM J. D. Jordan <jeremiah.jor...@gmail.com>
wrote:

> Sounds like a reasonable plan to me. +1
>
> For the tests, maybe we can have two test class paths for a while?  One
> for driver 3 and one for driver 4?  That way we don’t need to migrate them
> all in a giant big bang patch?  They could be moved over a few at a time
> making review much easier.
>
> On Feb 12, 2025, at 6:35 PM, Jon Haddad <j...@rustyrazorblade.com> wrote:
>
> 
> Hey Andy,
>
> This seems like a reasonable proposal.
>
> We can probably skip cassandra-stress, since it looks like
> easy-cass-stress can be donated.  That does need a driver upgrade to
> support a vector workload, but imo there's no point in investing more in
> cassandra-stress when we have an alternative with more features available.
> Not a hill I'm going to die on, just an opportunity to do less work.
>
> Jon
>
>
>
>
> On Wed, Feb 12, 2025 at 3:06 PM Tolbert, Andy <x...@andrewtolbert.com> wrote:
>
>> Hi All,
>>
>> I'd like to propose decoupling the java driver as a dependency from the
>> core Cassandra server code.
>>
>> I also want to propose a path towards eventually migrating test and tools
>> code from Apache Cassandra java driver 3.x to 4.x when the time is right
>> for the project.
>>
>> Refactoring test code to 4.x is likely to be quite invasive, as I count
>> 128 source files utilizing driver code.  We'd want to find a good time to
>> do this to minimize disruption to ongoing development.
>>
>> Java driver 4.x is effectively a rewrite of the 3.x driver.  Its first
>> release was in March of 2019. While it has similar APIs, it is not binary
>> compatible with the 3.x driver [1].
>>
>> While there hasn't been a clear decision on how the 3.x driver will be
>> supported going forward (although we should consider discussing this!), we
>> expect and have seen active development take place almost exclusively
>> on the 4.x driver.
>>
>> It would be useful to migrate to the 4.x driver to test new and future
>> features that the 4.x driver will actively support.  For example, the 4.x
>> driver supports Vector types, where the 3.x driver does not.
>>
>> I've gone through the codebase and identified the following uses of the
>> driver:
>>
>> 0. Core code that uses the driver
>>
>> * UntypedResultSet uses CodecUtils.fromUnsignedToSignedInt from the driver,
>>   which simply adds Integer.MIN_VALUE to an int, so it can easily be
>>   removed.
>> * PreparedStatementHelper is used only by dtest fuzz tests to validate
>>   Prepared Statements.  Can be moved to test code.
>> * ThreadAwareSecurityManager.checkPermission makes reference to skipping
>>   checking accessDeclaredMembers due to use of CodecUtils; that can
>>   probably be removed once the CodecUtils use is removed.
>> * sstableloader uses the driver to fetch schema and metadata
>>
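
To make the first bullet concrete, here is a minimal sketch of a local
replacement for the CodecUtils call, based only on the description above
(adding Integer.MIN_VALUE to an int).  The class name UnsignedIntShift is
hypothetical, and where such a helper would live in the server code is an
open question:

```java
public class UnsignedIntShift {
    /**
     * Local stand-in for the driver helper: shifts an int by Integer.MIN_VALUE.
     * Relies on int arithmetic wrapping around on overflow.
     */
    public static int fromUnsignedToSignedInt(int unsigned) {
        return unsigned + Integer.MIN_VALUE;
    }

    public static void main(String[] args) {
        System.out.println(fromUnsignedToSignedInt(0));  // prints -2147483648
        System.out.println(fromUnsignedToSignedInt(-1)); // prints 2147483647
    }
}
```
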
>> 1. Tools that use the driver
>>
>> * fqltool replay (replaying queries from captured logs)
>> * cassandra-stress (making queries to generate load)
>>
>> 2. Test code
>>
>> * Understandably, quite a bit of test code uses the driver. This is where
>>   I anticipate the most work would be needed.
>>
>> I'd like to propose doing the following:
>>
>> Can be done now:
>>
>> * Move sstableloader source into its own tools directory, much like fqltool
>>   and cassandra-stress.  For compatibility, we could retain the existing
>>   shell script entry point (bin/sstableloader).
>> * Update remaining core code to remove all use of the driver.  As shown
>>   above, there is not much to change here and this should be relatively
>>   easy to accomplish.
>> * Update the build and scripts to establish separate classpaths for the
>>   server and the respective tools.  We would exclude the driver and its
>>   dependencies (that aren't required otherwise) from the server.  The
>>   driver would still be included in the built package, so this wouldn't
>>   reduce the size of the binary, but it would remove the driver from the
>>   server's classpath, which would de-risk upgrading the driver and having
>>   it or its dependencies cause possible runtime issues.
>>
>> To be done next:
>>
>> * Refactor sstableloader, fqltool and cassandra-stress to use the 4.x
>>   driver.
>>
>> To be done when the timing works for the project:
>>
>> * Refactor tests to use the 4.x driver.
>>
>> Hopefully this proposed approach makes sense. I'd be eager to hear any
>> feedback or suggestions!
>>
>> Thanks,
>> Andy
>>
>> [1]:
>> https://docs.datastax.com/en/developer/java-driver/4.17/upgrade_guide/index.html#4-0-0
>>
>
