Jon,

> We can probably skip cassandra-stress, since it looks like
> easy-cass-stress can be donated. That does need a driver upgrade to
> support a vector workload, but imo there's no point in investing more in
> cassandra-stress when we have an alternative with more features available.
> Not a hill I'm going to die on, just an opportunity to do less work.

That sounds great! I think it's likely that we can manage separate
classpaths for the various tools, so we could update the driver dependency
in fqltool and cassandra-loader in the meantime and leave cassandra-stress
as is if it is going to be superseded.

JD,

> For the tests, maybe we can have two test class paths for a while? One
> for driver 3 and one for driver 4? That way we don’t need to migrate them
> all in a giant big bang patch? They could be moved over a few at a time
> making review much easier.

I'll explore if this is possible as I think that could potentially work and
be more manageable. I can also evaluate whether it's possible for the two
drivers to co-exist on the same classpath (I think that may be the case,
but I'm not certain).

Thanks,
Andy

On Wed, Feb 12, 2025 at 6:59 PM J. D. Jordan <jeremiah.jor...@gmail.com> wrote:

> Sounds like a reasonable plan to me. +1
>
> For the tests, maybe we can have two test class paths for a while? One
> for driver 3 and one for driver 4? That way we don’t need to migrate them
> all in a giant big bang patch? They could be moved over a few at a time
> making review much easier.
>
> On Feb 12, 2025, at 6:35 PM, Jon Haddad <j...@rustyrazorblade.com> wrote:
>
> Hey Andy,
>
> This seems like a reasonable proposal.
>
> We can probably skip cassandra-stress, since it looks like
> easy-cass-stress can be donated. That does need a driver upgrade to
> support a vector workload, but imo there's no point in investing more in
> cassandra-stress when we have an alternative with more features available.
> Not a hill I'm going to die on, just an opportunity to do less work.
>
> Jon
>
> On Wed, Feb 12, 2025 at 3:06 PM Tolbert, Andy <x...@andrewtolbert.com> wrote:
>
>> Hi All,
>>
>> I'd like to propose decoupling the java driver as a dependency from the
>> core Cassandra server code.
>>
>> I also want to propose a path towards eventually migrating test and tools
>> code from Apache Cassandra java driver 3.x to 4.x when the time is right
>> for the project.
>>
>> Refactoring test code to 4.x is likely to be quite invasive, as I count
>> 128 source files utilizing driver code. We'd want to find a good time to
>> do this to minimize disruption to ongoing development.
>>
>> Java driver 4.x is effectively a rewrite of the 3.x driver. Its first
>> release was in March of 2019. While it has similar APIs, it is not binary
>> compatible with the 3.x driver [1].
>>
>> While there hasn't been a clear decision on how the 3.x driver will be
>> supported going forward (although we should consider discussing this!), we
>> expect and have seen active development take place almost exclusively
>> on the 4.x driver.
>>
>> It would be useful to migrate to the 4.x driver to test new and future
>> features that the 4.x driver will actively support. For example, the 4.x
>> driver supports Vector types, where the 3.x driver does not.
>>
>> I've gone through the codebase and identified the following uses of the
>> driver:
>>
>> 0. Core code that uses the driver
>>
>> * UntypedResultSet uses CodecUtils.fromUnsignedToSignedInt from the driver,
>>   which just adds Integer.MIN_VALUE to an int, so it can easily be
>>   removed.
>> * PreparedStatementHelper is used only by dtest fuzz tests to validate
>>   Prepared Statements. Can be moved to test code.
>> * ThreadAwareSecurityManager.checkPermission makes reference to skipping
>>   checking accessDeclaredMembers due to use of CodecUtils, can probably
>>   remove that with its use removed.
>> * sstableloader uses the driver to fetch schema and metadata
>>
>> 1. Tools that use the driver
>>
>> * fqltool replay (replaying queries from captured logs)
>> * cassandra-stress (making queries to generate load)
>>
>> 2. Test code
>>
>> * Understandably, quite a bit of test code uses the driver.
>>   This is where I anticipate the most work would be needed.
>>
>> I'd like to propose doing the following:
>>
>> Can be done now:
>>
>> * Move sstableloader source into its own tools directory, much like
>>   fqltool and cassandra-stress. For compatibility, we could retain the
>>   existing shell script entry point (bin/sstableloader).
>> * Update remaining core code to remove all use of the driver. As shown
>>   above, there is not much to change here and this should be relatively
>>   easy to accomplish.
>> * Update the build and scripts to establish separate classpaths for the
>>   server and the respective tools. We would exclude the driver and its
>>   dependencies (that aren't required otherwise) from the server. The
>>   driver would still be included in the built package, so this wouldn't
>>   reduce the size of the binary, but it would remove the driver from the
>>   server's classpath, which would de-risk upgrading the driver and having
>>   it or its dependencies cause possible runtime issues.
>>
>> To be done next:
>>
>> * Refactor sstableloader, fqltool and cassandra-stress to use the 4.x
>>   driver.
>>
>> To be done when the timing works for the project:
>>
>> * Refactor tests to use the 4.x driver.
>>
>> Hopefully this proposed approach makes sense; I'd be eager to hear any
>> feedback or suggestions!
>>
>> Thanks,
>> Andy
>>
>> [1]:
>> https://docs.datastax.com/en/developer/java-driver/4.17/upgrade_guide/index.html#4-0-0
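
To make the first core-code item in the proposal concrete: the quoted message describes CodecUtils.fromUnsignedToSignedInt as just adding Integer.MIN_VALUE to an int. A local replacement could therefore look roughly like the sketch below. This is an illustration only, not code from the Cassandra tree: the class name UnsignedIntShift is hypothetical, and where such a helper would actually live (or whether it would simply be inlined into UntypedResultSet) is an assumption.

```java
// Hypothetical stand-in for the driver helper mentioned in the proposal.
// The class name is an assumption made for illustration; only the
// arithmetic (adding Integer.MIN_VALUE) comes from the thread.
final class UnsignedIntShift
{
    private UnsignedIntShift() {}

    // Adding Integer.MIN_VALUE flips the sign bit, so the unsigned range
    // 0 .. 2^32-1 (carried in an int's 32 bits) maps onto the signed range
    // Integer.MIN_VALUE .. Integer.MAX_VALUE while preserving order.
    // The addition relies on Java's defined wrap-around int overflow.
    static int fromUnsignedToSignedInt(int unsigned)
    {
        return unsigned + Integer.MIN_VALUE;
    }

    public static void main(String[] args)
    {
        // Unsigned 0 becomes the smallest signed int.
        System.out.println(fromUnsignedToSignedInt(0));          // -2147483648
        // Unsigned 2^32-1 (bit pattern 0xFFFFFFFF) becomes the largest.
        System.out.println(fromUnsignedToSignedInt(0xFFFFFFFF)); // 2147483647
    }
}
```

With something like this in place, the server code would no longer need the driver class for this conversion, which is the point of that bullet.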