lukasz-antoniak commented on code in PR #102:
URL: https://github.com/apache/cassandra-analytics/pull/102#discussion_r1993301580


##########
cassandra-four-zero-types/build.gradle:
##########
@@ -33,6 +33,7 @@ dependencies {
     compileOnly project(":cassandra-analytics-common")
     compileOnly(project(path: ':cassandra-four-zero', configuration: 'shadow'))
     compileOnly "com.esotericsoftware:kryo-shaded:${kryoVersion}"
+    compileOnly(group: "${sparkGroupId}", name: "spark-core_${scalaMajorVersion}", version: "${project.rootProject.sparkVersion}")

Review Comment:
   FWIW, I cannot easily move the logic of `Duration` to `SparkDuration`, because the Cassandra serializer expects `org.apache.cassandra.cql3.Duration`. To do that, I would need to add a Cassandra dependency to the `cassandra-analytics-spark-converter` module. Is that OK?
   
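   I am not sure whether `toSparkSqlType()` should return `org.apache.cassandra.cql3.Duration`, since that is a C* type rather than a Spark type. I may have misunderstood your suggestion. Returning it, as in the snippet below, fails with the `ClassCastException` that follows: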
   ```
   @Override
   public Object toSparkSqlType(@NotNull Object value, boolean isFrozen)
   {
       CalendarInterval cl = (CalendarInterval) value;
       return Duration.newInstance(cl.months, cl.days, cl.microseconds * 1000);
   }
   ```
   ```
   Caused by: java.lang.ClassCastException: class org.apache.cassandra.cql3.Duration cannot be cast to class org.apache.spark.unsafe.types.CalendarInterval (org.apache.cassandra.cql3.Duration and org.apache.spark.unsafe.types.CalendarInterval are in unnamed module of loader 'app')
        at org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow.getInterval(rows.scala:49)
        at org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow.getInterval$(rows.scala:49)
        at org.apache.spark.sql.catalyst.expressions.GenericInternalRow.getInterval(rows.scala:195)
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
        at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
   ```
   
   I have tried to exclude the Spark dependency from the `cassandra-four-zero-types` module in various ways. The best I could come up with was to introduce a POJO, `CqlDuration`, that maps internally to `CalendarInterval`. See commit: https://github.com/apache/cassandra-analytics/pull/102/commits/df40c73b4e73728d8455609526314f01e845df13.
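
   For illustration, a Spark-free POJO along these lines might look roughly like the sketch below. This is not the code from the commit: the class body, the field layout (months, days, nanoseconds), and the `fromIntervalFields` / `toMicroseconds` names are assumptions made for the example. The idea is that the types module only carries plain values, and the Spark-side module copies them into and out of `CalendarInterval`, so neither side needs the other's dependency.

```java
import java.util.Objects;

// Hypothetical sketch of a dependency-free duration holder.
// It references neither Spark nor Cassandra classes.
final class CqlDuration
{
    private static final long NANOS_PER_MICRO = 1_000L;

    private final int months;
    private final int days;
    private final long nanoseconds;

    CqlDuration(int months, int days, long nanoseconds)
    {
        this.months = months;
        this.days = days;
        this.nanoseconds = nanoseconds;
    }

    // Builds a duration from the three fields a Spark CalendarInterval exposes
    // (months, days, microseconds). Throws ArithmeticException on overflow.
    static CqlDuration fromIntervalFields(int months, int days, long microseconds)
    {
        return new CqlDuration(months, days, Math.multiplyExact(microseconds, NANOS_PER_MICRO));
    }

    int getMonths() { return months; }

    int getDays() { return days; }

    long getNanoseconds() { return nanoseconds; }

    // Microsecond view for the Spark side; sub-microsecond precision is
    // dropped (integer division truncates toward zero).
    long toMicroseconds() { return nanoseconds / NANOS_PER_MICRO; }

    @Override
    public boolean equals(Object o)
    {
        if (this == o) return true;
        if (!(o instanceof CqlDuration)) return false;
        CqlDuration other = (CqlDuration) o;
        return months == other.months && days == other.days && nanoseconds == other.nanoseconds;
    }

    @Override
    public int hashCode()
    {
        return Objects.hash(months, days, nanoseconds);
    }
}
```

   Under these assumptions, the Spark converter would call something like `CqlDuration.fromIntervalFields(cl.months, cl.days, cl.microseconds)` on the write path, and build a `CalendarInterval` from `getMonths()`, `getDays()` and `toMicroseconds()` on the read path.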



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

