Hi,

Is there any reason that the Tuple class <https://lucene.apache.org/solr/6_4_2/solr-solrj/org/apache/solr/client/solrj/io/Tuple.html> does not implement Serializable, the way SolrDocumentBase <https://lucene.apache.org/solr/6_4_2/solr-solrj/org/apache/solr/common/SolrDocumentBase.html> does?

In the spark-solr <https://github.com/LucidWorks/spark-solr> library, I want to return an RDD of Tuple objects, but it fails because the Tuple class does not implement Serializable:

2017-03-22 01:45:51,230 [Executor task launch worker-0] ERROR Executor - Exception in task 0.0 in stage 0.0 (TID 0)
java.io.NotSerializableException: org.apache.solr.client.solrj.io.Tuple
Serialization stack:
    - object not serializable (class: org.apache.solr.client.solrj.io.Tuple, value: org.apache.solr.client.solrj.io.Tuple@365e4da1)
    - element of array (index: 0)
    - array (class [Lorg.apache.solr.client.solrj.io.Tuple;, size 10)
    at org.apache.spark.serializer.SerializationDebugger$.improveException(SerializationDebugger.scala:40)
    at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:46)
    at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:100)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:324)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

To get past this error, the Tuple class would need to implement Serializable. Is there a reason not to do that?

For now we are working around the error by converting Tuple objects to other objects, but it would be ideal (in terms of performance) if we could deal with Tuple objects directly on the Spark side.

Thanks,
--
Kiran Chitturi
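
P.S. For anyone hitting the same error, here is a minimal sketch of the kind of conversion workaround mentioned above. It is not the actual spark-solr code. To keep it runnable without SolrJ on the classpath, PlainTuple is a hypothetical stand-in for org.apache.solr.client.solrj.io.Tuple (a bag of fields that does not implement Serializable), and toSerializableMap, canSerialize and roundTrip are illustrative names only. It also assumes that every field value is itself Serializable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.HashMap;
import java.util.Map;

class TupleSerializationSketch {

    // Hypothetical stand-in for the SolrJ Tuple class: it holds a map of
    // fields but does NOT implement java.io.Serializable.
    static class PlainTuple {
        final Map<String, Object> fields = new HashMap<>();
    }

    // Workaround: copy the fields into a HashMap, which Java serialization
    // accepts as long as every key and value is itself Serializable.
    static HashMap<String, Object> toSerializableMap(PlainTuple tuple) {
        return new HashMap<>(tuple.fields);
    }

    // Returns true if Java serialization accepts the object, false if it
    // raises NotSerializableException (the error from the stack trace).
    static boolean canSerialize(Object value) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(value);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Serializes the map to bytes and reads it back, roughly what happens
    // when a task result is shipped between JVMs with the Java serializer.
    @SuppressWarnings("unchecked")
    static HashMap<String, Object> roundTrip(HashMap<String, Object> map) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(map);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (HashMap<String, Object>) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        PlainTuple tuple = new PlainTuple();
        tuple.fields.put("id", "doc1");
        tuple.fields.put("score", 1.5);

        System.out.println("tuple serializable: " + canSerialize(tuple));
        HashMap<String, Object> copy = toSerializableMap(tuple);
        System.out.println("map serializable:   " + canSerialize(copy));
        System.out.println("after round trip:   " + roundTrip(copy));
    }
}
```

The cost of this approach is one extra map copy per tuple, which is the overhead we would like to avoid by having Tuple implement Serializable directly.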