I am using the ArangoDB Datasource for Apache Spark connector to read a
collection from ArangoDB and write it to GCP BigQuery.
This is the code:
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType

spark = SparkSession.builder \
    .appName("ArangoDBtoBq") \
    .master("local[*]") \
    .config("spark.jars.packages",
            'com.arangodb:arangodb-spark-datasource-3.4_2.12:1.5.0,'
            'com.google.cloud.spark:spark-bigquery-with-dependencies_2.12:0.31.1') \
    .config('parentProject', 'abc') \
    .getOrCreate()

# schema for the read below (placeholder field; real fields omitted here)
schema = StructType([
    StructField("field", StringType())
])

thisdict = {
    "endpoints": "localhost:8529",
    "password": "rootpassword",
    "database": "database",
    "table": "table",
}

df = spark.read.format("com.arangodb.spark").options(**thisdict) \
    .schema(schema).load()

df.write.format('bigquery').mode("append") \
    .option('table', 'destination') \
    .option("project", "project") \
    .option("dataset", "dataset") \
    .option("writeMethod", "direct") \
    .option('credentialsFile', 'credential') \
    .save()
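One thing I have tried: error 1600 ("cursor not found") suggests the server-side AQL cursor expired or was dropped before Spark finished (or closed) it. Below is a sketch of the same read with the cursor time-to-live raised; the `ttl` and `batchSize` read options are assumptions based on the ArangoDB Spark datasource docs (verify against your connector version), and `arango_opts` is just a local name for illustration:

```python
# Hypothetical sketch: same ArangoDB read options as above, plus a longer
# cursor TTL. "ttl" (seconds the server keeps an idle cursor alive) and
# "batchSize" are assumed options of arangodb-spark-datasource; check the
# docs for your connector version.
arango_opts = {
    "endpoints": "localhost:8529",
    "password": "rootpassword",
    "database": "database",
    "table": "table",
    "ttl": "300",         # keep the server-side cursor alive for 5 minutes
    "batchSize": "1000",  # smaller batches touch the cursor more often
}

# df = spark.read.format("com.arangodb.spark").options(**arango_opts) \
#     .schema(schema).load()
```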
But I am getting this error:
com.arangodb.ArangoDBException: Response: 404, Error: 1600 - cursor not found
    at com.arangodb.internal.util.ResponseUtils.checkError(ResponseUtils.java:53)
    at com.arangodb.http.HttpCommunication.execute(HttpCommunication.java:86)
    at com.arangodb.http.HttpCommunication.execute(HttpCommunication.java:66)
    at com.arangodb.http.HttpProtocol.execute(HttpProtocol.java:44)
    at com.arangodb.internal.ArangoExecutorSync.execute(ArangoExecutorSync.java:60)
    at com.arangodb.internal.ArangoExecutorSync.execute(ArangoExecutorSync.java:48)
    at com.arangodb.internal.ArangoDatabaseImpl$1.close(ArangoDatabaseImpl.java:219)
    at com.arangodb.internal.cursor.ArangoCursorImpl.close(ArangoCursorImpl.java:69)
    at org.apache.spark.sql.arangodb.datasource.reader.ArangoCollectionPartitionReader.close(ArangoCollectionPartitionReader.scala:57)
    at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.$anonfun$advanceToNextIter$1(DataSourceRDD.scala:94)
    at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.$anonfun$advanceToNextIter$1$adapted(DataSourceRDD.scala:89)
    at org.apache.spark.TaskContext$$anon$1.onTaskCompletion(TaskContext.scala:132)
    at org.apache.spark.TaskContextImpl.$anonfun$invokeTaskCompletionListeners$1(TaskContextImpl.scala:144)
    at org.apache.spark.TaskContextImpl.$anonfun$invokeTaskCompletionListeners$1$adapted(TaskContextImpl.scala:144)
    at org.apache.spark.TaskContextImpl.invokeListeners(TaskContextImpl.scala:199)
How can this be solved?
--
You received this message because you are subscribed to the Google Groups
"ArangoDB" group.
To view this discussion on the web visit
https://groups.google.com/d/msgid/arangodb/56decf6f-f223-495d-981d-41f1e7f3feb3n%40googlegroups.com.