Repository: spark
Updated Branches:
  refs/heads/branch-2.0 2e3ead20c -> e11046457


[SPARK-15649][SQL] Avoid to serialize MetastoreRelation in HiveTableScanExec

## What changes were proposed in this pull request?
In HiveTableScanExec, `schema` is a lazy val derived from `relation.attributeMap`, so referencing it inside the task closure forces the MetastoreRelation to be serialized along with the task binary bytes. Copying the schema into a local variable before building the closure avoids serializing the MetastoreRelation.

## How was this patch tested?

Author: Lianhui Wang <[email protected]>

Closes #13397 from lianhuiwang/avoid-serialize.

(cherry picked from commit 2bfc4f15214a870b3e067f06f37eb506b0070a1f)
Signed-off-by: Reynold Xin <[email protected]>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/e1104645
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/e1104645
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/e1104645

Branch: refs/heads/branch-2.0
Commit: e110464571554942bc261ab93ee9e6503bb12516
Parents: 2e3ead2
Author: Lianhui Wang <[email protected]>
Authored: Tue May 31 09:21:51 2016 -0700
Committer: Reynold Xin <[email protected]>
Committed: Tue May 31 09:21:56 2016 -0700

----------------------------------------------------------------------
 .../org/apache/spark/sql/hive/execution/HiveTableScanExec.scala  | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/e1104645/sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/HiveTableScanExec.scala
----------------------------------------------------------------------
diff --git a/sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/HiveTableScanExec.scala b/sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/HiveTableScanExec.scala
index e29864f..cc3e74b 100644
--- a/sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/HiveTableScanExec.scala
+++ b/sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/HiveTableScanExec.scala
@@ -152,8 +152,10 @@ case class HiveTableScanExec(
       }
     }
     val numOutputRows = longMetric("numOutputRows")
+    // Avoid to serialize MetastoreRelation because schema is lazy. (see SPARK-15649)
+    val outputSchema = schema
     rdd.mapPartitionsInternal { iter =>
-      val proj = UnsafeProjection.create(schema)
+      val proj = UnsafeProjection.create(outputSchema)
       iter.map { r =>
         numOutputRows += 1
         proj(r)
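The fix works by copying the lazy `schema` into a local `outputSchema` before the closure is created, so the closure captures only that value rather than `this` (the operator, and through it the MetastoreRelation). A minimal sketch of the pattern in plain Scala, using hypothetical stand-ins `NotSerializableMeta` for MetastoreRelation and `Scan` for HiveTableScanExec:

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Stand-in for MetastoreRelation: not java.io.Serializable.
class NotSerializableMeta {
  val schema: String = "col1,col2"
}

// Stand-in for HiveTableScanExec (also not Serializable).
class Scan(meta: NotSerializableMeta) {
  lazy val schema: String = meta.schema

  // Referencing the lazy val inside the closure captures `this`,
  // dragging the non-serializable meta along with it.
  def badClosure: () => String = () => schema

  // Copying into a local val first means the closure captures
  // only the String, not the enclosing Scan.
  def goodClosure: () => String = {
    val outputSchema = schema
    () => outputSchema
  }
}

// Returns true iff `obj` survives Java serialization, which is
// roughly the check Spark performs on task closures.
def serializable(obj: AnyRef): Boolean =
  try {
    new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(obj)
    true
  } catch {
    case _: NotSerializableException => false
  }
```

With Scala 2.12+ lambdas (which are serializable by default, as Spark relies on), `serializable(scan.badClosure)` fails because the captured `Scan` is not serializable, while `serializable(scan.goodClosure)` succeeds because only the captured String needs to be written.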

