Repository: spark
Updated Branches:
  refs/heads/master d2dc8c4a1 -> 126baa8d3

[SPARK-17559][MLLIB] persist edges if their storage level is none in PeriodicGraphCheckpointer

## What changes were proposed in this pull request?

When PeriodicGraphCheckpointer is used to persist a graph, the edges are sometimes not persisted. Currently the graph is persisted only when the vertices' storage level is NONE. However, the vertices' storage level can be something other than NONE while the edges' storage level is NONE. For example, in a graph created by an outerJoinVertices operation, the vertices are automatically cached while the edges are not. In that case the edges are not persisted when PeriodicGraphCheckpointer performs the persist. This patch checks the edges' storage level separately and persists the edges if it is NONE.

## How was this patch tested?

manual tests

Author: ding <[email protected]>

Closes #15124 from dding3/spark-persisitEdge.

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/126baa8d
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/126baa8d
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/126baa8d

Branch: refs/heads/master
Commit: 126baa8d32bc0e7bf8b43f9efa84f2728f02347d
Parents: d2dc8c4
Author: ding <[email protected]>
Authored: Tue Oct 4 00:00:10 2016 -0700
Committer: Joseph K. Bradley <[email protected]>
Committed: Tue Oct 4 00:00:10 2016 -0700

----------------------------------------------------------------------
 .../org/apache/spark/mllib/impl/PeriodicGraphCheckpointer.scala | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/126baa8d/mllib/src/main/scala/org/apache/spark/mllib/impl/PeriodicGraphCheckpointer.scala
----------------------------------------------------------------------
diff --git a/mllib/src/main/scala/org/apache/spark/mllib/impl/PeriodicGraphCheckpointer.scala b/mllib/src/main/scala/org/apache/spark/mllib/impl/PeriodicGraphCheckpointer.scala
index 20db608..8007489 100644
--- a/mllib/src/main/scala/org/apache/spark/mllib/impl/PeriodicGraphCheckpointer.scala
+++ b/mllib/src/main/scala/org/apache/spark/mllib/impl/PeriodicGraphCheckpointer.scala
@@ -87,7 +87,10 @@ private[mllib] class PeriodicGraphCheckpointer[VD, ED](

   override protected def persist(data: Graph[VD, ED]): Unit = {
     if (data.vertices.getStorageLevel == StorageLevel.NONE) {
-      data.persist()
+      data.vertices.persist()
+    }
+    if (data.edges.getStorageLevel == StorageLevel.NONE) {
+      data.edges.persist()
     }
   }

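
The scenario in the description can be reproduced outside the checkpointer with a small local program. The following is only a sketch under stated assumptions: the object name `PersistEdgesSketch`, the helper `persistBoth`, the toy edge list, and the `local[2]` master are all hypothetical and not part of the patch. The helper mirrors the two independent checks from the diff, and the program prints the storage levels of a graph produced by `outerJoinVertices` before and after calling it, so the levels can be inspected on a given Spark version.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph}
import org.apache.spark.storage.StorageLevel

// Hypothetical sketch; names below are illustrative, not from the patch.
object PersistEdgesSketch {

  // Mirrors the patched persist(): vertices and edges are checked
  // independently, and each is persisted only if its level is NONE.
  def persistBoth[VD, ED](g: Graph[VD, ED]): Unit = {
    if (g.vertices.getStorageLevel == StorageLevel.NONE) {
      g.vertices.persist()
    }
    if (g.edges.getStorageLevel == StorageLevel.NONE) {
      g.edges.persist()
    }
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("persist-edges-sketch")
    val sc = new SparkContext(conf)
    try {
      // Toy graph: every vertex attribute defaults to 0.
      val edges = sc.parallelize(Seq(Edge(1L, 2L, 1.0), Edge(2L, 3L, 1.0)))
      val base = Graph.fromEdges(edges, 0)

      // Derive a new graph whose vertex attribute is the vertex degree.
      val joined = base.outerJoinVertices(base.degrees) { (_, _, deg) => deg.getOrElse(0) }

      println(s"before: vertices=${joined.vertices.getStorageLevel}, " +
        s"edges=${joined.edges.getStorageLevel}")
      persistBoth(joined)
      println(s"after:  vertices=${joined.vertices.getStorageLevel}, " +
        s"edges=${joined.edges.getStorageLevel}")
    } finally {
      sc.stop()
    }
  }
}
```

Checking each RDD separately means a graph whose vertices are already cached still gets its edges persisted, which is the case the old single check on the vertices missed.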
