Repository: spark
Updated Branches:
  refs/heads/master 7f13434a5 -> b943f5d90


[SPARK-4600][GraphX]: org.apache.spark.graphx.VertexRDD.diff does not work

Per the [discussion on the 
JIRA](https://issues.apache.org/jira/browse/SPARK-4600), `diff` is behaving 
exactly as it should. I had mistakenly assumed it computed a set difference, 
which it does not. This change therefore only updates the `diff` 
documentation so that it better reflects the intended behavior.

Author: Brennon York <[email protected]>

Closes #5015 from brennonyork/SPARK-4600 and squashes the following commits:

1e1d1e5 [Brennon York] reverted internal diff docs
92288f7 [Brennon York] reverted both the test suite and the diff function back 
to their original functionality
f428623 [Brennon York] updated diff documentation to better represent its 
function
cc16d65 [Brennon York] Merge remote-tracking branch 'upstream/master' into 
SPARK-4600
66818b9 [Brennon York] added small secondary diff test
99ad412 [Brennon York] Merge remote-tracking branch 'upstream/master' into 
SPARK-4600
74b8c95 [Brennon York] corrected  method by leveraging bitmask operations to 
correctly return only the portions of  that are different from the calling 
VertexRDD
9717120 [Brennon York] updated diff impl to cause fewer objects to be created
710a21c [Brennon York] working diff given test case
aa57f83 [Brennon York] updated to set ShortestPaths to run 'forward' rather 
than 'backward'


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/b943f5d9
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/b943f5d9
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/b943f5d9

Branch: refs/heads/master
Commit: b943f5d907df0607ecffb729f2bccfa436438d7e
Parents: 7f13434
Author: Brennon York <[email protected]>
Authored: Fri Mar 13 18:48:31 2015 +0000
Committer: Sean Owen <[email protected]>
Committed: Fri Mar 13 18:48:31 2015 +0000

----------------------------------------------------------------------
 graphx/src/main/scala/org/apache/spark/graphx/VertexRDD.scala | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/b943f5d9/graphx/src/main/scala/org/apache/spark/graphx/VertexRDD.scala
----------------------------------------------------------------------
diff --git a/graphx/src/main/scala/org/apache/spark/graphx/VertexRDD.scala b/graphx/src/main/scala/org/apache/spark/graphx/VertexRDD.scala
index 09ae3f9..40ecff7 100644
--- a/graphx/src/main/scala/org/apache/spark/graphx/VertexRDD.scala
+++ b/graphx/src/main/scala/org/apache/spark/graphx/VertexRDD.scala
@@ -122,8 +122,11 @@ abstract class VertexRDD[VD](
   def mapValues[VD2: ClassTag](f: (VertexId, VD) => VD2): VertexRDD[VD2]
 
   /**
-   * Hides vertices that are the same between `this` and `other`; for vertices that are different,
-   * keeps the values from `other`.
+   * For each vertex present in both `this` and `other`, `diff` returns only those vertices with
+   * differing values; for values that are different, keeps the values from `other`. This is
+   * only guaranteed to work if the VertexRDDs share a common ancestor.
+   *
+   * @param other the other VertexRDD with which to diff against.
    */
   def diff(other: VertexRDD[VD]): VertexRDD[VD]
 
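The semantics described in the updated Scaladoc can be illustrated with a small sketch. It is not part of the commit: the object name, the local master setting, and the sample data are assumptions chosen for illustration, and it expects Spark core and GraphX on the classpath. Both vertex sets are derived from the same base `VertexRDD`, so the "common ancestor" condition holds.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.VertexRDD

// Hypothetical driver object; the name and sample data are illustrative only.
object VertexRddDiffSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("vertex-rdd-diff-sketch")
    val sc = new SparkContext(conf)
    try {
      // Base vertex set: ids 0..5, each carrying its own id as an Int value.
      val base: VertexRDD[Int] =
        VertexRDD(sc.parallelize((0L to 5L).map(id => (id, id.toInt)))).cache()

      // Derived from `base`, so both sets share a common ancestor.
      // Only vertices 2 and 4 receive a new value.
      val changed: VertexRDD[Int] =
        base.mapValues(v => if (v == 2 || v == 4) v * 10 else v).cache()

      // Vertices present in both sets whose values differ, with values from `changed`.
      assert(base.diff(changed).collect().toSet == Set((2L, 20), (4L, 40)))

      // Swapping receiver and argument yields the same ids with values from `base`.
      assert(changed.diff(base).collect().toSet == Set((2L, 2), (4L, 4)))

      // Not a set difference: vertex 4 exists in `base` but is hidden in `subset`,
      // and it is left out of the result instead of being reported.
      val subset: VertexRDD[Int] = changed.filter { case (id, _) => id != 4L }
      assert(base.diff(subset).collect().toSet == Set((2L, 20)))
    } finally {
      sc.stop()
    }
  }
}
```

The last assertion shows the case behind the original JIRA report: a vertex that exists only on the calling side is dropped, where a set difference would have returned it.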

