spark git commit: [SPARK-18845][GRAPHX] PageRank has incorrect initialization value that leads to slow convergence

2016-12-15 Thread ankurdave
Repository: spark Updated Branches: refs/heads/master 172a52f5d -> 78062b852 [SPARK-18845][GRAPHX] PageRank has incorrect initialization value that leads to slow convergence ## What changes were proposed in this pull request? Change the initial value in all PageRank implementations to be `1.

spark git commit: [SPARK-9436] [GRAPHX] Pregel simplification patch

2015-07-29 Thread ankurdave
OrElse(old) } ``` This can be simplified with one join. ankurdave proposed a patch based on our discussion in the mailing list: https://www.mail-archive.com/dev@spark.apache.org/msg10316.html Author: Alexander Ulanov Closes #7749 from avulanov/SPARK-9436-pregel and squashes the following c

spark git commit: [SPARK-9109] [GRAPHX] Keep the cached edge in the graph

2015-07-17 Thread ankurdave
Repository: spark Updated Branches: refs/heads/branch-1.4 bb1401507 -> f34f3d71f [SPARK-9109] [GRAPHX] Keep the cached edge in the graph The change here is to keep the cached RDDs in the graph object so that these RDDs are correctly unpersisted when graph.unpersist() is called. ```java i

spark git commit: [SPARK-9109] [GRAPHX] Keep the cached edge in the graph

2015-07-17 Thread ankurdave
Repository: spark Updated Branches: refs/heads/master eba6a1af4 -> 587c315b2 [SPARK-9109] [GRAPHX] Keep the cached edge in the graph The change here is to keep the cached RDDs in the graph object so that these RDDs are correctly unpersisted when graph.unpersist() is called. ```java impor

spark git commit: [SPARK-8718] [GRAPHX] Improve EdgePartition2D for non perfect square number of partitions

2015-07-14 Thread ankurdave
Repository: spark Updated Branches: refs/heads/master d267c2834 -> 0a4071eab [SPARK-8718] [GRAPHX] Improve EdgePartition2D for non perfect square number of partitions See https://github.com/aray/e2d/blob/master/EdgePartition2D.ipynb Author: Andrew Ray Closes #7104 from aray/edge-partition-

spark git commit: [SPARK-6736][GraphX][Doc]Example of Graph#aggregateMessages has error

2015-04-07 Thread ankurdave
Repository: spark Updated Branches: refs/heads/master 6f0d55d76 -> ae980eb41 [SPARK-6736][GraphX][Doc]Example of Graph#aggregateMessages has error The example of Graph#aggregateMessages has an error. Since aggregateMessages is a method of Graph, it should be written "rawGraph.aggregateMessages" Aut

spark git commit: [SPARK-6510][GraphX]: Add Graph#minus method to act as Set#difference

2015-03-26 Thread ankurdave
Repository: spark Updated Branches: refs/heads/master aad003227 -> 39fb57968 [SPARK-6510][GraphX]: Add Graph#minus method to act as Set#difference Adds a `Graph#minus` method which will return only unique `VertexId`'s from the calling `VertexRDD`. To demonstrate a basic example with pseudoco
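
The set-difference behaviour described in this entry can be illustrated on plain collections. This is only a minimal sketch, assuming vertices are keyed by a `Long` vertex id; the object `VertexMinus` and its `minus` helper are hypothetical names for illustration, operate on in-memory maps rather than RDDs, and are not the method added by the commit.

```scala
object VertexMinus {
  // Keep only the entries of `left` whose vertex id does not appear in `right`.
  // Vertex attributes in `right` are ignored; only the ids are compared.
  def minus[A](left: Map[Long, A], right: Map[Long, A]): Map[Long, A] =
    left.filter { case (id, _) => !right.contains(id) }

  def main(args: Array[String]): Unit = {
    val a = Map(0L -> "a", 1L -> "b")
    val b = Map(1L -> "x", 2L -> "c")
    println(minus(a, b)) // only vertex 0 survives
  }
}
```

Comparing ids alone mirrors `Set#difference` on the key set; whether the real method also compares attributes is not visible in the truncated snippet.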

spark git commit: [SPARK-5922][GraphX]: Add diff(other: RDD[VertexId, VD]) in VertexRDD

2015-03-16 Thread ankurdave
Repository: spark Updated Branches: refs/heads/master aa6536fa3 -> 45f4c6612 [SPARK-5922][GraphX]: Add diff(other: RDD[VertexId, VD]) in VertexRDD Changed method invocation of 'diff' to match that of 'innerJoin' and 'leftJoin' from VertexRDD[VD] to RDD[(VertexId, VD)]. This change maintains b

spark git commit: [SPARK-1955][GraphX]: VertexRDD can incorrectly assume index sharing

2015-02-25 Thread ankurdave
Repository: spark Updated Branches: refs/heads/branch-1.2 a9abcaa2c -> 00112baf9 [SPARK-1955][GraphX]: VertexRDD can incorrectly assume index sharing Fixes the issue whereby when VertexRDD's are `diff`ed, `innerJoin`ed, or `leftJoin`ed and have different partition sizes they fail under the `

spark git commit: [SPARK-1955][GraphX]: VertexRDD can incorrectly assume index sharing

2015-02-25 Thread ankurdave
Repository: spark Updated Branches: refs/heads/master a777c65da -> 9f603fce7 [SPARK-1955][GraphX]: VertexRDD can incorrectly assume index sharing Fixes the issue whereby when VertexRDD's are `diff`ed, `innerJoin`ed, or `leftJoin`ed and have different partition sizes they fail under the `zipP

spark git commit: [SPARK-1955][GraphX]: VertexRDD can incorrectly assume index sharing

2015-02-25 Thread ankurdave
Repository: spark Updated Branches: refs/heads/branch-1.3 eaffc6edd -> 8073767f5 [SPARK-1955][GraphX]: VertexRDD can incorrectly assume index sharing Fixes the issue whereby when VertexRDD's are `diff`ed, `innerJoin`ed, or `leftJoin`ed and have different partition sizes they fail under the `

spark git commit: SPARK-3290 [GRAPHX] No unpersist calls in SVDPlusPlus

2015-02-14 Thread ankurdave
Repository: spark Updated Branches: refs/heads/branch-1.3 152147f5f -> db5747921 SPARK-3290 [GRAPHX] No unpersist calls in SVDPlusPlus This just unpersist()s each RDD in this code that was cache()ed. Author: Sean Owen Closes #4234 from srowen/SPARK-3290 and squashes the following commits:

spark git commit: SPARK-3290 [GRAPHX] No unpersist calls in SVDPlusPlus

2015-02-14 Thread ankurdave
Repository: spark Updated Branches: refs/heads/master d06d5ee9b -> 0ce4e430a SPARK-3290 [GRAPHX] No unpersist calls in SVDPlusPlus This just unpersist()s each RDD in this code that was cache()ed. Author: Sean Owen Closes #4234 from srowen/SPARK-3290 and squashes the following commits: 66c

spark git commit: [SPARK-5343][GraphX]: ShortestPaths traverses backwards

2015-02-10 Thread ankurdave
Repository: spark Updated Branches: refs/heads/branch-1.3 bba095399 -> 5be8902f7 [SPARK-5343][GraphX]: ShortestPaths traverses backwards Corrected the logic with ShortestPaths so that the calculation will run forward rather than backwards. Output before looked like: ```scala import org.apach

spark git commit: [SPARK-5343][GraphX]: ShortestPaths traverses backwards

2015-02-10 Thread ankurdave
Repository: spark Updated Branches: refs/heads/master fd2c032f9 -> 582096128 [SPARK-5343][GraphX]: ShortestPaths traverses backwards Corrected the logic with ShortestPaths so that the calculation will run forward rather than backwards. Output before looked like: ```scala import org.apache.sp

spark git commit: [SPARK-5351][GraphX] Do not use Partitioner.defaultPartitioner as a partitioner of EdgeRDDImp...

2015-01-23 Thread ankurdave
Repository: spark Updated Branches: refs/heads/branch-1.2 2ea782a9d -> 73cb806f7 [SPARK-5351][GraphX] Do not use Partitioner.defaultPartitioner as a partitioner of EdgeRDDImp... If the value of 'spark.default.parallelism' does not match the number of partitions in EdgePartition(EdgeRDDImpl),

spark git commit: [SPARK-5351][GraphX] Do not use Partitioner.defaultPartitioner as a partitioner of EdgeRDDImp...

2015-01-23 Thread ankurdave
Repository: spark Updated Branches: refs/heads/master cef1f092a -> e224dbb01 [SPARK-5351][GraphX] Do not use Partitioner.defaultPartitioner as a partitioner of EdgeRDDImp... If the value of 'spark.default.parallelism' does not match the number of partitions in EdgePartition(EdgeRDDImpl), the

spark git commit: [SPARK-5064][GraphX] Add numEdges upperbound validation for R-MAT graph generator to prevent infinite loop

2015-01-21 Thread ankurdave
Repository: spark Updated Branches: refs/heads/branch-1.2 e90f6b5c6 -> 37db20c94 [SPARK-5064][GraphX] Add numEdges upperbound validation for R-MAT graph generator to prevent infinite loop I looked into GraphGenerators#chooseCell, and found that chooseCell can't generate more edges than pow(2

spark git commit: [SPARK-5064][GraphX] Add numEdges upperbound validation for R-MAT graph generator to prevent infinite loop

2015-01-21 Thread ankurdave
Repository: spark Updated Branches: refs/heads/master 7450a992b -> 3ee3ab592 [SPARK-5064][GraphX] Add numEdges upperbound validation for R-MAT graph generator to prevent infinite loop I looked into GraphGenerators#chooseCell, and found that chooseCell can't generate more edges than pow(2, (2

spark git commit: [SPARK-4917] Add a function to convert into a graph with canonical edges in GraphOps

2015-01-08 Thread ankurdave
Repository: spark Updated Branches: refs/heads/master 8d45834de -> f825e193f [SPARK-4917] Add a function to convert into a graph with canonical edges in GraphOps Convert bi-directional edges into uni-directional ones instead of 'canonicalOrientation' in GraphLoader.edgeListFile. This functio
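
Converting bi-directional edges into uni-directional ones can be sketched on a plain edge list. This is a minimal sketch under stated assumptions: edges are `(src, dst)` pairs of `Long` ids, each pair is oriented so the smaller id comes first, and duplicates are then dropped. The object `CanonicalEdges` and its `canonicalize` helper are hypothetical names, and any merging of edge attributes that the real GraphOps function may perform is not modelled here.

```scala
object CanonicalEdges {
  type VertexId = Long

  // Orient every edge as (smaller id, larger id), then drop duplicates so that
  // a pair of opposite edges collapses into a single edge.
  def canonicalize(edges: Seq[(VertexId, VertexId)]): Seq[(VertexId, VertexId)] =
    edges.map { case (a, b) => if (a < b) (a, b) else (b, a) }.distinct

  def main(args: Array[String]): Unit = {
    val edges = Seq((1L, 2L), (2L, 1L), (3L, 1L))
    println(canonicalize(edges)) // (1,2) appears once; (3,1) is reoriented to (1,3)
  }
}
```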

spark git commit: [Minor] Fix comments for GraphX 2D partitioning strategy

2015-01-06 Thread ankurdave
Repository: spark Updated Branches: refs/heads/master a6394bc2c -> 5e3ec1110 [Minor] Fix comments for GraphX 2D partitioning strategy The total number of vertices in the matrix (v0 to v11) is 12. Also, I think one and the same block overlaps in this strategy. This is a minor PR, so I didn't file it in JIRA. Author: k

spark git commit: [SPARK-4620] Add unpersist in Graph and GraphImpl

2014-12-07 Thread ankurdave
Repository: spark Updated Branches: refs/heads/branch-1.2 a4ae7c8b5 -> 6b9e8b081 [SPARK-4620] Add unpersist in Graph and GraphImpl Add an IF to uncache both vertices and edges of Graph/GraphImpl. This IF is useful when iterative graph operations build a new graph in each iteration, and the ve

spark git commit: [SPARK-4620] Add unpersist in Graph and GraphImpl

2014-12-07 Thread ankurdave
Repository: spark Updated Branches: refs/heads/master 2e6b736b0 -> 8817fc7fe [SPARK-4620] Add unpersist in Graph and GraphImpl Add an IF to uncache both vertices and edges of Graph/GraphImpl. This IF is useful when iterative graph operations build a new graph in each iteration, and the vertic

spark git commit: [SPARK-4646] Replace Scala.util.Sorting.quickSort with Sorter(TimSort) in Spark

2014-12-07 Thread ankurdave
Repository: spark Updated Branches: refs/heads/branch-1.2 27d9f13af -> a4ae7c8b5 [SPARK-4646] Replace Scala.util.Sorting.quickSort with Sorter(TimSort) in Spark This patch just replaces a native quick sorter with Sorter(TimSort) in Spark. It could get performance gains by ~8% in my quick exper

spark git commit: [SPARK-4646] Replace Scala.util.Sorting.quickSort with Sorter(TimSort) in Spark

2014-12-07 Thread ankurdave
Repository: spark Updated Branches: refs/heads/master e895e0cbe -> 2e6b736b0 [SPARK-4646] Replace Scala.util.Sorting.quickSort with Sorter(TimSort) in Spark This patch just replaces a native quick sorter with Sorter(TimSort) in Spark. It could get performance gains by ~8% in my quick experimen

spark git commit: [SPARK-3623][GraphX] GraphX should support the checkpoint operation

2014-12-06 Thread ankurdave
Repository: spark Updated Branches: refs/heads/branch-1.2 11446a648 -> 27d9f13af [SPARK-3623][GraphX] GraphX should support the checkpoint operation Author: GuoQiang Li Closes #2631 from witgo/SPARK-3623 and squashes the following commits: a70c500 [GuoQiang Li] Remove java related 4d1e249 [

spark git commit: [SPARK-3623][GraphX] GraphX should support the checkpoint operation

2014-12-06 Thread ankurdave
Repository: spark Updated Branches: refs/heads/master 6eb1b6f62 -> e895e0cbe [SPARK-3623][GraphX] GraphX should support the checkpoint operation Author: GuoQiang Li Closes #2631 from witgo/SPARK-3623 and squashes the following commits: a70c500 [GuoQiang Li] Remove java related 4d1e249 [GuoQ

spark git commit: [SPARK-4672][Core]Checkpoint() should clear f to shorten the serialization chain

2014-12-03 Thread ankurdave
Repository: spark Updated Branches: refs/heads/branch-1.2 528cce8bc -> 667f7ff44 [SPARK-4672][Core]Checkpoint() should clear f to shorten the serialization chain The related JIRA is https://issues.apache.org/jira/browse/SPARK-4672 The f closure of `PartitionsRDD(ZippedPartitionsRDD2)` contain

spark git commit: [SPARK-4672][Core]Checkpoint() should clear f to shorten the serialization chain

2014-12-03 Thread ankurdave
Repository: spark Updated Branches: refs/heads/master 17c162f66 -> 77be8b986 [SPARK-4672][Core]Checkpoint() should clear f to shorten the serialization chain The related JIRA is https://issues.apache.org/jira/browse/SPARK-4672 The f closure of `PartitionsRDD(ZippedPartitionsRDD2)` contains a

spark git commit: [SPARK-4672][GraphX]Non-transient PartitionsRDDs will lead to StackOverflow error

2014-12-02 Thread ankurdave
Repository: spark Updated Branches: refs/heads/branch-1.2 f1859fc18 -> 528cce8bc [SPARK-4672][GraphX]Non-transient PartitionsRDDs will lead to StackOverflow error The related JIRA is https://issues.apache.org/jira/browse/SPARK-4672 In a nutshell, if `val partitionsRDD` in EdgeRDDImpl and Ver

spark git commit: [SPARK-4672][GraphX]Non-transient PartitionsRDDs will lead to StackOverflow error

2014-12-02 Thread ankurdave
Repository: spark Updated Branches: refs/heads/master fc0a1475e -> 17c162f66 [SPARK-4672][GraphX]Non-transient PartitionsRDDs will lead to StackOverflow error The related JIRA is https://issues.apache.org/jira/browse/SPARK-4672 In a nutshell, if `val partitionsRDD` in EdgeRDDImpl and VertexR

spark git commit: [SPARK-4672][GraphX]Perform checkpoint() on PartitionsRDD to shorten the lineage

2014-12-02 Thread ankurdave
Repository: spark Updated Branches: refs/heads/branch-1.2 5e026a3e6 -> f1859fc18 [SPARK-4672][GraphX]Perform checkpoint() on PartitionsRDD to shorten the lineage The related JIRA is https://issues.apache.org/jira/browse/SPARK-4672 Iterative GraphX applications always have long lineage, while

spark git commit: [SPARK-4672][GraphX]Perform checkpoint() on PartitionsRDD to shorten the lineage

2014-12-02 Thread ankurdave
Repository: spark Updated Branches: refs/heads/master 5da21f07d -> fc0a1475e [SPARK-4672][GraphX]Perform checkpoint() on PartitionsRDD to shorten the lineage The related JIRA is https://issues.apache.org/jira/browse/SPARK-4672 Iterative GraphX applications always have long lineage, while chec

spark git commit: [SPARK-4249][GraphX]fix a problem of EdgePartitionBuilder in Graphx

2014-11-06 Thread ankurdave
Repository: spark Updated Branches: refs/heads/branch-1.0 49224fd0f -> 76c20cac9 [SPARK-4249][GraphX]fix a problem of EdgePartitionBuilder in Graphx At first, srcIds is not initialized and its entries are all 0, so we use edgeArray(0).srcId to initialize currSrcId Author: lianhuiwang Closes #3138 from lianhuiwang

spark git commit: [SPARK-4249][GraphX]fix a problem of EdgePartitionBuilder in Graphx

2014-11-06 Thread ankurdave
Repository: spark Updated Branches: refs/heads/branch-1.1 c58c1bb83 -> 0a40eac25 [SPARK-4249][GraphX]fix a problem of EdgePartitionBuilder in Graphx At first, srcIds is not initialized and its entries are all 0, so we use edgeArray(0).srcId to initialize currSrcId Author: lianhuiwang Closes #3138 from lianhuiwang

spark git commit: [SPARK-4249][GraphX]fix a problem of EdgePartitionBuilder in Graphx

2014-11-06 Thread ankurdave
Repository: spark Updated Branches: refs/heads/branch-1.2 aaaeaf939 -> 9061bc4e1 [SPARK-4249][GraphX]fix a problem of EdgePartitionBuilder in Graphx At first, srcIds is not initialized and its entries are all 0, so we use edgeArray(0).srcId to initialize currSrcId Author: lianhuiwang Closes #3138 from lianhuiwang

spark git commit: [SPARK-4249][GraphX]fix a problem of EdgePartitionBuilder in Graphx

2014-11-06 Thread ankurdave
Repository: spark Updated Branches: refs/heads/master 23eaf0e12 -> d15c6e9dc [SPARK-4249][GraphX]fix a problem of EdgePartitionBuilder in Graphx At first, srcIds is not initialized and its entries are all 0, so we use edgeArray(0).srcId to initialize currSrcId Author: lianhuiwang Closes #3138 from lianhuiwang/SPA
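
The effect of seeding the current source id from a still-zeroed array can be sketched as follows. This is a hypothetical reconstruction, not the EdgePartitionBuilder code: it assumes the builder records, for each source id, the offset of its first edge in an array sorted by source, and the names `SourceIndex`, `build` and `seedFromEdges` are invented for illustration.

```scala
import scala.collection.mutable

object SourceIndex {
  // Map each source id to the offset of its first edge.
  // `edges` are (src, dst) pairs, assumed to be sorted by src.
  def build(edges: Array[(Long, Long)], seedFromEdges: Boolean): Map[Long, Int] = {
    val index = mutable.LinkedHashMap.empty[Long, Int]
    if (edges.nonEmpty) {
      val srcIds = new Array[Long](edges.length) // zero-filled at allocation
      // Seeding from srcIds(0) reads 0 because the array is not populated yet;
      // seeding from the first edge reads the real source id.
      var curr = if (seedFromEdges) edges(0)._1 else srcIds(0)
      index(curr) = 0
      var i = 0
      while (i < edges.length) {
        srcIds(i) = edges(i)._1
        if (edges(i)._1 != curr) {
          curr = edges(i)._1
          index(curr) = i
        }
        i += 1
      }
    }
    index.toMap
  }
}
```

In this sketch, seeding from the zeroed array leaves a spurious entry for vertex id 0 whenever the first real source id is non-zero, while seeding from the first edge yields only ids that actually occur.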

git commit: [GraphX] Modify option name according to example doc in SynthBenchmark

2014-10-24 Thread ankurdave
Repository: spark Updated Branches: refs/heads/master f80dcf2ae -> 07e439b4f [GraphX] Modify option name according to example doc in SynthBenchmark The graphx.SynthBenchmark example currently has an iteration-count option named "niter". However, its documentation names it "niters". The mism

git commit: SPARK-3716 [GraphX] Update Analytics.scala for partitionStrategy assignment

2014-10-12 Thread ankurdave
Repository: spark Updated Branches: refs/heads/master 18bd67c24 -> e5be4de7b SPARK-3716 [GraphX] Update Analytics.scala for partitionStrategy assignment Previously, when the val partitionStrategy was created it called a function in the Analytics object which was a copy of the PartitionStrateg

git commit: SPARK-3716 [GraphX] Update Analytics.scala for partitionStrategy assignment

2014-10-12 Thread ankurdave
Repository: spark Updated Branches: refs/heads/branch-1.1 18ef22ab7 -> 5a21e3e7e SPARK-3716 [GraphX] Update Analytics.scala for partitionStrategy assignment Previously, when the val partitionStrategy was created it called a function in the Analytics object which was a copy of the PartitionStr

git commit: Fixed the condition in StronglyConnectedComponents Issue: SPARK-3635

2014-09-29 Thread ankurdave
Repository: spark Updated Branches: refs/heads/branch-1.1 e5ab11387 -> 85dd5139e Fixed the condition in StronglyConnectedComponents Issue: SPARK-3635 Author: oded Closes #2486 from odedz/master and squashes the following commits: dd7890a [oded] Fixed the condition in StronglyConnectedCompon

git commit: Fixed the condition in StronglyConnectedComponents Issue: SPARK-3635

2014-09-29 Thread ankurdave
Repository: spark Updated Branches: refs/heads/master 51229ff7f -> dc30e4504 Fixed the condition in StronglyConnectedComponents Issue: SPARK-3635 Author: oded Closes #2486 from odedz/master and squashes the following commits: dd7890a [oded] Fixed the condition in StronglyConnectedComponents

git commit: [graphX] GraphOps: random pick vertex bug

2014-09-29 Thread ankurdave
Repository: spark Updated Branches: refs/heads/branch-1.1 df5a62f51 -> e5ab11387 [graphX] GraphOps: random pick vertex bug When `numVertices > 50`, probability is set to 0. This would cause infinite loop. Author: yingjieMiao Closes #2553 from yingjieMiao/graphx and squashes the following c

git commit: [graphX] GraphOps: random pick vertex bug

2014-09-29 Thread ankurdave
Repository: spark Updated Branches: refs/heads/master 0bbe7faef -> 51229ff7f [graphX] GraphOps: random pick vertex bug When `numVertices > 50`, probability is set to 0. This would cause infinite loop. Author: yingjieMiao Closes #2553 from yingjieMiao/graphx and squashes the following commi
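
A probability that becomes 0 once `numVertices > 50` is what integer division would produce. The snippet does not show the code, so the following is only a sketch under that assumption: the object `PickProbability` and both helpers are hypothetical, and the constant 50 is taken from the commit message.

```scala
object PickProbability {
  // Assumed buggy form: both operands are integral, so the quotient truncates
  // to 0 as soon as numVertices exceeds 50.
  def truncated(numVertices: Long): Double = (50 / numVertices).toDouble

  // Floating-point division keeps a non-zero sampling probability.
  def fractional(numVertices: Long): Double = 50.0 / numVertices

  def main(args: Array[String]): Unit = {
    println(truncated(100L))  // 0.0
    println(fractional(100L)) // 0.5
  }
}
```

With a probability of exactly 0, a loop that keeps sampling until at least one vertex is selected never terminates, which matches the infinite loop reported here.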

git commit: [SPARK-2062][GraphX] VertexRDD.apply does not use the mergeFunc

2014-09-18 Thread ankurdave
rry Xiao Author: Blie Arkansol Author: Ankur Dave Closes #1903 from larryxiao/2062 and squashes the following commits: 625aa9d [Blie Arkansol] Merge pull request #1 from ankurdave/SPARK-2062 476770b [Ankur Dave] ShippableVertexPartition.initFrom: Don't run mergeFunc on default values

git commit: [SPARK-2062][GraphX] VertexRDD.apply does not use the mergeFunc

2014-09-18 Thread ankurdave
iao Author: Blie Arkansol Author: Ankur Dave Closes #1903 from larryxiao/2062 and squashes the following commits: 625aa9d [Blie Arkansol] Merge pull request #1 from ankurdave/SPARK-2062 476770b [Ankur Dave] ShippableVertexPartition.initFrom: Don't run mergeFunc on default values 614059

git commit: [HOTFIX] [SPARK-3400] Revert 9b225ac "fix GraphX EdgeRDD zipPartitions"

2014-09-03 Thread ankurdave
nt for others. Author: Ankur Dave Closes #2271 from ankurdave/SPARK-3400 and squashes the following commits: 10c2a97 [Ankur Dave] [HOTFIX] [SPARK-3400] Revert 9b225ac "fix GraphX EdgeRDD zipPartitions" Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-

git commit: [HOTFIX] [SPARK-3400] Revert 9b225ac "fix GraphX EdgeRDD zipPartitions"

2014-09-03 Thread ankurdave
velopment for others. Author: Ankur Dave Closes #2271 from ankurdave/SPARK-3400 and squashes the following commits: 10c2a97 [Ankur Dave] [HOTFIX] [SPARK-3400] Revert 9b225ac "fix GraphX EdgeRDD zipPartitions" (cherry picked from commit 00362dac976cd05b06638deb11d990d612429e0b) Si

git commit: [HOTFIX] [SPARK-3400] Revert 9b225ac "fix GraphX EdgeRDD zipPartitions"

2014-09-03 Thread ankurdave
velopment for others. Author: Ankur Dave Closes #2271 from ankurdave/SPARK-3400 and squashes the following commits: 10c2a97 [Ankur Dave] [HOTFIX] [SPARK-3400] Revert 9b225ac "fix GraphX EdgeRDD zipPartitions" (cherry picked from commit 00362dac976cd05b06638deb11d990d612429e0b) Si

git commit: [SPARK-3263][GraphX] Fix changes made to GraphGenerator.logNormalGraph in PR #720

2014-09-03 Thread ankurdave
Repository: spark Updated Branches: refs/heads/master 6481d2742 -> e5d376801 [SPARK-3263][GraphX] Fix changes made to GraphGenerator.logNormalGraph in PR #720 PR #720 made multiple changes to GraphGenerator.logNormalGraph including: * Replacing the call to functions for generating random ver

git commit: [SPARK-2823][GraphX]fix GraphX EdgeRDD zipPartitions

2014-09-02 Thread ankurdave
Repository: spark Updated Branches: refs/heads/master e9bb12bea -> 9b225ac30 [SPARK-2823][GraphX]fix GraphX EdgeRDD zipPartitions If users set “spark.default.parallelism” to a value that differs from the EdgeRDD partition number, GraphX jobs will throw: java.lang.IllegalArgumentExc

git commit: [SPARK-2823][GraphX]fix GraphX EdgeRDD zipPartitions

2014-09-02 Thread ankurdave
Repository: spark Updated Branches: refs/heads/branch-1.0 d60f60ccc -> d47581638 [SPARK-2823][GraphX]fix GraphX EdgeRDD zipPartitions If users set “spark.default.parallelism” to a value that differs from the EdgeRDD partition number, GraphX jobs will throw: java.lang.IllegalArgumen

git commit: [SPARK-2823][GraphX]fix GraphX EdgeRDD zipPartitions

2014-09-02 Thread ankurdave
Repository: spark Updated Branches: refs/heads/branch-1.1 0c8183cb3 -> ffdb2fcf8 [SPARK-2823][GraphX]fix GraphX EdgeRDD zipPartitions If users set “spark.default.parallelism” to a value that differs from the EdgeRDD partition number, GraphX jobs will throw: java.lang.IllegalArgumen

git commit: [SPARK-2981][GraphX] EdgePartition1D Int overflow

2014-09-02 Thread ankurdave
Repository: spark Updated Branches: refs/heads/branch-1.0 5481196ab -> d60f60ccc [SPARK-2981][GraphX] EdgePartition1D Int overflow Minor fix; details are here: https://issues.apache.org/jira/browse/SPARK-2981 Author: Larry Xiao Closes #1902 from larryxiao/2981 and squashes the following commit

git commit: [SPARK-2981][GraphX] EdgePartition1D Int overflow

2014-09-02 Thread ankurdave
Repository: spark Updated Branches: refs/heads/branch-1.1 7267e402c -> 9b0cff2d4 [SPARK-2981][GraphX] EdgePartition1D Int overflow Minor fix; details are here: https://issues.apache.org/jira/browse/SPARK-2981 Author: Larry Xiao Closes #1902 from larryxiao/2981 and squashes the following commit

git commit: [SPARK-2981][GraphX] EdgePartition1D Int overflow

2014-09-02 Thread ankurdave
Repository: spark Updated Branches: refs/heads/master 7c9bbf172 -> aa7de128c [SPARK-2981][GraphX] EdgePartition1D Int overflow Minor fix; details are here: https://issues.apache.org/jira/browse/SPARK-2981 Author: Larry Xiao Closes #1902 from larryxiao/2981 and squashes the following commits:

git commit: [SPARK-3123][GraphX]: override the "setName" function to set EdgeRDD's name manually just as VertexRDD does.

2014-09-02 Thread ankurdave
Repository: spark Updated Branches: refs/heads/master 7c92b49d6 -> 7c9bbf172 [SPARK-3123][GraphX]: override the "setName" function to set EdgeRDD's name manually just as VertexRDD does. Author: uncleGen Closes #2033 from uncleGen/master_origin and squashes the following commits: 801994b [u

git commit: [SPARK-1986][GraphX]move lib.Analytics to org.apache.spark.examples

2014-09-02 Thread ankurdave
Repository: spark Updated Branches: refs/heads/master 644e31524 -> 7c92b49d6 [SPARK-1986][GraphX]move lib.Analytics to org.apache.spark.examples to support ~/spark/bin/run-example GraphXAnalytics triangles /soc-LiveJournal1.txt --numEPart=256 Author: Larry Xiao Closes #1766 from larryxiao/1

git commit: Graphx example

2014-07-22 Thread ankurdave
Repository: spark Updated Branches: refs/heads/master 79fe7634f -> 5f7b99168 Graphx example fix examples Author: CrazyJvm Closes #1523 from CrazyJvm/graphx-example and squashes the following commits: 663457a [CrazyJvm] outDegrees does not take parameters 7cfff1d [CrazyJvm] fix example for

svn commit: r1607545 - in /spark: images/graphx-perf-comparison.png site/images/graphx-perf-comparison.png

2014-07-03 Thread ankurdave
Author: ankurdave Date: Thu Jul 3 07:08:46 2014 New Revision: 1607545 URL: http://svn.apache.org/r1607545 Log: Correct the GraphX performance comparison graphic Modified: spark/images/graphx-perf-comparison.png spark/site/images/graphx-perf-comparison.png Modified: spark/images/graphx

git commit: Minor: Fix documentation error from apache/spark#946

2014-06-04 Thread ankurdave
Repository: spark Updated Branches: refs/heads/master 11ded3f66 -> abea2d4ff Minor: Fix documentation error from apache/spark#946 Author: Ankur Dave Closes #970 from ankurdave/SPARK-1991_docfix and squashes the following commits: 6d07343 [Ankur Dave] Minor: Fix documentation error f

git commit: Enable repartitioning of graph over different number of partitions

2014-06-03 Thread ankurdave
Repository: spark Updated Branches: refs/heads/master e8d93ee52 -> 5284ca78d Enable repartitioning of graph over different number of partitions It is currently very difficult to repartition a graph over a different number of partitions. This PR adds an additional `partitionBy` function that

git commit: Synthetic GraphX Benchmark

2014-06-03 Thread ankurdave
Repository: spark Updated Branches: refs/heads/master aa41a522d -> 894ecde04 Synthetic GraphX Benchmark This PR accomplishes two things: 1. It introduces a Synthetic Benchmark application that generates an arbitrarily large log-normal graph and executes either PageRank or connected componen

git commit: initial version of LPA

2014-05-29 Thread ankurdave
rks, LPA is perhaps the most elementary, and despite its flaws it remains a nice and simple approach. Author: Ankur Dave Author: haroldsultan Author: Harold Sultan Closes #905 from haroldsultan/master and squashes the following commits: 327aee0 [haroldsultan] Merge pull request #2 from ankurd