Repository: spark
Updated Branches:
refs/heads/master 172a52f5d -> 78062b852
[SPARK-18845][GRAPHX] PageRank has incorrect initialization value that leads to
slow convergence
## What changes were proposed in this pull request?
Change the initial value in all PageRank implementations to be `1.
OrElse(old) }
```
This can be simplified with one join. ankurdave proposed a patch based on our
discussion in the mailing list:
https://www.mail-archive.com/dev@spark.apache.org/msg10316.html
Author: Alexander Ulanov
Closes #7749 from avulanov/SPARK-9436-pregel and squashes the following c
Repository: spark
Updated Branches:
refs/heads/branch-1.4 bb1401507 -> f34f3d71f
[SPARK-9109] [GRAPHX] Keep the cached edge in the graph
The change here is to keep the cached RDDs in the graph object so that when
graph.unpersist() is called, these RDDs are correctly unpersisted.
```java
i
Repository: spark
Updated Branches:
refs/heads/master eba6a1af4 -> 587c315b2
[SPARK-9109] [GRAPHX] Keep the cached edge in the graph
The change here is to keep the cached RDDs in the graph object so that when
graph.unpersist() is called, these RDDs are correctly unpersisted.
```java
impor
Repository: spark
Updated Branches:
refs/heads/master d267c2834 -> 0a4071eab
[SPARK-8718] [GRAPHX] Improve EdgePartition2D for non perfect square number of
partitions
See https://github.com/aray/e2d/blob/master/EdgePartition2D.ipynb
Author: Andrew Ray
Closes #7104 from aray/edge-partition-
Repository: spark
Updated Branches:
refs/heads/master 6f0d55d76 -> ae980eb41
[SPARK-6736][GraphX][Doc]Example of Graph#aggregateMessages has error
Example of Graph#aggregateMessages has error.
Since aggregateMessages is a method of Graph, it should be written as
"rawGraph.aggregateMessages"
Aut
Repository: spark
Updated Branches:
refs/heads/master aad003227 -> 39fb57968
[SPARK-6510][GraphX]: Add Graph#minus method to act as Set#difference
Adds a `Graph#minus` method which will return only the unique `VertexId`s from the
calling `VertexRDD`.
To demonstrate a basic example with pseudoco
Repository: spark
Updated Branches:
refs/heads/master aa6536fa3 -> 45f4c6612
[SPARK-5922][GraphX]: Add diff(other: RDD[VertexId, VD]) in VertexRDD
Changed method invocation of 'diff' to match that of 'innerJoin' and 'leftJoin'
from VertexRDD[VD] to RDD[(VertexId, VD)]. This change maintains b
Repository: spark
Updated Branches:
refs/heads/branch-1.2 a9abcaa2c -> 00112baf9
[SPARK-1955][GraphX]: VertexRDD can incorrectly assume index sharing
Fixes the issue whereby VertexRDDs that are `diff`ed, `innerJoin`ed, or
`leftJoin`ed and have different partition sizes fail under the
`
Repository: spark
Updated Branches:
refs/heads/master a777c65da -> 9f603fce7
[SPARK-1955][GraphX]: VertexRDD can incorrectly assume index sharing
Fixes the issue whereby VertexRDDs that are `diff`ed, `innerJoin`ed, or
`leftJoin`ed and have different partition sizes fail under the
`zipP
Repository: spark
Updated Branches:
refs/heads/branch-1.3 eaffc6edd -> 8073767f5
[SPARK-1955][GraphX]: VertexRDD can incorrectly assume index sharing
Fixes the issue whereby VertexRDDs that are `diff`ed, `innerJoin`ed, or
`leftJoin`ed and have different partition sizes fail under the
`
Repository: spark
Updated Branches:
refs/heads/branch-1.3 152147f5f -> db5747921
SPARK-3290 [GRAPHX] No unpersist calls in SVDPlusPlus
This just unpersist()s each RDD in this code that was cache()ed.
Author: Sean Owen
Closes #4234 from srowen/SPARK-3290 and squashes the following commits:
Repository: spark
Updated Branches:
refs/heads/master d06d5ee9b -> 0ce4e430a
SPARK-3290 [GRAPHX] No unpersist calls in SVDPlusPlus
This just unpersist()s each RDD in this code that was cache()ed.
Author: Sean Owen
Closes #4234 from srowen/SPARK-3290 and squashes the following commits:
66c
Repository: spark
Updated Branches:
refs/heads/branch-1.3 bba095399 -> 5be8902f7
[SPARK-5343][GraphX]: ShortestPaths traverses backwards
Corrected the logic with ShortestPaths so that the calculation will run forward
rather than backwards. Output before looked like:
```scala
import org.apach
Repository: spark
Updated Branches:
refs/heads/master fd2c032f9 -> 582096128
[SPARK-5343][GraphX]: ShortestPaths traverses backwards
Corrected the logic with ShortestPaths so that the calculation will run forward
rather than backwards. Output before looked like:
```scala
import org.apache.sp
Repository: spark
Updated Branches:
refs/heads/branch-1.2 2ea782a9d -> 73cb806f7
[SPARK-5351][GraphX] Do not use Partitioner.defaultPartitioner as a partitioner
of EdgeRDDImp...
If the value of 'spark.default.parallelism' does not match the number of
partitions in EdgePartition(EdgeRDDImpl),
Repository: spark
Updated Branches:
refs/heads/master cef1f092a -> e224dbb01
[SPARK-5351][GraphX] Do not use Partitioner.defaultPartitioner as a partitioner
of EdgeRDDImp...
If the value of 'spark.default.parallelism' does not match the number of
partitions in EdgePartition(EdgeRDDImpl),
the
Repository: spark
Updated Branches:
refs/heads/branch-1.2 e90f6b5c6 -> 37db20c94
[SPARK-5064][GraphX] Add numEdges upperbound validation for R-MAT graph
generator to prevent infinite loop
I looked into GraphGenerators#chooseCell, and found that chooseCell can't
generate more edges than pow(2
Repository: spark
Updated Branches:
refs/heads/master 7450a992b -> 3ee3ab592
[SPARK-5064][GraphX] Add numEdges upperbound validation for R-MAT graph
generator to prevent infinite loop
I looked into GraphGenerators#chooseCell, and found that chooseCell can't
generate more edges than pow(2, (2
Repository: spark
Updated Branches:
refs/heads/master 8d45834de -> f825e193f
[SPARK-4917] Add a function to convert into a graph with canonical edges in
GraphOps
Convert bi-directional edges into uni-directional ones instead of
'canonicalOrientation' in GraphLoader.edgeListFile.
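The conversion described above can be sketched in plain Scala collections. This is only an illustration under stated assumptions: `SimpleEdge`, `CanonicalEdgesSketch` and `canonicalize` are hypothetical names, the rule "orient each edge so that srcId < dstId, then merge duplicates" is an assumed reading of "canonical edges", and nothing here uses the GraphX API.

```scala
// Hypothetical stand-in for an edge; this is not the GraphX Edge class.
final case class SimpleEdge(srcId: Long, dstId: Long, attr: Int)

object CanonicalEdgesSketch {
  // Orient every edge so that srcId <= dstId, then merge the attributes of
  // edges that land on the same (srcId, dstId) pair with the given function.
  def canonicalize(edges: Seq[SimpleEdge], merge: (Int, Int) => Int): Seq[SimpleEdge] =
    edges
      .map(e => if (e.srcId < e.dstId) e else SimpleEdge(e.dstId, e.srcId, e.attr))
      .groupBy(e => (e.srcId, e.dstId))
      .toSeq
      .map { case ((src, dst), group) =>
        SimpleEdge(src, dst, group.map(_.attr).reduce(merge))
      }
      // Sort only to make the output order deterministic for inspection.
      .sortBy(e => (e.srcId, e.dstId))
}
```

With this sketch, the pair 1 -> 2 and 2 -> 1 collapses into a single edge 1 -> 2 whose attribute is the merged value of both.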
This functio
Repository: spark
Updated Branches:
refs/heads/master a6394bc2c -> 5e3ec1110
[Minor] Fix comments for GraphX 2D partitioning strategy
The number of vertices in the matrix (v0 to v11) is 12. Also, I think the same
block overlaps in this strategy.
This is a minor PR, so I didn't file it in JIRA.
Author: k
Repository: spark
Updated Branches:
refs/heads/branch-1.2 a4ae7c8b5 -> 6b9e8b081
[SPARK-4620] Add unpersist in Graph and GraphImpl
Add an interface to uncache both vertices and edges of Graph/GraphImpl.
This interface is useful when iterative graph operations build a new graph in each
iteration, and the ve
Repository: spark
Updated Branches:
refs/heads/master 2e6b736b0 -> 8817fc7fe
[SPARK-4620] Add unpersist in Graph and GraphImpl
Add an interface to uncache both vertices and edges of Graph/GraphImpl.
This interface is useful when iterative graph operations build a new graph in each
iteration, and the vertic
Repository: spark
Updated Branches:
refs/heads/branch-1.2 27d9f13af -> a4ae7c8b5
[SPARK-4646] Replace Scala.util.Sorting.quickSort with Sorter(TimSort) in Spark
This patch just replaces a native quick sorter with Sorter(TimSort) in Spark.
It could get performance gains by ~8% in my quick exper
Repository: spark
Updated Branches:
refs/heads/master e895e0cbe -> 2e6b736b0
[SPARK-4646] Replace Scala.util.Sorting.quickSort with Sorter(TimSort) in Spark
This patch just replaces a native quick sorter with Sorter(TimSort) in Spark.
It could get performance gains by ~8% in my quick experimen
Repository: spark
Updated Branches:
refs/heads/branch-1.2 11446a648 -> 27d9f13af
[SPARK-3623][GraphX] GraphX should support the checkpoint operation
Author: GuoQiang Li
Closes #2631 from witgo/SPARK-3623 and squashes the following commits:
a70c500 [GuoQiang Li] Remove java related
4d1e249 [
Repository: spark
Updated Branches:
refs/heads/master 6eb1b6f62 -> e895e0cbe
[SPARK-3623][GraphX] GraphX should support the checkpoint operation
Author: GuoQiang Li
Closes #2631 from witgo/SPARK-3623 and squashes the following commits:
a70c500 [GuoQiang Li] Remove java related
4d1e249 [GuoQ
Repository: spark
Updated Branches:
refs/heads/branch-1.2 528cce8bc -> 667f7ff44
[SPARK-4672][Core]Checkpoint() should clear f to shorten the serialization chain
The related JIRA is https://issues.apache.org/jira/browse/SPARK-4672
The f closure of `PartitionsRDD(ZippedPartitionsRDD2)` contain
Repository: spark
Updated Branches:
refs/heads/master 17c162f66 -> 77be8b986
[SPARK-4672][Core]Checkpoint() should clear f to shorten the serialization chain
The related JIRA is https://issues.apache.org/jira/browse/SPARK-4672
The f closure of `PartitionsRDD(ZippedPartitionsRDD2)` contains a
Repository: spark
Updated Branches:
refs/heads/branch-1.2 f1859fc18 -> 528cce8bc
[SPARK-4672][GraphX]Non-transient PartitionsRDDs will lead to StackOverflow
error
The related JIRA is https://issues.apache.org/jira/browse/SPARK-4672
In a nutshell, if `val partitionsRDD` in EdgeRDDImpl and Ver
Repository: spark
Updated Branches:
refs/heads/master fc0a1475e -> 17c162f66
[SPARK-4672][GraphX]Non-transient PartitionsRDDs will lead to StackOverflow
error
The related JIRA is https://issues.apache.org/jira/browse/SPARK-4672
In a nutshell, if `val partitionsRDD` in EdgeRDDImpl and VertexR
Repository: spark
Updated Branches:
refs/heads/branch-1.2 5e026a3e6 -> f1859fc18
[SPARK-4672][GraphX]Perform checkpoint() on PartitionsRDD to shorten the lineage
The related JIRA is https://issues.apache.org/jira/browse/SPARK-4672
Iterative GraphX applications always have long lineage, while
Repository: spark
Updated Branches:
refs/heads/master 5da21f07d -> fc0a1475e
[SPARK-4672][GraphX]Perform checkpoint() on PartitionsRDD to shorten the lineage
The related JIRA is https://issues.apache.org/jira/browse/SPARK-4672
Iterative GraphX applications always have long lineage, while chec
Repository: spark
Updated Branches:
refs/heads/branch-1.0 49224fd0f -> 76c20cac9
[SPARK-4249][GraphX] Fix a problem of EdgePartitionBuilder in GraphX
At first, srcIds is not initialized and its entries are all 0, so we use
edgeArray(0).srcId to initialize currSrcId.
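A minimal sketch of why the seed value matters, assuming the builder scans edges sorted by source id and records the first position of each distinct id. `SourceIndexSketch` and `buildIndex` are hypothetical names for illustration and are not the actual EdgePartitionBuilder code.

```scala
import scala.collection.mutable

object SourceIndexSketch {
  // Given source ids sorted so that equal ids are adjacent, record the first
  // position at which each distinct source id appears.
  def buildIndex(sortedSrcIds: Array[Long]): Map[Long, Int] = {
    val index = mutable.LinkedHashMap.empty[Long, Int]
    if (sortedSrcIds.nonEmpty) {
      // Seed from the first real edge, not from an output buffer that is
      // still zero-filled at this point.
      var currSrcId = sortedSrcIds(0)
      index(currSrcId) = 0
      var i = 1
      while (i < sortedSrcIds.length) {
        if (sortedSrcIds(i) != currSrcId) {
          // A new run of equal ids starts at position i.
          currSrcId = sortedSrcIds(i)
          index(currSrcId) = i
        }
        i += 1
      }
    }
    index.toMap
  }
}
```

In a sketch like this, seeding `currSrcId` with 0 instead of the first edge's source id would give wrong results whenever that id is not 0.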
Author: lianhuiwang
Closes #3138 from lianhuiwang
Repository: spark
Updated Branches:
refs/heads/branch-1.1 c58c1bb83 -> 0a40eac25
[SPARK-4249][GraphX] Fix a problem of EdgePartitionBuilder in GraphX
At first, srcIds is not initialized and its entries are all 0, so we use
edgeArray(0).srcId to initialize currSrcId.
Author: lianhuiwang
Closes #3138 from lianhuiwang
Repository: spark
Updated Branches:
refs/heads/branch-1.2 aaaeaf939 -> 9061bc4e1
[SPARK-4249][GraphX] Fix a problem of EdgePartitionBuilder in GraphX
At first, srcIds is not initialized and its entries are all 0, so we use
edgeArray(0).srcId to initialize currSrcId.
Author: lianhuiwang
Closes #3138 from lianhuiwang
Repository: spark
Updated Branches:
refs/heads/master 23eaf0e12 -> d15c6e9dc
[SPARK-4249][GraphX] Fix a problem of EdgePartitionBuilder in GraphX
At first, srcIds is not initialized and its entries are all 0, so we use
edgeArray(0).srcId to initialize currSrcId.
Author: lianhuiwang
Closes #3138 from lianhuiwang/SPA
Repository: spark
Updated Branches:
refs/heads/master f80dcf2ae -> 07e439b4f
[GraphX] Modify option name according to example doc in SynthBenchmark
The graphx.SynthBenchmark example currently has an option for the number of iterations named
"niter". However, in its documentation it is named "niters". The mism
Repository: spark
Updated Branches:
refs/heads/master 18bd67c24 -> e5be4de7b
SPARK-3716 [GraphX] Update Analytics.scala for partitionStrategy assignment
Previously, when the val partitionStrategy was created it called a function in
the Analytics object which was a copy of the PartitionStrateg
Repository: spark
Updated Branches:
refs/heads/branch-1.1 18ef22ab7 -> 5a21e3e7e
SPARK-3716 [GraphX] Update Analytics.scala for partitionStrategy assignment
Previously, when the val partitionStrategy was created it called a function in
the Analytics object which was a copy of the PartitionStr
Repository: spark
Updated Branches:
refs/heads/branch-1.1 e5ab11387 -> 85dd5139e
Fixed the condition in StronglyConnectedComponents Issue: SPARK-3635
Author: oded
Closes #2486 from odedz/master and squashes the following commits:
dd7890a [oded] Fixed the condition in StronglyConnectedCompon
Repository: spark
Updated Branches:
refs/heads/master 51229ff7f -> dc30e4504
Fixed the condition in StronglyConnectedComponents Issue: SPARK-3635
Author: oded
Closes #2486 from odedz/master and squashes the following commits:
dd7890a [oded] Fixed the condition in StronglyConnectedComponents
Repository: spark
Updated Branches:
refs/heads/branch-1.1 df5a62f51 -> e5ab11387
[graphX] GraphOps: random pick vertex bug
When `numVertices > 50`, probability is set to 0. This would cause an infinite
loop.
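One plausible cause of a probability of 0, assumed here and not confirmed by the text above, is integer division of a fixed sample target by the vertex count. The sketch below contrasts the two forms of division; `SampleProbabilitySketch`, `truncated` and `fractional` are hypothetical names and not part of GraphOps.

```scala
object SampleProbabilitySketch {
  // Integer division: the quotient truncates to 0 whenever numVertices > target.
  def truncated(target: Int, numVertices: Long): Double =
    (target / numVertices).toDouble

  // Floating-point division keeps the fractional sampling probability.
  def fractional(target: Int, numVertices: Long): Double =
    target.toDouble / numVertices
}
```

A loop that keeps sampling until it picks at least one vertex never terminates if the sampling probability is exactly 0, which matches the infinite loop described above.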
Author: yingjieMiao
Closes #2553 from yingjieMiao/graphx and squashes the following c
Repository: spark
Updated Branches:
refs/heads/master 0bbe7faef -> 51229ff7f
[graphX] GraphOps: random pick vertex bug
When `numVertices > 50`, probability is set to 0. This would cause an infinite
loop.
Author: yingjieMiao
Closes #2553 from yingjieMiao/graphx and squashes the following commi
rry Xiao
Author: Blie Arkansol
Author: Ankur Dave
Closes #1903 from larryxiao/2062 and squashes the following commits:
625aa9d [Blie Arkansol] Merge pull request #1 from ankurdave/SPARK-2062
476770b [Ankur Dave] ShippableVertexPartition.initFrom: Don't run mergeFunc on
default values
iao
Author: Blie Arkansol
Author: Ankur Dave
Closes #1903 from larryxiao/2062 and squashes the following commits:
625aa9d [Blie Arkansol] Merge pull request #1 from ankurdave/SPARK-2062
476770b [Ankur Dave] ShippableVertexPartition.initFrom: Don't run mergeFunc on
default values
614059
nt for others.
Author: Ankur Dave
Closes #2271 from ankurdave/SPARK-3400 and squashes the following commits:
10c2a97 [Ankur Dave] [HOTFIX] [SPARK-3400] Revert 9b225ac "fix GraphX EdgeRDD
zipPartitions"
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-
velopment for others.
Author: Ankur Dave
Closes #2271 from ankurdave/SPARK-3400 and squashes the following commits:
10c2a97 [Ankur Dave] [HOTFIX] [SPARK-3400] Revert 9b225ac "fix GraphX EdgeRDD
zipPartitions"
(cherry picked from commit 00362dac976cd05b06638deb11d990d612429e0b)
Si
Repository: spark
Updated Branches:
refs/heads/master 6481d2742 -> e5d376801
[SPARK-3263][GraphX] Fix changes made to GraphGenerator.logNormalGraph in PR
#720
PR #720 made multiple changes to GraphGenerator.logNormalGraph including:
* Replacing the call to functions for generating random ver
Repository: spark
Updated Branches:
refs/heads/master e9bb12bea -> 9b225ac30
[SPARK-2823][GraphX]fix GraphX EdgeRDD zipPartitions
If users set 'spark.default.parallelism' and the value differs from
the EdgeRDD partition number, GraphX jobs will throw:
java.lang.IllegalArgumentExc
Repository: spark
Updated Branches:
refs/heads/branch-1.0 d60f60ccc -> d47581638
[SPARK-2823][GraphX]fix GraphX EdgeRDD zipPartitions
If users set 'spark.default.parallelism' and the value differs from
the EdgeRDD partition number, GraphX jobs will throw:
java.lang.IllegalArgumen
Repository: spark
Updated Branches:
refs/heads/branch-1.1 0c8183cb3 -> ffdb2fcf8
[SPARK-2823][GraphX]fix GraphX EdgeRDD zipPartitions
If users set 'spark.default.parallelism' and the value differs from
the EdgeRDD partition number, GraphX jobs will throw:
java.lang.IllegalArgumen
Repository: spark
Updated Branches:
refs/heads/branch-1.0 5481196ab -> d60f60ccc
[SPARK-2981][GraphX] EdgePartition1D Int overflow
minor fix
detail is here: https://issues.apache.org/jira/browse/SPARK-2981
Author: Larry Xiao
Closes #1902 from larryxiao/2981 and squashes the following commit
Repository: spark
Updated Branches:
refs/heads/branch-1.1 7267e402c -> 9b0cff2d4
[SPARK-2981][GraphX] EdgePartition1D Int overflow
minor fix
detail is here: https://issues.apache.org/jira/browse/SPARK-2981
Author: Larry Xiao
Closes #1902 from larryxiao/2981 and squashes the following commit
Repository: spark
Updated Branches:
refs/heads/master 7c9bbf172 -> aa7de128c
[SPARK-2981][GraphX] EdgePartition1D Int overflow
minor fix
detail is here: https://issues.apache.org/jira/browse/SPARK-2981
Author: Larry Xiao
Closes #1902 from larryxiao/2981 and squashes the following commits:
Repository: spark
Updated Branches:
refs/heads/master 7c92b49d6 -> 7c9bbf172
[SPARK-3123][GraphX]: override the "setName" function to set EdgeRDD's name
manually just as VertexRDD does.
Author: uncleGen
Closes #2033 from uncleGen/master_origin and squashes the following commits:
801994b [u
Repository: spark
Updated Branches:
refs/heads/master 644e31524 -> 7c92b49d6
[SPARK-1986][GraphX]move lib.Analytics to org.apache.spark.examples
to support ~/spark/bin/run-example GraphXAnalytics triangles
/soc-LiveJournal1.txt --numEPart=256
Author: Larry Xiao
Closes #1766 from larryxiao/1
Repository: spark
Updated Branches:
refs/heads/master 79fe7634f -> 5f7b99168
Graphx example
fix examples
Author: CrazyJvm
Closes #1523 from CrazyJvm/graphx-example and squashes the following commits:
663457a [CrazyJvm] outDegrees does not take parameters
7cfff1d [CrazyJvm] fix example for
Author: ankurdave
Date: Thu Jul 3 07:08:46 2014
New Revision: 1607545
URL: http://svn.apache.org/r1607545
Log:
Correct the GraphX performance comparison graphic
Modified:
spark/images/graphx-perf-comparison.png
spark/site/images/graphx-perf-comparison.png
Modified: spark/images/graphx
Repository: spark
Updated Branches:
refs/heads/master 11ded3f66 -> abea2d4ff
Minor: Fix documentation error from apache/spark#946
Author: Ankur Dave
Closes #970 from ankurdave/SPARK-1991_docfix and squashes the following commits:
6d07343 [Ankur Dave] Minor: Fix documentation error f
Repository: spark
Updated Branches:
refs/heads/master e8d93ee52 -> 5284ca78d
Enable repartitioning of graph over different number of partitions
It is currently very difficult to repartition a graph over a different number
of partitions. This PR adds an additional `partitionBy` function that
Repository: spark
Updated Branches:
refs/heads/master aa41a522d -> 894ecde04
Synthetic GraphX Benchmark
This PR accomplishes two things:
1. It introduces a Synthetic Benchmark application that generates an
arbitrarily large log-normal graph and executes either PageRank or connected
componen
rks, LPA is perhaps the most elementary,
and despite its flaws it remains a nice and simple approach.
Author: Ankur Dave
Author: haroldsultan
Author: Harold Sultan
Closes #905 from haroldsultan/master and squashes the following commits:
327aee0 [haroldsultan] Merge pull request #2 from ankurd