Marko A. Rodriguez created TINKERPOP-1167:
---------------------------------------------
Summary: ParallelVertexProgram as a MetaVertexProgram
Key: TINKERPOP-1167
URL: https://issues.apache.org/jira/browse/TINKERPOP-1167
Project: TinkerPop
Issue Type: Improvement
Components: process
Affects Versions: 3.1.0-incubating
Reporter: Marko A. Rodriguez
When I wrote down this query, I realized something.
{code}
gremlin> g.V().
peerPressure().by('cluster').by(outE('knows')).
pageRank(1.0).by('friendRank').by(outE('knows')).times(1).
group().by('cluster').by(values('friendRank').sum());
==>[1:1.0, 3:0.0, 5:0.0, 6:0.0]
{code}
I realized that {{PeerPressureVertexProgram}} and {{PageRankVertexProgram}} can
be run in parallel because their messages don't interfere with one another. Let's
say both do 30 iterations. Instead of executing for 60 iterations, a
parallel execution would run for 30 iterations. Now, let's say one terminates
before the other. Who cares, it just holds (does no further work) once it has
terminated.
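The iteration arithmetic above can be checked with a minimal sketch in plain Java. Everything here is hypothetical and for illustration only: {{SketchProgram}}, {{FixedIterationProgram}} and {{ParallelIterationSketch}} are not TinkerPop classes, and messaging is left out so that only the superstep counting is shown.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a vertex program; NOT the TinkerPop VertexProgram interface.
interface SketchProgram {
    void execute(int iteration);       // one superstep of work
    boolean terminate(int iteration);  // true once the program is finished
}

// Toy program that finishes after a fixed number of iterations (at least one).
final class FixedIterationProgram implements SketchProgram {
    private final int maxIterations;
    int executed = 0; // how many supersteps this program actually ran

    FixedIterationProgram(int maxIterations) { this.maxIterations = maxIterations; }

    @Override public void execute(int iteration) { executed++; }
    @Override public boolean terminate(int iteration) { return iteration + 1 >= maxIterations; }
}

final class ParallelIterationSketch {
    // Runs the programs one after another and returns the total number of supersteps.
    static int runSequentially(List<? extends SketchProgram> programs) {
        int supersteps = 0;
        for (SketchProgram program : programs) {
            int iteration = 0;
            while (true) {
                program.execute(iteration);
                supersteps++;
                if (program.terminate(iteration)) break;
                iteration++;
            }
        }
        return supersteps;
    }

    // Runs all programs inside the same supersteps; a terminated program simply holds.
    static int runInParallel(List<? extends SketchProgram> programs) {
        boolean[] done = new boolean[programs.size()];
        int remaining = programs.size();
        int iteration = 0;
        while (remaining > 0) {
            for (int i = 0; i < programs.size(); i++) {
                if (done[i]) continue; // already terminated: skipped for the rest of the run
                SketchProgram program = programs.get(i);
                program.execute(iteration);
                if (program.terminate(iteration)) {
                    done[i] = true;
                    remaining--;
                }
            }
            iteration++;
        }
        return iteration;
    }

    public static void main(String[] args) {
        System.out.println(runSequentially(
                Arrays.asList(new FixedIterationProgram(30), new FixedIterationProgram(30))));
        System.out.println(runInParallel(
                Arrays.asList(new FixedIterationProgram(30), new FixedIterationProgram(30))));
    }
}
```

Under these assumptions the parallel run costs as many supersteps as the longest program, while the sequential run costs the sum. A program that terminates early is simply skipped for the remaining supersteps.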
I don't know how to implement this, but something like:
{code}
public class ParallelVertexProgram implements VertexProgram<List<Object>>
{code}
What is the {{List<Object>}} message? Well, if there are 3 vertex programs
inside it, then {{list.get(0)}} holds the messages for the first vertex program
at that iteration, {{list.get(1)}} holds the messages for the second... etc.
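One way to read the slot-per-program idea is sketched below in plain Java. This is an assumption-laden illustration, not a proposed implementation: {{SlotMessageSketch}}, {{pack}} and {{messagesFor}} are hypothetical names, a {{null}} slot is assumed to mean "this sub-program sent nothing", and each slot is treated as one opaque payload.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper; none of these names exist in TinkerPop.
final class SlotMessageSketch {
    // Wraps one sub-program's outgoing message in a list with one slot per sub-program.
    // Slots belonging to the other sub-programs stay null (assumed to mean "no message").
    static List<Object> pack(int programCount, int programIndex, Object message) {
        List<Object> slots = new ArrayList<>(Collections.nCopies(programCount, (Object) null));
        slots.set(programIndex, message);
        return slots;
    }

    // Extracts everything addressed to one sub-program from the lists a vertex received.
    static List<Object> messagesFor(int programIndex, List<List<Object>> received) {
        List<Object> result = new ArrayList<>();
        for (List<Object> slots : received) {
            Object message = slots.get(programIndex);
            if (message != null) result.add(message);
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<Object>> received = new ArrayList<>();
        received.add(pack(3, 0, "vote:a")); // from the first sub-program
        received.add(pack(3, 1, 0.5));      // from the second sub-program
        System.out.println(messagesFor(0, received));
        System.out.println(messagesFor(1, received));
    }
}
```

Because each sub-program only ever reads its own index, the messages of different sub-programs cannot interfere, which is the property the parallel execution relies on.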
CraZy. This gets back to an old ticket about "Traverser Swarms" ... We can tag
the messages and just have a "permanently running GraphComputer" that has
programs put into it, churns through them, and returns each one when it
terminates.