Marko A. Rodriguez created TINKERPOP-1214:
---------------------------------------------
Summary: A clone-based threading model for Gremlin OLTP
Key: TINKERPOP-1214
URL: https://issues.apache.org/jira/browse/TINKERPOP-1214
Project: TinkerPop
Issue Type: Improvement
Components: process
Affects Versions: 3.1.1-incubating
Reporter: Marko A. Rodriguez
While a {{Traversal}} is NOT thread-safe, it is possible to clone Traversals
and have them interact via barriers. This is what was recently accomplished in
3.2.0-SNAPSHOT and allows us to go OLTP->OLAP->OLTP->etc...
We can use this same principle to thread a purely OLTP traversal. Let's say you
have 4 threads to execute the traversal. Well, you create 1 "master traversal"
and 3 "clone traversals."
Each clone traversal runs on its own thread, which simply does
"while(next())". When a barrier is reached, the barrier folds into the
"master traversal". The master traversal then redistributes the results after
the barrier step back to the individual threads.
It would look something like this:
{code}
  /--out.out.out.groupCount-\                              /--out-\
g.V--out.out.out.groupCount-- groupCount.select(keys).unfold--out-- .next()
  \--out.out.out.groupCount/                               \--out-/
{code}
In short, thread when you have a non-barrier sequence (just like OLAP) and when
there is a barrier, merge/reduce.
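
To make the merge/reduce idea concrete, here is a minimal sketch in plain Java.
It does not use any TinkerPop API: the adjacency map OUT, the class
CloneThreadingSketch, and the methods cloneWork and parallelGroupCount are all
hypothetical names invented for illustration. Each "clone" runs the non-barrier
sequence (out.out.out plus a local groupCount) over its share of the start
vertices on its own thread, and the "master" folds the partial counts together
at the barrier. Redistribution of results after the barrier is not shown.

```java
import java.util.*;
import java.util.concurrent.*;

public class CloneThreadingSketch {

    // Toy adjacency list standing in for a graph (an assumption, not a TinkerPop structure).
    static final Map<Integer, List<Integer>> OUT = Map.of(
            1, List.of(2, 3),
            2, List.of(3),
            3, List.of(1),
            4, List.of(1));

    // Work done by one "clone": three out-hops, then a local groupCount of the results.
    static Map<Integer, Long> cloneWork(List<Integer> starts) {
        List<Integer> frontier = starts;
        for (int hop = 0; hop < 3; hop++) {
            List<Integer> next = new ArrayList<>();
            for (int v : frontier) {
                next.addAll(OUT.getOrDefault(v, List.of()));
            }
            frontier = next;
        }
        Map<Integer, Long> counts = new HashMap<>();
        for (int v : frontier) {
            counts.merge(v, 1L, Long::sum);
        }
        return counts;
    }

    // The "master": splits the starts round-robin, runs one clone per thread,
    // and merges the partial groupCounts once every clone has reached the barrier.
    static Map<Integer, Long> parallelGroupCount(List<Integer> starts, int threads)
            throws Exception {
        List<List<Integer>> parts = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            parts.add(new ArrayList<>());
        }
        for (int i = 0; i < starts.size(); i++) {
            parts.get(i % threads).add(starts.get(i));
        }

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Map<Integer, Long>>> futures = new ArrayList<>();
            for (List<Integer> part : parts) {
                Callable<Map<Integer, Long>> task = () -> cloneWork(part);
                futures.add(pool.submit(task));
            }
            Map<Integer, Long> merged = new TreeMap<>();
            for (Future<Map<Integer, Long>> f : futures) {
                // Barrier: block until this clone is done, then fold its counts in.
                f.get().forEach((k, c) -> merged.merge(k, c, Long::sum));
            }
            return merged;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parallelGroupCount(List.of(1, 2, 3, 4), 4));
    }
}
```

Because groupCount is a commutative, associative reduction, the merged result
is the same regardless of how the start vertices are split across threads. For
the toy graph above the program prints {1=3, 2=2, 3=4}.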
I don't know how efficient this would be, but I do know the TinkerGraph OLAP
tests are faster than the OLTP tests -- and the OLAP path is nasty logic
involving VertexProgram semantics. The logic proposed here would be much simpler.
Next, we could make threading a strategy as such:
{code}
g = graph.traversal().withParallelization(5)
{code}
Finally, note that this might "just work" in OLAP too, in the sense that the
traversals at each worker in an OLAP job would be threaded!