[
https://issues.apache.org/jira/browse/TINKERPOP-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197322#comment-15197322
]
ASF GitHub Bot commented on TINKERPOP-1206:
-------------------------------------------
GitHub user okram opened a pull request:
https://github.com/apache/incubator-tinkerpop/pull/266
TINKERPOP-1206 & TINKERPOP-1014: Bulk Traversers and Requirements
Optimization
https://issues.apache.org/jira/browse/TINKERPOP-1206
https://issues.apache.org/jira/browse/TINKERPOP-1014
Two tangentially related issues are solved in this PR. The core of Gremlin
is `Step`. We have this notion of an `ExpandableStepIterator` which knows
either to get local traversers or traversers from the previous step. We use to
have it where if you `addStarts(Iterator<Traverser>)`, it would add them to
their own `MultiIterator`. This was pointless cause we don't get the benefit of
bulking. Not only do we simplify the logic of this fundamental class, but we
also reduce the amount of objects it creates/maintains and we all increase the
likelihood of bulking (which is the ultimate potential optimization). The
consequence is that we no longer support `O_Traverser` as bulking is ALWAYS
required. This is fine -- very few traversals benefited from `O_Traverser` and
all they gained was no `long` bulk field (trivial). `O_Traverser` is now
abstract.
This got me into thinking about how we calculate requirements and I
realized how insanely stupid we (me) were especially for `by()`-modulators
where EVERY call to a local traversal (typically `by()`-traversals) were
recalculating the requirements!!!! That is, a full walk through the traversal
tree. I decided to make `TraverserGenerator` static to the root traversal and
when it is needed (the first time), it calculates the requirements and locks in
the generator. Likewise, for child traversals, they look to the root traversal
for the generator and lock it in locally. Thus, all this notion of requirements
being expensive just vanished. We should keep our "traverser species" as
calculating requirements is now cheap.
CHANGELOG
```
* Optimized `ExpandableStepIterator` with simpler logic and increased the
likelihood of bulking.
* Optimized `TraverserRequirement` calculations.
* `Step.processNextStart()` and `Step.next()` now return
`Traverser.Admin<E>`. (*breaking*)
* `Traversal.addTraverserRequirement()` method removed. (*breaking*)
```
UPDATE
```
Step API Update
^^^^^^^^^^^^^^^^
The `Step` interface is fundamental to Gremlin. `Step.processNextStart()`
and `Step.next()` both retuned `Traverser<E>`. We had so many
`Traverser.asAdmin()` and direct typecast calls throughout (especially in
`TraversalVertexProgram`) that it was deemed prudent to have
`Step.processNextStart()` and `Step.next()` return `Traverser.Admin<E>`.
Moreover it makes sense as this is internal logic where admins are always
needed. Providers with their own step definitions will simply need to change
their method signatures (no logic update is required -- save that `asAdmin()`
can be safely removed if used). This is a trivial update to make.
Traversal API Update
^^^^^^^^^^^^^^^^^^^^
The way in which `TraverserRequirements` are calculated has been changed
(for the better). The ramification is that post compilation requirement
additions no longer make sense and should not be allowed. To enforce this,
`Traversal.addTraverserRequirement()` method has been removed from the
interface. Moreover, users should never be able to add requirements manually
(this should all be inferred from the end compilation). However, if need be,
there is always `RequirementStrategy` which will allow the user to add a
requirement at strategy application time (though again, there should not be a
reason to do so).
```
VOTE +1.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/apache/incubator-tinkerpop TINKERPOP-1206
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-tinkerpop/pull/266.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #266
----
commit 5b4009d6418e818066d72c474123df1fbc081478
Author: Marko A. Rodriguez <[email protected]>
Date: 2016-03-16T02:49:55Z
I've redesigned how TraverserRequirements are accessed. This GREATLY
reduces the amount of logic required for all by()-modulators, StartSteps,
GraphSteps, and barriers. I'm talking CRAZY more efficient when by()-modulators
are used. I've made it so the Step API works with Traverser.Admin. We had too
many .asAdmins() everywhere and typecasting everywhere -- no bueno. So much
more clean now. For providers of Steps, breaking -- but its simply just
changing your processNextStart() signature to return a Traverser.Admin. The
amount of logic removed in this push is insane. This covers TINKERPOP-1206 and
partially TINKERPOP-1014. The model I used for getTraverserGenerator() is
really smart and we might want to use it for getSideEffects() and
getStrategies(). In short, lazy computation of these data structures as they
are rarely globally needed. This also means we can pretty signifanctly reduce
the size of a Traversal serialization which is good for RemoteGraph. Finally
found a major bug in GryoMapper -- registered XXXGeneartor, not XXX (for one
species). Sucky.
commit 90873e3cd640f45fd8872a0c19406bdfd1f3dc2a
Author: Marko A. Rodriguez <[email protected]>
Date: 2016-03-16T13:13:32Z
make the requirements unmodifiable for safety.
----
> ExpandableIterator can take a full TraverserSet at once -- Barriers.
> --------------------------------------------------------------------
>
> Key: TINKERPOP-1206
> URL: https://issues.apache.org/jira/browse/TINKERPOP-1206
> Project: TinkerPop
> Issue Type: Improvement
> Components: process
> Affects Versions: 3.1.1-incubating
> Reporter: Marko A. Rodriguez
>
> I haven't looked at {{ExpandableIterator}} in over a year. Its one of the
> most fundamental structures of a Gremlin traversal. I just realized it can
> take an entire {{TraverserSet}}. As such, if the previous step is a
> {{Barrier}}, don't iterate the barrier out, simply "dump it" into the current
> steps {{ExpandableIterator}}. That should speed up things significantly --
> though there are not that many barrier steps... but still.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)