[
https://issues.apache.org/jira/browse/TINKERPOP-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15275054#comment-15275054
]
ASF GitHub Bot commented on TINKERPOP-1295:
-------------------------------------------
GitHub user twilmes opened a pull request:
https://github.com/apache/incubator-tinkerpop/pull/307
TINKERPOP-1295 Precompile ScriptInputFormat scripts once during
initialization of ScriptRecordReader
This update precompiles the input script once and then reads the input file using
the compiled script instead of calling engine.eval() for every record. This
should cut down on the time spent re-evaluating the input script.
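
The difference between the two approaches can be sketched with the standard `javax.script` API (`Compilable`, `CompiledScript`). This is only an illustration, not the code from the pull request: `CountingEngine` is a hypothetical stub standing in for the real Groovy engine so the sketch runs without extra dependencies, and `countParses` and the toy "script" are made-up names. The stub simply counts how often the script source gets parsed.

```java
import javax.script.AbstractScriptEngine;
import javax.script.Bindings;
import javax.script.Compilable;
import javax.script.CompiledScript;
import javax.script.ScriptContext;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineFactory;
import javax.script.ScriptException;
import javax.script.SimpleBindings;
import java.io.Reader;
import java.util.concurrent.atomic.AtomicInteger;

class PrecompileSketch {

    /** Hypothetical stand-in engine: "parsing" a script just bumps a counter. */
    static final class CountingEngine extends AbstractScriptEngine implements Compilable {
        final AtomicInteger parses = new AtomicInteger();

        @Override public CompiledScript compile(String script) {
            parses.incrementAndGet(); // the expensive parse/compile step
            return new CompiledScript() {
                @Override public Object eval(ScriptContext ctx) {
                    // Toy "script": upper-case whatever is bound to "line".
                    return String.valueOf(ctx.getAttribute("line")).toUpperCase();
                }
                @Override public ScriptEngine getEngine() { return CountingEngine.this; }
            };
        }
        @Override public CompiledScript compile(Reader script) { return compile(""); }

        @Override public Object eval(String script, ScriptContext ctx) throws ScriptException {
            return compile(script).eval(ctx); // plain eval re-parses on every call
        }
        @Override public Object eval(Reader reader, ScriptContext ctx) throws ScriptException {
            return eval("", ctx);
        }
        @Override public Bindings createBindings() { return new SimpleBindings(); }
        @Override public ScriptEngineFactory getFactory() { return null; } // unused in this sketch
    }

    /** Feeds every line through the script; returns how many times the source was parsed. */
    static int countParses(boolean precompile, String script, String[] lines) {
        CountingEngine engine = new CountingEngine();
        try {
            // New shape: compile once up front. Old shape: leave this null.
            CompiledScript compiled = precompile ? engine.compile(script) : null;
            for (String line : lines) {
                engine.put("line", line);
                if (compiled != null) {
                    compiled.eval();     // reuse the compiled form
                } else {
                    engine.eval(script); // re-parse the source for each record
                }
            }
        } catch (ScriptException e) {
            throw new IllegalStateException(e);
        }
        return engine.parses.get();
    }

    public static void main(String[] args) {
        String[] lines = {"1:person:marko", "2:person:vadas", "3:software:lop"};
        System.out.println("eval per record: " + countParses(false, "parse(line)", lines));
        System.out.println("compiled once:   " + countParses(true, "parse(line)", lines));
    }
}
```

With three input lines the per-record variant parses the script three times and the precompiled variant parses it once. With a real engine the saving depends on how expensive its compile step is.
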
I ran a quick and dirty benchmark on my measly MacBook with
SparkGraphComputer and 2 workers: `g.V().count()` on a test file with 250,000
vertices, read in by a simple `.groovy` script.
Average of 10 runs
-------------------------
before (TP 3.2.0 - engine.eval): 14975.7 ms
after (TP-1295 - w/ compiled script): 10163.6 ms (about 32% less time)
`mvn clean install` succeeded.
VOTE: +1
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/apache/incubator-tinkerpop TINKERPOP-1295
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-tinkerpop/pull/307.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #307
----
commit be6f8d79f606cc0a39dd45457e26328dd31e5636
Author: Ted Wilmes
Date: 2016-05-06T19:54:54Z
Precompile scripts during ScriptRecordReader initialization.
----
> Precompile ScriptInputFormat scripts once during initialization of
> ScriptRecordReader
> -------------------------------------------------------------------------------------
>
> Key: TINKERPOP-1295
> URL: https://issues.apache.org/jira/browse/TINKERPOP-1295
> Project: TinkerPop
> Issue Type: Improvement
> Components: hadoop, io
> Affects Versions: 3.2.0-incubating, 3.1.2-incubating
> Reporter: Ted Wilmes
> Assignee: Ted Wilmes
> Attachments: intern.svg
>
>
> The {{ScriptRecordReader}} evals the input script on every {{nextKeyValue()}} call. I
> think we can cut down on script evaluation time by precompiling the
> input script once. This should speed up bulk loads. I've attached some
> profiling info showing a large chunk of time being spent on this eval during
> a recent test run.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)