[
https://issues.apache.org/jira/browse/TINKERPOP-2767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17676776#comment-17676776
]
Cole Greer commented on TINKERPOP-2767:
---------------------------------------
I've narrowed down the underlying issue here. This traversal triggers a stack
overflow on the server, and the server does not handle it well. Essentially,
the following traversal is large enough that it overflows the stack when the
server expands it. (You may need to increase the 'times' value to reproduce
this, depending on your server environment.)
{code:java}
g.with("evaluationTimeout", 5000).V(1).repeat(__.out()).times(3000){code}
After the initial stack overflow, the JVM seems to allocate more memory, which
allows a repeated attempt at this traversal to succeed.
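To illustrate why the required 'times' value depends on the environment, here is a minimal sketch in plain Java. It does not use TinkerPop at all: the hypothetical `nest` method simply recurses once per level, standing in for a deeply nested expansion (that the server's expansion is recursive in this way is an assumption). The depth at which it overflows varies with the JVM, its stack size settings, and JIT warm-up.

```java
// Hypothetical stand-in for a deeply nested expansion; not TinkerPop code.
class DeepNestingSketch {

    // One stack frame per level. The "1 +" keeps the call from being a tail call.
    static int nest(int depth) {
        return depth == 0 ? 0 : 1 + nest(depth - 1);
    }

    // Returns true if recursing to the given depth overflows the current thread's stack.
    static boolean overflows(int depth) {
        try {
            nest(depth);
            return false;
        } catch (StackOverflowError e) {
            // The stack has unwound by the time we get here, so the thread is usable again.
            return true;
        }
    }

    public static void main(String[] args) {
        // Double the depth until the stack overflows; the result differs between environments.
        int depth = 1_000;
        while (!overflows(depth) && depth < (1 << 30)) {
            depth *= 2;
        }
        System.out.println("First overflow observed at depth " + depth);
        // The same thread keeps working after the error has been caught.
        System.out.println("nest(100) after the overflow = " + nest(100));
    }
}
```

Running it with a different thread stack size (for example via the JVM's `-Xss` option) should move the reported depth, which matches the note above about adjusting the 'times' value.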
It is noteworthy that if the original request is sent as a Gremlin script, the
stack overflow exception surfaces in the server's logs, and the server sends
that exception to the client. I've tested this with the console, the Java
driver, and the Go driver, and all of them do a reasonable job of handling this
exception, informing the user, and recovering (although each of them approaches
this somewhat differently).
If the original request is sent as bytecode, however, the same stack overflow
error is still thrown, but it is swallowed somewhere in the server and never
surfaces. The server stops processing that one traversal and otherwise continues
to function normally. However, it never sends a response to the client, which is
what causes the driver to hang indefinitely. I have seen this with the example
Go script that Simon provided as well as with this equivalent example in Java:
{code:java}
GraphTraversalSource g =
    traversal().withRemote(DriverRemoteConnection.using("localhost", 8182, "g"));
g.addV("test").property("id", 1).next();
g.addV("test").property("id", 2).next();
// As written, the two addE() traversals have no terminal step (such as
// iterate()), so they are built but never submitted to the server.
g.addE("test").from(__.V(1)).to(__.V(2)).property("id", "e1");
g.addE("test").from(__.V(2)).to(__.V(1)).property("id", "e2");
// This is the request that hangs when it is sent as bytecode.
var result = g.with("evaluationTimeout", 5000L).V(1).repeat(__.out()).times(3000).next();
System.out.println(result); {code}
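I have not confirmed where the server swallows the error, but the following plain-Java sketch shows one general way this symptom can arise. It is not TinkerPop code: `handle`, `submitAndAwaitReply`, and the reply queue are hypothetical names. The point is that `StackOverflowError` is an `Error`, not an `Exception`, so a handler that only catches `Exception` sends no reply, and an executor whose `Future` is discarded stores the error without logging it.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; none of these names come from Gremlin Server.
class SwallowedErrorSketch {

    // One stack frame per level, standing in for a deeply nested expansion.
    static int nest(int depth) {
        return depth == 0 ? 0 : 1 + nest(depth - 1);
    }

    // A request handler that replies on success and on Exception, but not on Error.
    static void handle(int depth, BlockingQueue<String> replies) {
        try {
            replies.add("ok:" + nest(depth));
        } catch (Exception e) {
            replies.add("error:" + e);
        }
        // A StackOverflowError skips the catch block above, so no reply is queued.
    }

    // Submits the request to a worker thread and waits a bounded time for a reply.
    // Returns null if no reply arrived, which is what a client would see as a hang.
    static String submitAndAwaitReply(int depth, long waitMillis) {
        BlockingQueue<String> replies = new LinkedBlockingQueue<>();
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // The returned Future is discarded, so any Error it captures is never reported.
            pool.submit(() -> handle(depth, replies));
            return replies.poll(waitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("shallow request reply: " + submitAndAwaitReply(10, 2000));
        System.out.println("deep request reply: " + submitAndAwaitReply(50_000_000, 500));
    }
}
```

In this sketch, catching `Throwable` (or `StackOverflowError` specifically) in `handle` and queueing an error reply would let the caller see a failure instead of waiting. Whether the server's bytecode path has the same shape is an assumption on my part.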
> Repeat Out Times traversal hangs indefinitely on first execution
> ----------------------------------------------------------------
>
> Key: TINKERPOP-2767
> URL: https://issues.apache.org/jira/browse/TINKERPOP-2767
> Project: TinkerPop
> Issue Type: Bug
> Components: javascript
> Affects Versions: 3.5.3
> Environment: Windows 10
> Reporter: Simon Zhao
> Priority: Major
>
> Originally encountered when fixing TINKERPOP-2754
>
> The following traversal in JS seems to hang the first time you run it against
> a newly launched gremlin-server (3.5.3) in Docker
>
> {{await g.V('1').repeat(_.out()).times(1500).next();}}
>
> The same hanging occurs in gremlin-go.
>
> {code:java}
> _, err = g.With("evaluationTimeout", 1000).V("1").Repeat(gremlingo.T__.Out()).Times(int32(1500)).Next() {code}
>
> The timeout is optional, but it shows that something is going wrong, since
> the call never returns. Interestingly enough, if the timeout is very low, the
> call does not hang; it reports that the timeout was exceeded. This suggests
> that when the traversal completes within the timeout, the result is simply
> not returned on the first call.
>
> If you run this snippet from a script, it hangs. If you forcefully terminate
> the script and rerun it, it does not hang.
>
> main.go
> {code:java}
> package main
>
> import (
>     gremlingo "github.com/apache/tinkerpop/gremlin-go/v3/driver"
>     "log"
> )
>
> func main() {
>     driver, err := gremlingo.NewDriverRemoteConnection("ws://localhost:45940/gremlin")
>     if err != nil {
>         log.Print("Err creating DRC")
>         return
>     }
>     defer driver.Close()
>     log.Println("Start")
>     g := gremlingo.Traversal_().WithRemote(driver)
>     LABEL := "test"
>     _, err = g.V().HasLabel(LABEL).Drop().Next()
>     _, err = g.AddV(LABEL).Property(gremlingo.T.Id, "1").Next()
>     _, err = g.AddV(LABEL).Property(gremlingo.T.Id, "2").Next()
>     _, err = g.AddE(LABEL).From(gremlingo.T__.V("1")).To(gremlingo.T__.V("2")).Property(gremlingo.T.Id, "e1").Next()
>     _, err = g.AddE(LABEL).From(gremlingo.T__.V("2")).To(gremlingo.T__.V("1")).Property(gremlingo.T.Id, "e2").Next()
>     if err != nil {
>         log.Println("Error during setup")
>         return
>     }
>     log.Println("Start the problematic traversal")
>     _, err = g.With("evaluationTimeout", 1000).V("1").Repeat(gremlingo.T__.Out()).Times(int32(1500)).Next()
>     if err != nil {
>         log.Println("Error with the problematic traversal, but we didn't hang")
>         return
>     }
>     log.Println("End")
> } {code}