Tom Chamnongvongse created LIVY-1003:
----------------------------------------
Summary: Interactive session - Setting large value of
rsc.server.connect.timeout blocks other tasks
Key: LIVY-1003
URL: https://issues.apache.org/jira/browse/LIVY-1003
Project: Livy
Issue Type: Bug
Components: RSC
Affects Versions: 0.8.0
Reporter: Tom Chamnongvongse
Problem:
Livy is configured to deploy interactive sessions on YARN with
`livy.rsc.server.connect.timeout` set to a high value. The timeout is
increased to give a Livy session more time in the YARN `ACCEPTED` state, so
that the Livy server does not kill the YARN app when the default timeout of 90
seconds expires.
Until the app reaches the YARN `RUNNING` state, each such session occupies a
thread in Scala's global execution context -
https://github.com/apache/incubator-livy/blob/v0.8.0-incubating/server/src/main/scala/org/apache/livy/server/interactive/InteractiveSession.scala#L474.
Creating too many sessions that are stuck in the `ACCEPTED` state causes other
tasks that use the global execution context to queue up behind them.
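The queueing effect can be illustrated outside Livy with a minimal sketch. It
is not Livy code: a small fixed-size pool stands in for the global execution
context so the result does not depend on the machine's core count, and the
names `StarvationSketch`, `run`, `started` and `release` are hypothetical. Each
blocked future plays the role of a session waiting in `ACCEPTED`; the short
future plays the role of an unrelated task such as a session delete.

```scala
import java.util.concurrent.{CountDownLatch, Executors, TimeoutException}
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object StarvationSketch {
  /** Returns (blockedWhileSaturated, completedAfterRelease). */
  def run(poolSize: Int): (Boolean, Boolean) = {
    // Assumption: a fixed pool stands in for Scala's global execution context.
    val pool = Executors.newFixedThreadPool(poolSize)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(pool)
    val started = new CountDownLatch(poolSize) // all blockers are running
    val release = new CountDownLatch(1)        // lets the blockers finish
    try {
      // One long-blocking task per pool thread, like sessions in ACCEPTED.
      (1 to poolSize).foreach { _ =>
        Future { started.countDown(); release.await() }
      }
      started.await()

      // An unrelated short task submitted to the same execution context.
      val delete = Future { "deleted" }
      val blocked =
        try { Await.result(delete, 200.millis); false }
        catch { case _: TimeoutException => true }

      // Once a blocker finishes, the queued task gets a thread and completes.
      release.countDown()
      val completed = Await.result(delete, 5.seconds) == "deleted"
      (blocked, completed)
    } finally {
      release.countDown()
      pool.shutdownNow()
    }
  }
}
```

Calling `StarvationSketch.run(2)` is expected to report that the short task
could not start while every pool thread was blocked, and that it completed
once the blockers were released.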
How to reproduce:
1. Set `livy.rsc.server.connect.timeout` to something high, like 1 hour.
2. Create enough interactive Livy sessions in YARN so that they are queued in
ACCEPTED state. The number of sessions that are stuck in ACCEPTED state should
be at least equal to the global execution context [thread pool
size|https://docs.scala-lang.org/overviews/core/futures.html#the-global-execution-context]
(Runtime.availableProcessors).
3. Try to delete a session using DELETE /sessions/{sessionId}. The request
should hang until one of the sessions is no longer stuck in ACCEPTED state.
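For steps 2 and 3, the REST requests might be built as in the sketch below. It
assumes a Livy server reachable at `http://localhost:8998` and a session kind
of `spark`; both are placeholders, as are the names `ReproduceRequests`,
`createSession` and `deleteSession`. The sketch only constructs the requests
and does not send them.

```scala
import java.net.URI
import java.net.http.HttpRequest
import java.net.http.HttpRequest.BodyPublishers

object ReproduceRequests {
  // Assumption: Livy server address; adjust to the actual deployment.
  val baseUrl = "http://localhost:8998"

  // Step 2: POST /sessions creates one interactive session.
  def createSession(): HttpRequest =
    HttpRequest.newBuilder(URI.create(s"$baseUrl/sessions"))
      .header("Content-Type", "application/json")
      .POST(BodyPublishers.ofString("""{"kind": "spark"}"""))
      .build()

  // Step 3: DELETE /sessions/{sessionId} for one of the created sessions.
  def deleteSession(sessionId: Int): HttpRequest =
    HttpRequest.newBuilder(URI.create(s"$baseUrl/sessions/$sessionId"))
      .DELETE()
      .build()
}
```

Each request can be sent with `java.net.http.HttpClient` (Java 11 or later),
for example `HttpClient.newHttpClient().send(request,
HttpResponse.BodyHandlers.ofString())`. Repeat the create request until the
number of sessions in ACCEPTED state reaches the pool size, then send the
delete request.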