Sarath Subramanian created ATLAS-1720:
-----------------------------------------
Summary: Increase Titan storage.lock.wait-time for Berkeley DB to
fix intermittent IT failures
Key: ATLAS-1720
URL: https://issues.apache.org/jira/browse/ATLAS-1720
Project: Atlas
Issue Type: Bug
Components: atlas-core
Affects Versions: trunk, 0.9-incubating
Reporter: Sarath Subramanian
Assignee: Sarath Subramanian
Some of the integration tests (ITs) in Atlas fail intermittently with the
exception "Could not execute operation due to backend exception".
Investigation shows that the cause is a Berkeley DB LockTimeoutException
(https://github.com/thinkaurelius/titan/issues/1113).
The default lock timeout for Berkeley DB is 500 ms. If a thread (some IT) is
waiting on a Titan storage resource that is locked by another thread, and that
thread does not release the lock within 500 ms, the waiting thread fails with
the above exception (see the error log below).
The fix is to increase storage.lock.wait-time for Berkeley DB to 10000 ms,
which is consistent with the lock wait timeout specified for HBase.
Caused by: com.sleepycat.je.LockTimeoutException: (JE 5.0.73) Lock expired.
Locker 1516581475 7535_NotificationHookConsumer thread-0_Txn: waited for lock
on database=edgestore LockAddr:284896285 LSN=0x0/0x21d55f type=WRITE
grant=WAIT_PROMOTION timeoutMillis=500 startTime=1491261268442
endTime=1491261268942
Owners: [<LockInfo locker="1445928922 7537_qtp184901207-1038 -
e015a355-d6c5-4424-b7a7-833a289aea9d_Txn" type="READ"/>, <LockInfo
locker="1516581475 7535_NotificationHookConsumer thread-0_Txn" type="READ"/>]
Waiters: []
Transaction 1445928922 7537_qtp184901207-1038 -
e015a355-d6c5-4424-b7a7-833a289aea9d_Txn waits for LockAddr:471572402
Owners:<LockInfo locker="1516581475 7535_NotificationHookConsumer thread-0_Txn"
type="WRITE"/> Waiters:[<LockInfo locker="1445928922 7537_qtp184901207-1038 -
e015a355-d6c5-4424-b7a7-833a289aea9d_Txn" type="READ"/>]
Transaction 1516581475 7535_NotificationHookConsumer thread-0_Txn owns
LockAddr:471572402 <LockInfo locker="1516581475 7535_NotificationHookConsumer
thread-0_Txn" type="WRITE"/>
Transaction 1516581475 7535_NotificationHookConsumer thread-0_Txn waits for
LockAddr:284896285
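
The effect of the wait time described above can be illustrated with a small,
self-contained sketch. It does not use the Berkeley DB or Titan APIs; it only
uses java.util.concurrent locks as a stand-in. The class name LockWaitDemo, the
method acquireWithin, and the 1500 ms hold time are assumptions made for
illustration. One thread holds a write lock for a fixed time while a second
thread waits for a read lock with a bounded wait, first 500 ms and then
10000 ms.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a storage lock; not the Berkeley DB or Titan API.
public class LockWaitDemo {

    // Returns true if a reader obtains the lock within waitMillis while
    // another thread holds the write lock for holdMillis.
    static boolean acquireWithin(long waitMillis, long holdMillis)
            throws InterruptedException {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        CountDownLatch held = new CountDownLatch(1);

        Thread holder = new Thread(() -> {
            lock.writeLock().lock();
            try {
                held.countDown();          // signal that the lock is now held
                Thread.sleep(holdMillis);  // simulate a slow transaction
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.writeLock().unlock();
            }
        });
        holder.start();
        held.await();                      // wait until the writer owns the lock

        // Bounded wait, analogous to a lock wait timeout.
        boolean acquired =
                lock.readLock().tryLock(waitMillis, TimeUnit.MILLISECONDS);
        if (acquired) {
            lock.readLock().unlock();
        }
        holder.join();
        return acquired;
    }

    public static void main(String[] args) throws InterruptedException {
        // Assumed hold time of 1500 ms: longer than 500 ms, shorter than 10000 ms.
        System.out.println("wait 500 ms:   acquired=" + acquireWithin(500, 1500));
        System.out.println("wait 10000 ms: acquired=" + acquireWithin(10_000, 1500));
    }
}
```

With the shorter wait the reader gives up before the writer finishes, which is
the analogue of the LockTimeoutException above. With the longer wait the reader
simply proceeds once the writer releases the lock.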