-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55813/
-----------------------------------------------------------
Review request for atlas, Madhan Neethiraj and Suma Shivaprasad.
Bugs: ATLAS-1403
https://issues.apache.org/jira/browse/ATLAS-1403
Repository: atlas
Description
-------
Currently, DSL uses a fill function during Gremlin translation to merge results
by typeName and superTypeName, and the fill function loads the resulting
vertices into memory. This causes significant memory usage, and the ATLAS
server spends a lot of time doing GC instead of useful work, sometimes
resulting in OOM (when GC is not able to recover and search queries are run in
parallel).
The proposal is to replace this with typeName checks: find all the subtypes of
a given type and use an IN clause in the filter.
For example:
Query = Person where (birthday < "1950-01-01T02:35:58.440Z") limit 40 offset 0
Optimized query:
Gremlin Query = L:{g.V.has("__typeName", T.in, ['Person','Manager']).and(_().has("Person.birthday", T.lt, -631142641560))[0..<40].toList()}
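To illustrate the idea, here is a minimal sketch of how the type and its subtypes could be collected and turned into the IN-style filter shown in the example. It is not the code from this patch: the class name SubtypeFilterSketch, the methods typeAndSubtypes and typeNameFilter, and the map of direct subtypes are all hypothetical stand-ins for whatever the type system actually provides.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.StringJoiner;

// Hypothetical illustration only; none of these names come from the Atlas code base.
class SubtypeFilterSketch {

    // Returns the given type followed by all of its transitive subtypes,
    // walking the (assumed) map of direct subtypes breadth-first.
    static Set<String> typeAndSubtypes(Map<String, List<String>> directSubtypes, String typeName) {
        Set<String> result = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.add(typeName);
        while (!pending.isEmpty()) {
            String current = pending.poll();
            // add() returns false for names already seen, so each type is expanded once.
            if (result.add(current)) {
                pending.addAll(directSubtypes.getOrDefault(current, Collections.<String>emptyList()));
            }
        }
        return result;
    }

    // Builds the has("__typeName", T.in, [...]) fragment as a string.
    // Type names are assumed not to contain quote characters; no escaping is done.
    static String typeNameFilter(Map<String, List<String>> directSubtypes, String typeName) {
        StringJoiner names = new StringJoiner(",", "[", "]");
        for (String name : typeAndSubtypes(directSubtypes, typeName)) {
            names.add("'" + name + "'");
        }
        return "has(\"__typeName\", T.in, " + names + ")";
    }

    public static void main(String[] args) {
        Map<String, List<String>> hierarchy = new HashMap<>();
        hierarchy.put("Person", Arrays.asList("Manager"));
        System.out.println("g.V." + typeNameFilter(hierarchy, "Person"));
    }
}
```

With this approach the set of type names is resolved once, before the query runs, so the filter is a single property comparison and no intermediate vertex collection has to be held in memory to merge results.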
Diffs
-----
repository/src/main/java/org/apache/atlas/discovery/DataSetLineageService.java fd5dba7
repository/src/main/java/org/apache/atlas/discovery/graph/DefaultGraphPersistenceStrategy.java 266f27c
repository/src/main/java/org/apache/atlas/discovery/graph/GraphBackedDiscoveryService.java b637f90
repository/src/main/java/org/apache/atlas/repository/graph/GraphHelper.java 889236c
repository/src/main/scala/org/apache/atlas/query/ClosureQuery.scala daef582
repository/src/main/scala/org/apache/atlas/query/GraphPersistenceStrategies.scala a9dcdff
repository/src/main/scala/org/apache/atlas/query/GremlinEvaluator.scala ade4176
repository/src/test/scala/org/apache/atlas/query/GremlinTest2.scala 33513c5
Diff: https://reviews.apache.org/r/55813/diff/
Testing
-------
Ran the unit tests; they were successful.
Thanks,
Sarath Subramanian