[
https://issues.apache.org/jira/browse/ATLAS-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15924301#comment-15924301
]
Sharmadha Sainath commented on ATLAS-1661:
------------------------------------------
[~ayubkhan]
>> I believe the intent of import-hive.sh tool is not to track all the metadata
>> changes but to take a snapshot of the metadata at that point of time.
I completely agree with you. That is the intent. Currently the import-hive script
looks up the qualified name of the table and, if it is already present, updates
that entity. In the case mentioned in the description, it doesn't find the table
table1_new, so it creates a new table. But it would be a good-to-have feature if
the import-hive script had a mechanism to know the history of table1_new and
update accordingly.
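To illustrate the lookup behaviour described above, here is a minimal Python
sketch of an upsert keyed on qualified name. It is not the actual import-hive
code: the `catalog` dict, the `import_snapshot` function, and the
`db.table@cluster` qualified-name format are assumptions made for illustration.

```python
def import_snapshot(catalog, hive_tables, cluster="primary"):
    """Upsert each Hive table into `catalog`, keyed on qualified name.

    `catalog` is a plain dict standing in for the Atlas entity store
    (an assumption for this sketch, not the real Atlas API).
    """
    for db, table, attrs in hive_tables:
        qualified_name = f"{db}.{table}@{cluster}"
        if qualified_name in catalog:
            # A matching entity exists: update it in place.
            catalog[qualified_name].update(attrs)
        else:
            # No match: a brand-new entity is created.
            catalog[qualified_name] = dict(attrs, name=table)
    return catalog


catalog = {}
import_snapshot(catalog, [("default", "table1", {"owner": "hive"})])
# table1 is renamed in Hive while the hook is disabled, then imported again.
import_snapshot(catalog, [("default", "table1_new", {"owner": "hive"})])
print(sorted(catalog))
# -> ['default.table1@primary', 'default.table1_new@primary']
```

Because the snapshot carries only the current name, the second run has nothing
that links table1_new back to table1, so both entities remain.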
>> Are you suggesting to have the hiveHook capability built into import-hive.sh
>> tool also?
Yes, because that is what the customer would expect. The only difference the
customer would see is that the hive hook updates as and when a query is fired,
whereas the import-hive script applies updates in bulk when it is run.
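The difference can be sketched in Python with hypothetical names (`on_rename`
and a dict `catalog` standing in for the Atlas store; neither is part of
Atlas). A hook-style handler receives the rename event itself, so it knows both
the old and the new name and can re-key the existing entity, which a
point-in-time snapshot cannot do on its own.

```python
def on_rename(catalog, old_qn, new_qn, new_name):
    """Hook-style handling of a rename event (illustrative only).

    The event carries both qualified names, so the existing entity
    (and its identity, here a made-up `guid`) is preserved.
    """
    entity = catalog.pop(old_qn)
    entity["name"] = new_name
    catalog[new_qn] = entity
    return catalog


hook_catalog = {"default.table1@primary": {"name": "table1", "guid": "g-1"}}
on_rename(
    hook_catalog,
    "default.table1@primary",
    "default.table1_new@primary",
    "table1_new",
)
print(hook_catalog)
# -> {'default.table1_new@primary': {'name': 'table1_new', 'guid': 'g-1'}}
```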
> import hive script to handle updates like rename/delete
> -------------------------------------------------------
>
> Key: ATLAS-1661
> URL: https://issues.apache.org/jira/browse/ATLAS-1661
> Project: Atlas
> Issue Type: Improvement
> Components: atlas-intg
> Reporter: Sharmadha Sainath
> Priority: Minor
>
> 1. Disabled hive hook
> 2. Created table table1
> 3. Ran the import-hive.sh script; Atlas ingested table1.
> 4. Altered table table1, renaming it to table1_new.
> 5. Ran the import-hive.sh script; Atlas created a new table table1_new.
> table1 wasn't updated with the new name.
> This is the expected behavior with the import-hive script as opposed to the
> hive hook, as the hive hook is synchronous and import-hive is not.
> But for a customer, running import-hive.sh multiple times with many hive
> operations in between may result in inconsistencies in many scenarios, for
> example while applying ranger policies to the table, since it is not
> documented that the import-hive script should be run only once.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)