[ https://issues.apache.org/jira/browse/ATLAS-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16007924#comment-16007924 ]
David Radley commented on ATLAS-1690:
-------------------------------------
Hi [~madhan.neethiraj] and [~cmgrote],
This is interesting. It seems to me that if we are going to embrace using tags
in context-specific ways, and we feel tag propagation decisions should not be
made by the relationship author, then we do not want to hard-bake tag
propagation into the Atlas store. I am hearing that you feel there are use
cases where the relationship author or updater would be responsible for tag
propagation.
Can I just check how you are thinking about the tag propagation implementation
in your scenarios:
1) when a table/column is classified as PII, any lineage from this
table/view/column should also automatically be classified as PII. This means
we need a relationship from an entity-defined attribute in one entity to an
entity-defined attribute in the other entity. The current proposal does not
allow this, as the relationship specifies attribute names for each end that are
not defined in the entity. The current relationship proposal adds a new
attribute of type 'the other end'. I think we would need a new tag-propagating
relationship or, more generally, a mapping.
2) when a term is classified as PII, all entities that are associated with the
term should also automatically be classified as PII. So I am thinking that an
asset / entity will have an assignedTerms attribute. The GlossaryTerm would
have an assignedEntities attribute. These attributes would be added by virtue
of the relationship. If the GlossaryTerm was tagged PII, we need special Atlas
logic so that the tag propagates from assignedEntities to the entities'
assignedTerms, and then further special logic so that the assignedTerms tag is
picked up by the entity itself.
3) when a term is classified as PII, all terms that are synonyms of this term
(and all the entities associated with the synonym terms) also should
automatically be classified the same. We would need special Atlas logic for the
synonym case that propagates the PII tag from the synonym relationship to the
terms themselves.
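To make scenarios 2 and 3 concrete, here is a minimal sketch of the kind of
glossary-specific logic I have in mind. It is not Atlas code: the Term and
Entity classes, the propagate_from_term function and the example names are all
hypothetical, and following synonyms transitively is an assumption that we
would need to agree on.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Hypothetical stand-in for an asset / entity."""
    name: str
    tags: set = field(default_factory=set)


@dataclass
class Term:
    """Hypothetical stand-in for a GlossaryTerm."""
    name: str
    tags: set = field(default_factory=set)
    synonyms: list = field(default_factory=list)           # other Term objects
    assigned_entities: list = field(default_factory=list)  # Entity objects


def propagate_from_term(term, tag):
    """Tag the term, its synonyms, and every entity assigned to any of them.

    Synonyms are followed transitively (an assumption). The `seen` set stops
    the walk from looping, since synonym links normally point both ways.
    """
    seen = set()
    queue = deque([term])
    while queue:
        current = queue.popleft()
        if id(current) in seen:
            continue
        seen.add(id(current))
        current.tags.add(tag)
        for entity in current.assigned_entities:
            entity.tags.add(tag)
        queue.extend(current.synonyms)


# Example: "Customer" and "Client" are synonyms; a table is assigned to "Client".
customer = Term("Customer")
client = Term("Client")
customer.synonyms.append(client)
client.synonyms.append(customer)
crm_table = Entity("crm.customers")
client.assigned_entities.append(crm_table)

propagate_from_term(customer, "PII")
```

After the call, the PII tag is on both terms and on the table that is only
assigned to the synonym term.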
In summary I see 2 main tag propagation scenarios, and propose a way forward
for each:
- mapping between existing entity attributes that could be used as the basis
of propagation. I suggest we introduce the top level concept of a mapping
separately from this relationships Jira.
- the need for additional logic in Atlas to propagate tags across glossary
term-to-term or term-to-asset relationships. I can see this is useful, so I
suggest that I add a tag_propagation_hint enum, including NONE. The actual
propagation will depend on the specific relationship type. When we implement
the glossary, we will use this hint to propagate the tags in glossary-specific
Atlas logic. Further enhancements can occur as we bring in logic around the
classification level use case and enhanced Ranger integration; Ranger will need
to be able to override the hinted classification. Outside of the glossary case,
tags will not be propagated, even if the tag_propagation_hint is set on a
relationship.
Given this thinking, it makes sense for the tag_propagation_hint to be only on
the Glossary relationship types and not on the top level relationships.
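As a rough illustration of the hint, the sketch below shows an enum that
includes NONE and a check that only honours the hint for glossary relationship
types. Only NONE comes from the proposal above; the other enum values, the
RelationshipType class, its is_glossary flag and the propagation_directions
function are hypothetical names used for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class TagPropagationHint(Enum):
    NONE = "NONE"              # the only value named in the proposal
    ONE_TO_TWO = "ONE_TO_TWO"  # hypothetical: end 1 -> end 2
    TWO_TO_ONE = "TWO_TO_ONE"  # hypothetical: end 2 -> end 1
    BOTH = "BOTH"              # hypothetical: both directions


@dataclass(frozen=True)
class RelationshipType:
    """Hypothetical stand-in for a relationship type definition."""
    name: str
    is_glossary: bool
    hint: TagPropagationHint = TagPropagationHint.NONE


# Directions in which tags would flow for each hint value.
_DIRECTIONS = {
    TagPropagationHint.NONE: frozenset(),
    TagPropagationHint.ONE_TO_TWO: frozenset({"1->2"}),
    TagPropagationHint.TWO_TO_ONE: frozenset({"2->1"}),
    TagPropagationHint.BOTH: frozenset({"1->2", "2->1"}),
}


def propagation_directions(rel_type):
    """Return the directions in which tags propagate for this type.

    Outside the glossary case the hint is ignored and nothing propagates.
    """
    if not rel_type.is_glossary:
        return frozenset()
    return _DIRECTIONS[rel_type.hint]


synonym = RelationshipType("Synonym", True, TagPropagationHint.BOTH)
lineage = RelationshipType("table_lineage", False, TagPropagationHint.ONE_TO_TWO)
```

Here propagation_directions(synonym) gives both directions, while
propagation_directions(lineage) is empty even though a hint is set, which is
the "outside of the glossary case" rule described above.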
Madhan and [~mandy_chessell], does this make sense?
> Introduce top level relationships
> ---------------------------------
>
> Key: ATLAS-1690
> URL: https://issues.apache.org/jira/browse/ATLAS-1690
> Project: Atlas
> Issue Type: Improvement
> Reporter: David Radley
> Assignee: David Radley
> Labels: VirtualDataConnector
> Attachments: Atlas_RelationDef_Json_Structure_v1.pdf, Atlas
> Relationships proposal v1.0.pdf, Atlas Relationships proposal v1.1.pdf, Atlas
> Relationships proposal v1.2.pdf, Atlas Relationships proposal v1.3.pdf, Atlas
> Relationships proposal v1.4.pdf, Atlas Relationships proposal v1.5.pdf, Atlas
> Relationships proposal v1.6.pdf, Atlas Relationships proposal v1.7.pdf
>
>
> Introduce top level relationships including support for
> - many to many relationships
> - relationship names including the name for both ends and the relationship.