[ 
https://issues.apache.org/jira/browse/ATLAS-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15978467#comment-15978467
 ] 

Mandy Chessell commented on ATLAS-1690:
---------------------------------------

Hello David,
Comments follow based on V1.4

Pg2 "Metadata repositories store metadata. The context of a metadata object is 
dictated by its relationships."  I am not sure these sentences tell the 
complete story.  Maybe something like "The Apache Atlas metadata repository 
stores metadata objects and their relationships.  The relationships between the 
metadata objects are as important as the metadata objects themselves.  They 
explain how the data landscape is structured and how the components within it 
relate to the business and the governance requirements, ownership and other 
interested parties.   The relationships in Apache Atlas today provide support 
for containment (or part-of) relationships.  This is necessary to describe 
sub-components of a component - for example, a Hive Column is a sub-component 
of a Hive Table.  With these types of relationships, the lifetime of the 
sub-components is tied to their parent component.  So for example, if a hive 
table is deleted, then all of its columns should also be deleted.  This design 
is looking to add support for a new type of relationship between metadata 
objects that have independent lifetimes.  In fact, the creation of these 
relationships is an auditable action that can impact how data is 
discovered, understood, secured, managed and removed.  Such relationships 
include when Glossary ..."
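
To make the lifetime distinction concrete, here is a rough Python sketch.  It 
is only an illustration: MetadataStore, add_entity, associate and delete_entity 
are hypothetical names, not Atlas APIs, and the entity names are made up.  It 
assumes that deleting a parent removes its contained sub-components, while 
deleting one end of an independent relationship removes the relationship but 
leaves the entity at the other end alone.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataStore:
    """Toy in-memory store; hypothetical, not the Atlas repository API."""
    entities: set = field(default_factory=set)
    # parent -> set of contained children (lifetimes tied to the parent)
    contains: dict = field(default_factory=dict)
    # independent relationships stored as (relationship name, end1, end2)
    associations: set = field(default_factory=set)

    def add_entity(self, name, parent=None):
        self.entities.add(name)
        if parent is not None:
            self.contains.setdefault(parent, set()).add(name)

    def associate(self, rel_name, end1, end2):
        self.associations.add((rel_name, end1, end2))

    def delete_entity(self, name):
        # Containment: sub-components are deleted with their parent.
        for child in list(self.contains.pop(name, ())):
            self.delete_entity(child)
        # Detach the entity from its own parent, if it has one.
        for children in self.contains.values():
            children.discard(name)
        # Independent relationship: the link is removed, the other end survives.
        self.associations = {a for a in self.associations if name not in a[1:]}
        self.entities.discard(name)


store = MetadataStore()
store.add_entity("hive_table:orders")
store.add_entity("hive_column:orders.id", parent="hive_table:orders")
store.add_entity("glossary_term:OrderId")
store.associate("semantic_assignment",
                "glossary_term:OrderId", "hive_column:orders.id")

# Deleting the table cascades to its column; the glossary term is untouched.
store.delete_entity("hive_table:orders")
```

After the delete, only the glossary term remains in the store and the 
relationship to the deleted column is gone.  In a real repository the removal 
of that relationship would also need to be audited.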

Pg2  "If these links are made incorrectly (purposely or otherwise) data can be 
inappropriately exposed." This comment is out of place - it is only true if the 
relationship is involved in access control.  A more general comment could be 
"If these links are made incorrectly (purposely or otherwise) data may be 
inappropriately used or governed."

Font of JSON example on page 4 is inconsistent - harder to read than necessary.

pg5 - "Relationship constraints" - this is the first time this term is 
mentioned - it should be introduced in the examples above.

pg5 - "This name will help us name an association and its associated
classification."  Not sure what classification means in this sentence.  Also 
need a description of why an association needs a name (I am thinking of this as 
a Type name - is that right?).  The name is important because the creation of 
these types of relationships is a deliberate act of governance and we need to 
be able to describe their use - and govern their lifecycle.

pg 5 "“Address” and “Person”; a person has addresses, and addresses have people 
living in them. In this case, there is no obvious direction, so a bidirectional 
relationship is natural way of associating these concepts; the alternative 
would be 2 directional relationships that would not be kept in sync."  Please 
use a metadata example - it is confusing to talk about data relationships here.

pg 5 "There are 2 main styles of relationships, tight and loose relationships." 
 Why have new names been introduced for these when the top of the doc states 
it is using UML names?  Also the names are misleading.  There is nothing loose 
about the association between a glossary term and a database column.

pg 6 "In the case of tight relationships, the top entity and its children are 
governed as one, as the lifecycles of the children are tied to the parent. "  
It is true that the lifecycles are linked but it does not mean the governance 
is tied - for example, the confidentiality classification of a table may be 
different from that of the columns it is made up of.  Governance rules may be 
defined to act on specific columns and not on the table as a whole.

pg 6/7 - RelationshipDef example - please use metadata examples not data 
examples - it is confusing because you would never define types for customer 
and account in Atlas.

pg7 "The entity instances use Atlas object ids pointing to the relationship 
instance (which has a guid)."  This needs further explanation and an example.

pg 8 "Read" - what are the parameters on read - is this a single relationship 
operation?

pg 8 "Aggregation implies that here is containment "  I know what you mean but 
aggregation and containment are different things in UML and so this statement 
is not logically correct.

pg 8 "A natural way to specify aggregation would be to have an isContainer 
Boolean flag, defaulting to false and specified on one of the endpoints in the 
relationship."  This should say that the flag can only be set on one end.
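
As a rough illustration of the "only one end" rule, here is a small Python 
sketch.  EndDef and validate_container_flag are hypothetical names (not Atlas 
classes), and the category-to-term types are used purely as a metadata example.

```python
from dataclasses import dataclass


@dataclass
class EndDef:
    """One end of a relationship definition (hypothetical shape)."""
    entity_type: str
    name: str
    is_container: bool = False  # defaults to false, as the proposal suggests


def validate_container_flag(end1: EndDef, end2: EndDef) -> None:
    """Reject definitions where both ends claim to be the container."""
    if end1.is_container and end2.is_container:
        raise ValueError(
            "isContainer can be set on only one end of a relationship")


# Category-to-term: the category end is the container, the term end is not.
category_end = EndDef("GlossaryCategory", "category", is_container=True)
term_end = EndDef("GlossaryTerm", "terms")
validate_container_flag(category_end, term_end)  # accepted, no exception
```

A definition with the flag set on both ends would raise ValueError, and one 
with the flag on neither end would simply be a plain association.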

pg8 - aggregations example - please use a metadata example, such as category 
to term.

pg8 - observations - a relationship described by a relationshipDef cannot be 
mandatory.  The isOptional flag is obsolete.  Can we remove it?  Where a 
governance action needs two entities to be linked in order to function, this 
needs to be handled by state attributes that the action can test.


> Introduce top level relationships
> ---------------------------------
>
>                 Key: ATLAS-1690
>                 URL: https://issues.apache.org/jira/browse/ATLAS-1690
>             Project: Atlas
>          Issue Type: Improvement
>            Reporter: David Radley
>            Assignee: David Radley
>              Labels: VirtualDataConnector
>         Attachments: Atlas Relationships proposal v1.0.pdf, Atlas 
> Relationships proposal v1.1.pdf, Atlas Relationships proposal v1.2.pdf, Atlas 
> Relationships proposal v1.3.pdf, Atlas Relationships proposal v1.4.pdf
>
>
> Introduce top level relationships including support for 
> - many-to-many relationships
> - relationship names including the name for both ends and the relationship.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
