[
https://issues.apache.org/jira/browse/HADOOP-14876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16205748#comment-16205748
]
Steve Loughran commented on HADOOP-14876:
-----------------------------------------
h4. Privacy scope
* add: sometimes things are marked as private when they end up being essential
(example: UserGroupInformation). In such situations, raise an issue with the
team to see if we can add some form of @public tag (HADOOP-10776).
Now, what about the fact that the distributed shell example uses (or at least
did when I last looked) private code,
e.g. org.apache.hadoop.io.DataOutputBuffer, the timeline plugin,
NMClientAsyncImpl, ...? You can look at the imports and probably 20% of the
class imports (not interfaces, yarn records) are tagged as private/limited
private. We are not in a position to tell people not to use @Private, given
that we consider doing so essential even for basic example yarn apps.
* What does it mean if something is tagged as LimitedPrivate for one app (esp.
HBase & Hive, which aren't within our own codebase)? To me, that says "we know
these things get used downstream, or we've added them as a special secret
back-door". But who gets to choose which apps can actually use it? Limited
private + outside our codebase == public, which is something we should
acknowledge when scoping things. And LimitedPrivate(Mapreduce) often means
"every YARN app needs these".
* What does it mean if a release removes/changes something you depended on
which was tagged private/limited private? Complain. It may get ignored, but the
change may have been made without awareness of how widely the code is used.
h4. Semantics
I take this bit very seriously, having been deeply involved in the original
paragraphs, and being an aficionado of all D.L. Parnas's writings on the notion
of "interface". As far as I'm concerned, the de facto definitions of semantics
are defined in our unit tests ("what we expect") and in those of widely used
applications ("what HBase and Hive expect"). We know that if we break the
latter then people complain, and, while we may do so, it's not something we
want to do.
L113. Yeah, right. It's usually the first port of call, & if you think
otherwise, you're not writing enough downstream code.
The original Compatibility.md calls out that some bits of the system have
non-normative specifications, e.g. the filesystem. I would consider that
significantly more normative than the javadocs, most of which are vague
aspirations of functionality. Usually the javadocs don't have any mention of
concurrency, which matters a lot; for that you do end up delving into the
source and/or using it in a way which appears to work (HDFS's use of input
streams), when in fact callers are just relying on accidental bits of the
semantics which we are now expected to maintain.
+ Maybe mention StreamCapabilities.hasCapability as a way of determining if FS
streams offer a feature; say it's more to support variants in back ends rather
than a way for us to remove things. But do mention: it is good practice to
check for new things rather than assume that if HDFS implements it, it works
everywhere.
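
To make the "check, don't assume" advice concrete, here is a minimal sketch of
the probing pattern. It is not taken from the patch. To stay runnable without
Hadoop on the classpath it declares a local stand-in interface with the same
method shape as org.apache.hadoop.fs.StreamCapabilities; FakeStream and
supports() are hypothetical names used only for illustration. The capability
strings "hflush" and "hsync" are assumed to match the names Hadoop uses.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class CapabilityProbe {

    /**
     * Local stand-in (assumption) mirroring the single method of
     * org.apache.hadoop.fs.StreamCapabilities. Real code would import
     * the Hadoop interface instead of declaring this one.
     */
    public interface StreamCapabilities {
        boolean hasCapability(String capability);
    }

    /** Hypothetical stream that advertises a fixed set of capabilities. */
    public static final class FakeStream implements StreamCapabilities {
        private final Set<String> capabilities;

        public FakeStream(String... capabilities) {
            this.capabilities = new HashSet<>(Arrays.asList(capabilities));
        }

        @Override
        public boolean hasCapability(String capability) {
            return capabilities.contains(capability);
        }
    }

    /**
     * Returns true only if the stream declares the capability.
     * A stream that does not implement the interface at all is
     * treated as not supporting the feature.
     */
    public static boolean supports(Object stream, String capability) {
        return stream instanceof StreamCapabilities
            && ((StreamCapabilities) stream).hasCapability(capability);
    }

    public static void main(String[] args) {
        Object hdfsLike = new FakeStream("hflush", "hsync");
        Object objectStoreLike = new FakeStream();   // declares nothing
        Object legacy = new Object();                // no interface at all

        System.out.println(supports(hdfsLike, "hsync"));        // true
        System.out.println(supports(objectStoreLike, "hsync")); // false
        System.out.println(supports(legacy, "hflush"));         // false
    }
}
```

The point of the sketch is the default: anything that does not explicitly
declare a capability is treated as lacking it, so downstream code degrades or
fails early instead of assuming HDFS behaviour on every back end.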
L160. "The audit log format may not change incompatibly between major releases."
?? "may change"? or "must not"?
L189. Need to explain how to differentiate log chaff from "real" output.
Indeed, I'm curious myself.
L208. We don't require log4j though; other back ends may be supportable.
L229. Nothing is called "s3.*" any more.
L298. "No new (exposed) dependency will be added to Hadoop between major
releases."
Can't make that guarantee. Qualify with "via the shaded clients".
Things that we've glossed over
* No statement on supported operating systems, filesystems, x86 parts, IPv4 vs
IPv6. If I code for Windows, how long will hadoop-client work there? What if I
target SPARC?
* Concurrency: say "we try not to make things worse"? Degradations are
considered defects, except when it's just some accidental side effect of
excessive logging?
> Create downstream developer docs from the compatibility guidelines
> ------------------------------------------------------------------
>
> Key: HADOOP-14876
> URL: https://issues.apache.org/jira/browse/HADOOP-14876
> Project: Hadoop Common
> Issue Type: Improvement
> Components: documentation
> Affects Versions: 3.0.0-beta1
> Reporter: Daniel Templeton
> Assignee: Daniel Templeton
> Priority: Critical
> Attachments: HADOOP-14876.001.patch
>
>