[
https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561363#comment-16561363
]
Adrian Cole commented on HADOOP-15566:
--------------------------------------
TL;DR: I would advise evaluating all the options, perhaps by resurrecting a
small part of HTrace in order to give a more seamless migration and support
path. This allows *sites* to participate in the decision making *before
committing to an approach.*
Depending on the choices permitted, this might imply API or model changes to
make it work. Doing this decoupled from Hadoop moves the thrash to where it
belongs.
Ironically, while I was at Twitter, the data services team preferred HTrace to
Zipkin, even though Zipkin was there. It would be nice to also have a focus on
brownfield, such as a solution that works both today and tomorrow. *Many won't
upgrade Hadoop to 3.1 for many years.* Sites should come first and, absent
input from them, we should try to act on their behalf. To say it again: thrash
behind the API before considering thrashing an API.
Resurrecting the "api" part would also allow a less conjectural guide to moving
forward, one that first has to tackle technical concerns, such as parents.
It is easy to say how something might work and another thing entirely to have
it work, and have it work efficiently, and have it work in ways that are safe.
Doing this buys more time to make informed decisions and gives people who have
never worked on data systems a chance to get that experience first. Even in
services tracing, we've noticed a lot of things left to end users to sort out.
It seems data services should have even more rigor.
For example, HTrace code includes a lot of guards that prevent excess network
communication. These things are inconsistent across OT, as threading concerns
are an implementation detail: there is neither a spec nor a TCK on reporting,
only some guidance to be good. One could conjecture that Census would be good
for Hadoop if it is good for Google internally with Bigtable. However, even
that shouldn't be left to conjecture. Many ecosystems have a fair amount of
full-time staff, and could possibly use those staff to vet the concerns
already addressed by the HTrace libraries.
Anyway, I hope this response is not ruffling feathers. I've tried hard not to
have it do so. While I am less qualified than some to participate in this
discussion, you can look at the source history and otherwise. I have personally
fixed code here and elsewhere to make interop work. I also collaborated with a
site owner to open up the transports. I primarily take care of the openzipkin
volunteer community, even if I am paid a salary. I don't make any more or less
money if Hadoop chooses one thing vs another.
> Remove HTrace support
> ---------------------
>
> Key: HADOOP-15566
> URL: https://issues.apache.org/jira/browse/HADOOP-15566
> Project: Hadoop Common
> Issue Type: Improvement
> Components: metrics
> Affects Versions: 3.1.0
> Reporter: Todd Lipcon
> Priority: Major
> Labels: security
> Attachments: Screen Shot 2018-06-29 at 11.59.16 AM.png,
> ss-trace-s3a.png
>
>
> The HTrace incubator project has voted to retire itself and won't be making
> further releases. The Hadoop project currently has various hooks with HTrace.
> It seems in some cases (eg HDFS-13702) these hooks have had measurable
> performance overhead. Given these two factors, I think we should consider
> removing the HTrace integration. If there is someone willing to do the work,
> replacing it with OpenTracing might be a better choice since there is an
> active community.