On Sun, Sep 23, 2007 at 04:21:58AM -0700, Steve Langasek wrote:
> On Fri, Sep 21, 2007 at 01:07:49PM +1000, Anthony Towns wrote:
> > On Thu, Sep 20, 2007 at 06:19:10PM -0700, Steve Langasek wrote:
> > > So do you have a use case where you think the behavior described
> > > in rule 9 *is* desirable?
> > Any application written assuming this behaviour works correctly on
> > Windows, Solaris, *BSD and glibc based systems in general, but not
> > on Debian.
> So my argument here is that I don't believe there *are* any
> applications being written that assume this behavior; and that even
> if there were, such applications would either work just fine with the
> previous getaddrinfo() behavior, or be too pathological to live.
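For what it's worth, the behaviour in question is easy to observe. As
a rough sketch -- getent ahosts resolves via getaddrinfo(), and
www.example.org here is just a stand-in for any name with several
addresses:

    # Print the first address getaddrinfo() returns, five times over,
    # then count how often each address came out on top.
    for i in 1 2 3 4 5; do
        getent ahosts www.example.org | awk '/STREAM/ { print $1; exit }'
    done | sort | uniq -c

With RFC3484 sorting the same address comes first every time; with the
older behaviour the counts spread across the addresses.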
There are two aspects to RFC3484's behaviour: first, it creates a much
more stable ordering of its results than could have been expected
otherwise; second, it tries to make that ordering better than a random
ordering would be wrt routing.

Stability is useful in any case where the servers hosting a particular
service might be out of sync with each other; eg, if stability could
be assumed we'd have fewer errors where an invocation of "apt-get
update" chooses one mirror, and a subsequent "apt-get upgrade" chooses
a different server that hasn't finished syncing. Hopefully "apt-get"
isn't considered "too pathological to live"...

Better routing has less direct benefit to the client, probably limited
to slightly better ping times, with a small chance of somewhat cheaper
bandwidth costs. For the people providing the service, it allows
better assumptions about load balancing -- you can expect the servers
based in a particular area to be serving a load proportional to the
number of users in that area, rather than having the load fairly
evenly distributed globally. There are other ways of achieving that
which don't rely on how the client's resolver is implemented, of
course; and if the routing ends up worse, those benefits turn into
drawbacks instead.

> Instead, taken over the whole Internet rule 9 is statistically a
> pseudo-randomization relative to the *correct* sorting[1],

If that were the case it would be no worse than round-robin selection
of the preferred address. But you can only take it over the whole
Internet if you assume an equal distribution across all IPs, which
isn't valid for IPv4 (where there's presumably a significant bias to
private IPs), and presumably isn't valid for any particular service,
which will be heavily biased to particular IP ranges by correlation
with location or language...

> One of the existing use cases that breaks is round-robin DNS.

Round-robin DNS isn't broken; the expectation of (approximately) equal
load-distribution across all servers in a round-robin is broken.

> There might be reasons why RR DNS would be an acceptable sacrifice
> in favor of other beneficial features, but rule 9 as written offers
> *no* benefits in the general case!

Even without the possibility of applications like apt-get benefiting
from stability of results, I don't think we've done anywhere near
enough of a review to be declaring that there aren't any benefits to
rule 9.

> So I don't see that much weight should be given to whether other
> operating system vendors choose to comply with a rule which is,
> fundamentally, misguided and broken.

As far as I can see, for rule 9 to be fundamentally misguided and
broken, the concept of providing a stable answer, or a
better-than-random ordering, would need to be harmful. If those are
beneficial, even in some cases, then we've got a problem in the
details of the specification, not a fundamental issue.

(Note that in almost all actual cases prefix matching is the only
reordering rule that has any effect, so without that rule or a
replacement, both the stability and any improvements in routing
disappear.)

Note that stability isn't definitively a good thing -- if the first
server you connect to happens to be the only one that's
down/unreachable, then with a stable resolver you need specific
failover code to try a different address; whereas if you can expect
gethostbyname() to return a different first result each time, you can
just rerun the program.
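The sort of failover code I mean would be something along these lines
-- again just a sketch, with www.example.org and port 80 standing in
for the real service, and nc(1) standing in for the application's own
connection logic:

    # Try each address in getaddrinfo()'s order rather than only the
    # first, stopping at the first one that accepts a connection.
    for addr in $(getent ahosts www.example.org | awk '/STREAM/ { print $1 }'); do
        if nc -z -w 5 "$addr" 80; then
            echo "connected via $addr"
            break
        fi
    done

With a stable ordering, every client grinds through the same dead
address before reaching a working one; with a randomised ordering,
most clients never hit it in the first place.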
> Furthermore, even if gethostbyname() has been deprecated in POSIX,
> it's relevant that there is still plenty of software in Debian that
> uses this interface[1]. Almost all of this software is going to be
> IPv4-only; if we want Debian to be fully IPv6-capable, these are
> programs that will need to be updated to use the getaddrinfo()
> interface, at which point they will cease to work correctly with
> round-robin DNS in the absence of additional code to re-randomize
> addresses(!).

Uh, round-robin DNS isn't a guarantee that any individual client will
get different or randomised results -- and the argument that
round-robin won't break anything that relies on rule 9 goes the other
way too.

Further, having getaddrinfo() behave differently for IPv4 and IPv6
isn't completely helpful in making Debian support IPv6 -- if we change
a program from gethostbyname() to getaddrinfo() on the assumption that
they behave the same way, and that holds for IPv4 but the app then
needs extra randomisation or the addition of fail-over code to work
sensibly over IPv6, we're not done.

> The more work that is needed to make an IPv4 application function
> correctly with both IPv4 and IPv6, the less likely it is that this
> work will get done;

Exactly like that.

> > As it happens I largely agree with that. I don't agree with making
> > a decision to go against an IETF standard and glibc upstream
> > lightly, though, no matter how many caps Ian expends repeating that
> > it's at the least mature level of Internet standard. If it's also
> > the case that the RFC-specified behaviour is a de facto standard
> > amongst other OSes, as the above seems to indicate, then that's
> > even more reason to make sure we have a clear decision backed up by
> > good, clear reasoning.

> Yes, this isn't a decision to make lightly, but I believe the
> reasoning I've offered above is sound.

It's sound to a point, but we haven't established an understanding of
what common behaviour on today's internet actually is, we haven't done
any review of how getaddrinfo() is being used outside of our own
experiences, and we haven't examined how we could achieve the apparent
goals underlying the proposed standard without the drawbacks that are
concerning us.

> [1] where the correct sort order requires detailed knowledge of
> global route tables and is therefore not even remotely feasible to
> implement on any resolver hosts other than those with a view into BGP

Uh, no, it requires knowledge of *local* route tables -- once you get
far enough away from the client, trying to track differences in time
and cost gets ridiculous. It's feasible to do it either by having a
table of cheaper or closer prefixes [0], or by doing some experimental
probing along the lines of netselect. BGP views are helpful if you're
trying to decide routing from the server side, but that's approaching
the problem from the opposite end.

> [2] on my etch workstation:
>   $ for prog in /usr/bin/*; do nm -Du $prog 2>/dev/null \
>       | grep -q gethostbyname && echo $prog; done | wc -l
>   257

On my lenny laptop:

    $ for prog in /usr/bin/*; do nm -Du $prog 2>/dev/null \
        | grep -q gethostbyname && echo $prog; done | wc -l
    110

    $ for prog in /usr/bin/*; do nm -Du $prog 2>/dev/null \
        | grep -q getaddrinfo && echo $prog; done | wc -l
    62

I'm not sure what that proves -- [2] was an unreferenced footnote,
afaics...

Cheers,
aj

[0] For example, my ISP has one at:
    http://www.internode.on.net/content/unmetered/ip-list.htm