On 17. 07. 25 12:08, John Levine wrote:
It appears that Libor Peltan  <[email protected]> said:
The protocol is broken right now. The authoritative operators/vendors
are not instructed (or even warned) to do anything with keytag
conflicts, and some major resolver vendors unilaterally invented and
implemented a feature that makes the resolution randomly fail. The issue
is sneaky because it happens with very low probability (something like
1/64k^2 * #ofZonesVulnerableToKeytagConflict), it might not yet have
even happened at all, but it might break things terribly.

I have trouble working up much enthusiasm for spending time on a problem that
happens with such low probability that as far as we know it has never happened,
and may well not happen between now and the time that the DNS is replaced by
something else.  Also, it is my impression that existing resolvers can all handle
single collisions, so keeping in mind that there's about 14 bits of randomness
in real keytags, it's more like (1/16k^3 * #nzones).

If anyone has ever come across a keytag collision with more than two valid keys,
I would really like to hear about it. Zones you deliberately created by
generating millions of keys until they collide probably don't count.
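
For comparison, the two back-of-the-envelope estimates quoted above can be put side by side. This is only a sketch of the arithmetic as written in the thread. The zone count is a hypothetical placeholder, and the function names are made up for illustration.

```python
from fractions import Fraction

KEYTAG_SPACE = 2**16      # "64k": all 16-bit key tag values
EFFECTIVE_SPACE = 2**14   # "16k": roughly 14 bits of randomness, per the remark above


def first_estimate(n_zones):
    """1/64k^2 * #zones, the first figure quoted in the thread."""
    return Fraction(n_zones, KEYTAG_SPACE**2)


def second_estimate(n_zones):
    """1/16k^3 * #zones, the revised figure quoted in the thread."""
    return Fraction(n_zones, EFFECTIVE_SPACE**3)


# Hypothetical zone count, for illustration only (not a measured number).
n_zones = 10_000_000

print(float(first_estimate(n_zones)))
print(float(second_estimate(n_zones)))
# The ratio between the two estimates does not depend on n_zones.
print(first_estimate(n_zones) / second_estimate(n_zones))
```

Whatever zone count is plugged in, the two formulas differ by a constant factor of 2^42 / 2^32 = 1024.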

I'm happy to see Libor, who would potentially be doing all the hard work without seeing any benefit on the auth side, understands the problem.

Unfortunately John's reaction above is so inaccurate it is actually wrong.

Compare these two statements:
- Validators limit the "number of collisions". (see above)
- Validators limit the number of "validation attempts". (reality today)

This is important because RRSIGs can be invalid for _other_ reasons than collisions. If we continue allowing collisions, they will deplete the limited amount of work a validator is willing to do, which removes redundancy/headroom from the overall system.
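
To make the difference concrete, here is a toy model of a validator with a budget on validation attempts. It is a sketch under simplifying assumptions, not how any real resolver is implemented: the `Key` and `Sig` types, the `validate` function, and the budget of 3 are all invented for illustration, and the cryptographic check is replaced by a name comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Key:
    tag: int
    name: str


@dataclass(frozen=True)
class Sig:
    tag: int
    signer: str  # name of the key that really made it; "" means the RRSIG is broken


def validate(sigs, keys, max_attempts):
    """Return (validated, attempts_used) under a budget of verification attempts."""
    attempts = 0
    for sig in sigs:
        for key in keys:
            if key.tag != sig.tag:
                continue  # key tag mismatch costs no crypto work
            if attempts >= max_attempts:
                return False, attempts  # budget exhausted, give up
            attempts += 1  # stands in for one expensive signature verification
            if sig.signer == key.name:
                return True, attempts
    return False, attempts


single = [Key(1234, "B")]
colliding = [Key(1234, "A"), Key(1234, "B")]        # two keys sharing one key tag
clean_sigs = [Sig(1234, "B")]
broken_then_good = [Sig(1234, ""), Sig(1234, "B")]  # one bad RRSIG next to a good one

print(validate(clean_sigs, colliding, 3))        # collision alone fits in the budget
print(validate(broken_then_good, single, 3))     # broken RRSIG alone fits in the budget
print(validate(broken_then_good, colliding, 3))  # both together exhaust the budget
```

In this model neither the collision nor the broken RRSIG causes a failure on its own. Only the combination uses up the attempts, which is the loss of headroom described above.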

As Libor pointed out, this bites unexpectedly and is hard to debug. We at ISC experienced issues with this as soon as the KeyTrap fixes were released. It turned out that some domains had mysteriously incorrect RRSIGs alongside correct ones, and nobody noticed until validation exploded; the problem was formerly hidden by unlimited retries.

Regarding the handwaving that 'it is not a performance issue', I invite you to play with this command:
openssl speed -seconds 1 ecdsap384

The verify/s number it reports is an upper bound for cache-miss QPS, without any collisions allowed. I'm looking forward to an experiment which shows that a chain of CNAMEs can eat that much CPU.
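
The bound can be written down as a small calculation. This is a rough sketch that assumes every cache-miss answer costs at least one signature verification and that each extra attempt (for example, from a key tag collision) costs one more. The function name is hypothetical, and the rate below is a placeholder to be replaced with the figure measured on your own hardware.

```python
def cache_miss_qps_bound(verifies_per_second, attempts_per_query=1):
    """Upper bound on cache-miss QPS given a signature verification rate."""
    if attempts_per_query < 1:
        raise ValueError("a validated answer needs at least one verification")
    return verifies_per_second / attempts_per_query


# Placeholder rate, NOT a real measurement; substitute your own openssl speed result.
measured_verifies_per_second = 1000.0

for attempts in (1, 2, 4):
    print(attempts, cache_miss_qps_bound(measured_verifies_per_second, attempts))
```

Every additional attempt per query divides the bound accordingly.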

--
Petr Špaček
Internet Systems Consortium

_______________________________________________
DNSOP mailing list -- [email protected]
To unsubscribe send an email to [email protected]
