[DNSOP] Re: Character encoding in DNS

Andrew Sullivan Sat, 15 Nov 2025 07:13:17 -0800

Dear colleagues,

On Fri, Nov 14, 2025 at 10:38:32PM -0500, Dave Lawrence wrote:

> Okay, cool.  I think for a new definition like this you can just
> declare that the TXT RDATA should be interpreted as a continuous UTF-8
> string without regard to the 255 octet segment boundaries.  It feels
> like maybe 5.2 has slightly too many words for accomplishing this, but
> seems okay enough.


Please do not do that.  There was a whole WG (PRECIS) that worked out all
manner of mechanisms for designing protocol elements and how they need to be
handled.  At the very least, the draft is going to need to specify which
normalization form must be used; but my guess is that there are large parts
of the Unicode range that are inappropriate for this use (full disclosure: I
have read the title of this draft and no more).  What you should _never_ do
in a protocol element is "just dump UTF-8 in there," because doing so causes
problems.  See the output of the PRECIS WG to see how and why and what to do
about it.
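
To make the concern concrete, here is a minimal Python sketch of what "a
continuous UTF-8 string without regard to the 255 octet segment boundaries"
amounts to, followed by the kind of check that plain decoding does not give
you.  The function names and the acceptance policy are hypothetical and are
NOT a PRECIS profile; they only illustrate that decoding alone says nothing
about normalization or about which code points are allowed.

```python
import unicodedata

def join_txt_segments(segments):
    """Concatenate TXT character-strings, then decode the whole value once."""
    for seg in segments:
        if len(seg) > 255:
            raise ValueError("TXT character-string longer than 255 octets")
    # Strict decoding: raises UnicodeDecodeError on malformed UTF-8.
    return b"".join(segments).decode("utf-8")

def looks_acceptable(text):
    """Hypothetical policy (an assumption, not PRECIS): require NFC and
    reject control, format, unassigned, private-use and surrogate code points."""
    if not unicodedata.is_normalized("NFC", text):
        return False
    banned = {"Cc", "Cf", "Cn", "Co", "Cs"}
    return all(unicodedata.category(ch) not in banned for ch in text)

# U+00E9 is two octets in UTF-8; split it across a segment boundary.
raw = "caf\u00e9".encode("utf-8")
segments = [raw[:4], raw[4:]]
text = join_txt_segments(segments)
```

Decoding each segment on its own would fail here, because the boundary falls
in the middle of a character.  Joining first fixes that, but the joined
string still needs a rule like `looks_acceptable` (or, better, a real PRECIS
profile) before it is safe to compare or display.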

> I'd be curious to know how various operations portals handle the
> record creation via their normal "Create TXT" interface, but half
> expect that having UTF-8 pasted in will pretty much just work out.


Yes, which is of course why overloading TXT this way is risky.  If I paste 
NFC-normalized UTF-8 strings, and you paste NFD-normalized UTF-8 strings, even 
if the strings look exactly the same to a human they will not compare as 
equivalent.
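
A small Python illustration of the mismatch, using only the standard
`unicodedata` module.  The sample word is my own choice; any text with
accented letters behaves the same way.

```python
import unicodedata

# The same visible word in two normalization forms.
nfc = unicodedata.normalize("NFC", "re\u0301sume\u0301")  # precomposed letters
nfd = unicodedata.normalize("NFD", nfc)                   # base letter + combining accent

# What a byte-wise comparison of the two TXT values would see.
same_bytes = nfc.encode("utf-8") == nfd.encode("utf-8")

# The values match only once both sides agree on one normalization form.
same_after_nfc = unicodedata.normalize("NFC", nfd) == nfc

print(len(nfc), len(nfd), same_bytes, same_after_nfc)
```

The two strings render identically but differ in length and in their UTF-8
octets, so a plain octet comparison treats them as different values.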

Best regards,

A
--
Andrew Sullivan
[email protected]

_______________________________________________
DNSOP mailing list -- [email protected]
To unsubscribe send an email to [email protected]
