Yes, it is binary data. Any binary content is permitted. It depends only
on how you choose to display it.
On 19/11/2025 21:58, Andrew Sullivan wrote:
On Wed, Nov 19, 2025 at 07:09:13PM -0500, Marco Davids (IETF) wrote:
That said, I prefer not to pre-emptively include such guidance in my
draft.
The current text seems sufficient and in line with the style and
intent of other I-Ds and RFCs:
It includes a paragraph ensuring interoperability (e.g., input from a
sender such as 'この美しいドメイン名を購入してください。' is correctly
interpreted by the receiver) and cautions in Security Considerations
on careful parsing.
The advice is inadequate, if you're going to require people to
interpret a series of octets as octets in a UTF-8-encoded string. At
the very least, you need to specify whether automatic processing of
any kind of that content is permitted. If it _is_ permitted (and it
would appear to me that it is, given what you say later about careful
parsing &c.), then it seems to me you're going to have to specify
limits on what code points may or may not be included, normalization
forms, &c. If you don't specify all of that, then attempting to
interpret the octets in the RDATA as being UTF-8-encoded strings will
be at least fragile.
How did we get from binary-only data to automatic processing and
normalization forms of some kind? If it contains only printable characters
in a verified encoding, it is reasonably safe not to escape each byte.
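A minimal sketch of that display rule, in Python. The function name
display_rdata and the exact escaping choices are assumptions made for
illustration, not taken from the draft or from any implementation: the
octets are shown as text only if they decode as strict UTF-8 and every
character is printable; otherwise each octet outside printable ASCII is
escaped as \DDD (decimal), in the style of zone-file presentation format.

```python
def display_rdata(data: bytes) -> str:
    """Render RDATA octets for display only (hypothetical helper)."""
    try:
        # Strict decoding rejects invalid or overlong UTF-8 sequences.
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        text = None

    if text is not None and all(ch.isprintable() for ch in text):
        # Verified encoding, printable only: show as-is, escaping just
        # the backslash and the double quote so output stays unambiguous.
        return text.replace("\\", "\\\\").replace('"', '\\"')

    # When in doubt, escape: keep printable ASCII, write the rest as \DDD.
    return "".join(
        chr(b) if 0x20 <= b <= 0x7E and b not in (0x22, 0x5C) else f"\\{b:03d}"
        for b in data
    )
```

Nothing here sorts, compares, or normalises the content; the octets
themselves are never modified, only their rendering.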
It is not clear from the rest of the document whether the "use UTF-8"
principle is in effect for all the subtypes possible in the record, or
only in the ftxt subtype. For instance, is the host part of an furi
entry required to be an ASCII string (i.e. if it's an IDN, must it be
the A-label form?) or may it include UTF-8 strings beyond the
ASCII-equivalent range? It seems to me it would be valuable to
specify which is meant.
Best regards,
A
Domain labels are out of scope, and IDN is unrelated. Labels have to be
compared case-insensitively, which requires decoding each code point and
locating the matching lowercase/uppercase letter. This is only about how
to display the content of records, where no sorting or case-insensitive
comparison needs to be done. We have some safety checks to avoid
outputting complete garbage, and escaping is always possible when in
doubt. But no form of normalisation needs to be done. Contents are
produced somewhere else.
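To illustrate the distinction between display and comparison, here is a
small Python sketch. The helper names for_display and same_text are
hypothetical. Two byte strings that differ only in Unicode normalisation
form are both valid UTF-8 and can both be displayed unchanged;
normalisation would matter only if something compared them.

```python
import unicodedata

# The same visible word in two normalisation forms.
nfc = "caf\u00e9"                        # precomposed U+00E9
nfd = unicodedata.normalize("NFD", nfc)  # 'e' followed by U+0301

def for_display(data: bytes) -> str:
    """Display path: decode and pass through, no normalisation."""
    return data.decode("utf-8")

def same_text(a: bytes, b: bytes) -> bool:
    """Comparison path (not needed for display): normalise first."""
    norm = lambda d: unicodedata.normalize("NFC", d.decode("utf-8"))
    return norm(a) == norm(b)

wire_nfc = nfc.encode("utf-8")
wire_nfd = nfd.encode("utf-8")
```

Here wire_nfc and wire_nfd differ as octets, yet for_display returns each
of them exactly as the producer wrote it; only same_text needs to know
about normalisation forms.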
The content of records remains application-specific. If I look at the
google.com TXT response, I do not see any escaped data, even though it
also contains binary content in some base64 encoding.
--
Petr Menšík
Senior Software Engineer, RHEL
Red Hat, https://www.redhat.com/
PGP: DFCF908DB7C87E8E529925BC4931CA5B6C9FC5CB