> On 25 Nov 2025, at 09:48, Petr Menšík <[email protected]>
> wrote:
>
> On 19/11/2025 18:04, Michael Sweet wrote:
>> (Entering this discussion a little late, I only just became aware of this
>> thread...)
>>
>> I've been peripherally involved with mDNS and DNS-SD for a long time, and more
>> specifically with how DNS-SD and TXT records are used for printing.
>>
>> For printers, values are always Network Unicode (RFC 5198), which is UTF-8
>> NFC without most control characters. This is consistent with IPP (STD 92)
>> and with how we've mapped the old Printer MIB v2 and Host MIB values, which
>> use a separate MIB property to specify the character set, to IPP and other
>> print protocols/encodings, which exclusively use Network Unicode.
>>
>> From a software development perspective, I suspect you can reliably detect
>> when a TXT record string contains valid UTF-8 and show the contents as text
>> or hex data otherwise (or maybe have a toggle?). Not sure what Wireshark
>> does for TXT records, but that might be a place to look for inspiration...
>>
>> ________________________
>> Michael Sweet
>>
> Ah, I did not know there was something like Network Unicode. Sure, that should
> be enough.
> I tested Wireshark and it is not the best inspiration. It neither escapes the
> data nor displays it properly on my system, which uses a UTF-8 locale. I am
> not sure which encoding it tries to display, but the result is even worse
> than escaping: information is lost, and a question mark in a box appears.
> latin2: zku�ebn� z�znam
> zku��ebn�� z��znam
> I also tried tcpdump.
> pihhan.info. TXT "zkuM-EM-!ebnM-CM-- zM-CM-!znam",
> pihhan.info. TXT "latin2: zkuM-9ebnM-m zM-aznam"
> An even more surprising result. I am not sure where those values came from.
> It misinterprets the contents as something different, but prints some ASCII
> sequence instead. Seems like a bug.
> It seems like a good example of why some recommendation on how to interpret
> TXT records would be desirable.
> Cheers, Petr
Wireshark says it can’t display the octet, which is fine. You have
the hex dump to look at.

tcpdump has its own convention for printing non-printable ASCII, which
it uses everywhere it tries to emit text values: "M-" if the high
bit is set, followed by the lower 7 bits. If the lower 7 bits are a
control value, it prints "^" followed by the value XORed with 0x40.
It isn’t quite lossless, as it escapes neither "M-" nor "^", but it is
sufficient to not be a security risk.

void
fn_print_char(netdissect_options *ndo, u_char c)
{
    if (!ND_ISASCII(c)) {
        c = ND_TOASCII(c);
        ND_PRINT("M-");
    }
    if (!ND_ISPRINT(c)) {
        c ^= 0x40; /* DEL to ?, others to alpha */
        ND_PRINT("^");
    }
    ND_PRINT("%c", c);
}

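
To see where the strings in the tcpdump output come from, the same convention can be reproduced outside tcpdump. The following is only a sketch: escape_octets is a hypothetical helper that writes into a caller-supplied buffer, and the explicit range tests stand in for the ND_ISASCII, ND_TOASCII and ND_ISPRINT macros on the assumption that those test the high bit and the 0x20..0x7e range.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of tcpdump: render len octets from src
 * into dst using the "M-" / "^" convention. Output is truncated at a
 * whole-octet boundary if dst is too small. Returns the number of
 * characters written, not counting the terminating NUL.
 */
size_t
escape_octets(const unsigned char *src, size_t len, char *dst, size_t dstsize)
{
    size_t out = 0;

    if (dstsize == 0)
        return 0;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = src[i];
        char tmp[4];            /* at most 'M', '-', '^' and one character */
        size_t n = 0;

        if (c & 0x80) {         /* high bit set: prefix "M-", keep low 7 bits */
            c &= 0x7f;
            tmp[n++] = 'M';
            tmp[n++] = '-';
        }
        if (c < 0x20 || c == 0x7f) {    /* control character or DEL */
            c ^= 0x40;          /* DEL becomes '?', others '@'..'_' */
            tmp[n++] = '^';
        }
        tmp[n++] = (char)c;

        if (out + n >= dstsize) /* keep room for the terminating NUL */
            break;
        for (size_t j = 0; j < n; j++)
            dst[out++] = tmp[j];
    }
    dst[out] = '\0';
    return out;
}
```

Fed the UTF-8 octets c5 a1 (the character "š"), this yields "M-EM-!", because 0xc5 & 0x7f is 'E' and 0xa1 & 0x7f is '!'. The ISO 8859-2 octet b9 yields "M-9". That accounts for both strings in the tcpdump output quoted above. It also shows the ambiguity: the three literal ASCII characters "M-E" produce the same output as the single octet 0xc5.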
Both applications have made sensible decisions.
Dig uses ‘.’ when printing out unknown EDNS options when the value
is not printable ASCII.
We already have instructions for how to display TXT records. You
happen to not like them but they exist.
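
The message does not name the instructions it has in mind. Assuming they are the zone file escapes of RFC 1035 section 5.1, where "\DDD" is an octet written as three decimal digits and "\X" quotes a special character, a lossless text form could be sketched as below. txt_to_presentation is a hypothetical name, not a BIND or dig function, and the choice to quote only the double quote and the backslash is an assumption of this sketch.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write one TXT character-string in an RFC 1035
 * style text form. Printable ASCII is copied, '"' and '\' are quoted
 * with a backslash, and every other octet becomes \DDD (decimal).
 * Output is truncated at a whole-octet boundary if dst is too small.
 * Returns the number of characters written, not counting the NUL.
 */
size_t
txt_to_presentation(const unsigned char *src, size_t len,
                    char *dst, size_t dstsize)
{
    size_t out = 0;

    if (dstsize == 0)
        return 0;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = src[i];
        char tmp[5];            /* longest form is "\DDD" plus NUL */
        int n;

        if (c == '"' || c == '\\')
            n = snprintf(tmp, sizeof tmp, "\\%c", c);
        else if (c >= 0x20 && c <= 0x7e)
            n = snprintf(tmp, sizeof tmp, "%c", c);
        else
            n = snprintf(tmp, sizeof tmp, "\\%03u", (unsigned)c);

        if (out + (size_t)n >= dstsize) /* keep room for the NUL */
            break;
        memcpy(dst + out, tmp, (size_t)n);
        out += (size_t)n;
    }
    dst[out] = '\0';
    return out;
}
```

With this form the UTF-8 octets c5 a1 come out as "\197\161". Because the backslash itself is quoted, the original octets can always be recovered, which the M- and ^ convention cannot guarantee.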
> --
> Petr Menšík
> Senior Software Engineer, RHEL
> Red Hat, https://www.redhat.com/
> PGP: DFCF908DB7C87E8E529925BC4931CA5B6C9FC5CB
> _______________________________________________
> DNSOP mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
--
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742 INTERNET: [email protected]