No, I disagree.
On 19/11/2025 03:47, John Levine wrote:
It appears that Petr Menšík <[email protected]> said:
If some application desperately wants UTF-8 in DNS RDATA, TXT records
are not in my view the best vehicle for that.
The spec has always been clear that TXT records are strings of
arbitrary 8-bit data. If you want to put a particular interpretation
on some TXT records, pick an underscore _prefix and write a spec that
says what the format of the records is. See this registry for a dozen
examples:
https://www.iana.org/assignments/dns-parameters/dns-parameters.xhtml#underscored-globally-scoped-dns-node-names
RFC 1035 says this about TXT records:
TXT RRs are used to hold descriptive text. The semantics of the text
depends on the domain where it is found.
Now, it does not specify anything about escaping the text when any byte
with a value >= 127 is used. Yes, it is 8-bit data. But records designed to
hold generic binary data use base64 in their presentation format. TXT does
not, by design. Your assumption that UTF-8 code points are not letters
just because they have higher byte values is wrong, in my opinion.
ASCII-only text is always valid UTF-8. No change is needed to process a
UTF-8 encoded zone file. When you escape the text, it becomes unreadable
by humans.
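To make the difference concrete, here is a minimal Python sketch of the \DDD decimal escaping that master files allow. The function name escape_txt is hypothetical, and I am assuming a tool that escapes everything outside printable ASCII plus the quote and the backslash; real servers differ in exactly which bytes they choose to escape.

```python
def escape_txt(data: bytes) -> str:
    """Render one TXT character-string the way an escaping tool might."""
    out = []
    for b in data:
        if b in (0x22, 0x5C):        # '"' and '\' need a backslash inside quotes
            out.append("\\" + chr(b))
        elif 0x20 <= b <= 0x7E:      # printable ASCII is emitted unchanged
            out.append(chr(b))
        else:                        # any other byte becomes \DDD (decimal)
            out.append("\\%03d" % b)
    return '"' + "".join(out) + '"'


readable = "Zkouška".encode("utf-8")
print(escape_txt(readable))          # the two bytes of 'š' turn into \197\161
print(escape_txt(b"plain ascii"))    # ASCII-only text is left readable
```

The ASCII string comes out untouched, while the single letter 'š' is replaced by two decimal escapes that no human will recognise as a letter.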
consumes them in wire format, it will not matter or change anything. Do
you know application, which consumes binary data from TXT record from
their presentation format?
Every authoritative DNS server does that when it reads a master file.
The binary stuff is represented with decimal escapes, but so what,
it's mechanically generated and mechanically consumed.
No, zone files are often maintained by people as plain text files.
I do not see a reason why TXT records should be presented in escaped form.
These letters are not _binary_; they are letters that merely happen to be
encoded in higher-value bytes. They are still letters. Escaping can always
be used on systems unable to cope with raw UTF-8. Unless you do some
ISO-8859-1 to UTF-8 conversion or the reverse, it won't break. If you do
that, please stop at once.
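A short Python sketch of the kind of breakage I mean, assuming a tool that wrongly treats UTF-8 bytes as ISO-8859-1 and then re-encodes them. The variable names are only illustrative.

```python
original = "š".encode("utf-8")             # two bytes: 0xC5 0xA1
misread = original.decode("iso-8859-1")     # wrongly read as two Latin-1 characters
damaged = misread.encode("utf-8")           # re-encoded: now four bytes
untouched = bytes(original)                 # passing bytes through changes nothing

print(original, damaged, untouched == original)
```

Only the transcoding step damages the data. A tool that passes the bytes through unchanged keeps the record intact.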
How often do you store records in the unknown record format? It can
express anything, but it is not simple to work with. Can you guess what
is written in an unknown-format record? That is why we use the normal
presentation form whenever possible. I ask that we do the same for TXT
records.
The current approach is selective. It does not use base64 or a similar
encoding for normal ASCII letters, yet it prevents using Unicode text in a
useful form.
That's how master files have been for 40 years. They're not going to change
now.
R's,
John
I am not trying to change the master file format. I want it to be
consumable in its raw 8-bit UTF-8 form. Come on, this is not SMTP, where
8-bit usage causes problems. Both ldns-read-zone and named-compilezone
understand the raw 8-bit form. They do not need any change. They convert
readable UTF-8 text into unreadable escaped text.
They can process it. I want them to stop escaping unless there is a very
specific reason to do so. They know what is a space and what is not. It is
okay to escape quotes or similar data not permitted inside records. As long
as a tool can tell it is still inside record data, it can use the binary
input directly.
Can you please show me where escaping of data in TXT records is specified
as mandatory? I did not find it anywhere. It seems to be just a custom,
and not a necessary one.
TXT "Zkouška"
TXT "testíček"
This is read correctly by the tools. I do not demand that it be used this
way. Escaping is always possible, but it is not needed.
named-checkzone won't report any issue. It reads it correctly as binary
input. I agree, it is binary safe.
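As a rough illustration that both spellings describe the same record data, here is a small Python sketch that turns the inside of a quoted TXT string into bytes. The helper unescape_txt is hypothetical and simplified: it handles only \DDD and \X escapes within a single quoted string, and it assumes the zone file itself is stored as UTF-8.

```python
def unescape_txt(text: str) -> bytes:
    """Convert the contents of one quoted TXT string into its raw bytes."""
    out = bytearray()
    i = 0
    while i < len(text):
        ch = text[i]
        ddd = text[i + 1:i + 4]
        if ch == "\\" and len(ddd) == 3 and all(c in "0123456789" for c in ddd):
            value = int(ddd)
            if value > 255:
                raise ValueError("\\DDD escape out of range: " + ddd)
            out.append(value)                    # \DDD stands for one byte
            i += 4
        elif ch == "\\" and i + 1 < len(text):
            out += text[i + 1].encode("utf-8")   # \X stands for X itself
            i += 2
        else:
            out += ch.encode("utf-8")            # raw text is kept as UTF-8 bytes
            i += 1
    return bytes(out)


raw = unescape_txt("testíček")
escaped = unescape_txt("test\\195\\173\\196\\141ek")
print(raw == escaped)                            # both forms give the same bytes
```

The readable form and the escaped form end up as identical bytes on the wire, so the choice between them only affects the person reading the file.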
This is the year 2025. I think we can stop pretending that everything
non-ASCII is unprintable as it is. Stop pretending that only English
letters belong in DNS for some reason.
I would like everyone commenting on this to state what their first
language is. Did you grow up in a world where ASCII can represent every
name or word? Then you might not understand why this is important to people
from different backgrounds.
Regards,
Petr
--
Petr Menšík
Software Engineer, RHEL
Red Hat, https://www.redhat.com/
PGP: DFCF908DB7C87E8E529925BC4931CA5B6C9FC5CB
_______________________________________________
DNSOP mailing list -- [email protected]