tetsu...@scope-eye.net wrote:
This is also why I think this should be an optional "encoding marker"
---
Why? If the script were in the current (ASCII) encoding, it wouldn't
have any high bits set. If it does have high bits set, it's fairly
simple to either presume or validate the script as UTF-8, since
UTF-8 is self-synchronizing.
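As a sketch of that validation idea (assuming iconv(1) is available, as it is on most systems), checking whether a byte stream is well-formed UTF-8 is nearly a one-liner:

```shell
# A byte sequence either decodes cleanly as UTF-8 or it doesn't;
# iconv exits non-zero on malformed input, so it can act as a validator.

# 0xC3 0xA9 (octal \303\251) is a complete 2-byte sequence (U+00E9, "é") -- valid.
printf '\303\251' | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1 && echo "valid UTF-8"

# A lone 0xC3 is a truncated lead byte -- invalid.
printf '\303' | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1 || echo "not UTF-8"
```

Because every continuation byte in UTF-8 is distinguishable from a lead byte, a decoder can also resynchronize after an error rather than misreading the rest of the file.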
Please note, the RFE was for UTF-8. That's one encoding.
I'm not proposing, suggesting, or asking for general "locale"
support for arbitrary encodings in bash. That would be
entirely chaotic, a "Tower of Babel" incident of its
own. This suggestion is to allow one extra, global character
encoding: UTF-8.
Regarding what Greg said:
From: "Greg Wooledge" <wool...@eeg.ccf.org>
The main issue here is that the author's locale may not *exist* on the
user's machine. There may not be any way for bash to determine which
non-ASCII characters constitute "letters" in the author's locale.
---
If we use UTF-8, that won't be an issue. It's an international
encoding that can represent all languages and locales. If a script
contains bytes with the high bit set, it can reasonably be assumed
to be UTF-8.
Relatedly, the file(1) program can already tell
you what encoding a file is in.
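For example (a sketch; the exact output string can vary with the file(1) version and magic database), its --mime-encoding option reports the detected encoding directly:

```shell
# Write a small script containing a multibyte UTF-8 character,
# then ask file(1) what encoding it detects.
tmp=$(mktemp)
printf 'echo h\303\251llo\n' > "$tmp"   # "héllo" encoded as UTF-8
file -b --mime-encoding "$tmp"          # typically prints: utf-8
rm -f "$tmp"
```

A pure-ASCII file would be reported as us-ascii instead, which matches the point above: high bits set is what distinguishes a UTF-8 script from a plain ASCII one.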