Re: [groff] Regularize (sub)section cross references.

G. Branden Robinson Mon, 17 Dec 2018 09:56:22 -0800

At 2018-12-18T04:42:36+1100, John Gardner wrote:
> > The biggest problem I know of is that the uppercasing transform of
> > German sharp S "ß" goes to "SS"
> 
> Pretty damn sure that's nothing compared to the Turkish dotless I
> <https://en.wikipedia.org/wiki/Dotted_and_dotless_I#In_computing>.
> 
> Then again, I'm sure they're used to seeing computers screw up the tittle
> by now... :-)


I'm aware of it.  :)  But I still regard it as a lesser problem because
at least it doesn't change the length of the string in glyphs or
codepoints.

(
Bytes?  In UTF-8, yup, it sure would:

U+0069 LATIN SMALL LETTER I
UTF-8: 69 UTF-16BE: 0069 Decimal: &#105; Octal: \0151
i (I)
Uppercase: 0049 [EXCEPT IN TURKISH -- GBR]
Category: Ll (Letter, Lowercase)
Unicode block: 0000..007F; Basic Latin
Bidi: L (Left-to-Right)

U+0130 LATIN CAPITAL LETTER I WITH DOT ABOVE
UTF-8: c4 b0 UTF-16BE: 0130 Decimal: &#304; Octal: \0460
İ (i)
Lowercase: 0069
Category: Lu (Letter, Uppercase)
Unicode block: 0100..017F; Latin Extended-A
Bidi: L (Left-to-Right)
Decomposition: 0049 0307
)
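
To make the length arithmetic above concrete, here is a minimal sketch in Python (a language chosen only for illustration; nothing in this thread uses it). The Turkish i-to-İ mapping is hard-coded as an assumption, because Python's str.upper() is not locale-aware and maps "i" to plain "I". The helper name lengths() is hypothetical.

```python
# Illustrative sketch: compare string length in code points and in UTF-8
# bytes before and after the two uppercasing transforms discussed above.

SMALL_I = "\u0069"        # LATIN SMALL LETTER I
CAPITAL_I_DOT = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE
SHARP_S = "\u00df"        # LATIN SMALL LETTER SHARP S


def lengths(s):
    """Return (number of code points, number of UTF-8 bytes) for s."""
    return len(s), len(s.encode("utf-8"))


# Turkish uppercasing of "i", hard-coded here (assumption: no locale support).
turkish_before = lengths(SMALL_I)
turkish_after = lengths(CAPITAL_I_DOT)

# Sharp S uppercasing, using Python's built-in full case mapping.
sharp_s_before = lengths(SHARP_S)
sharp_s_after = lengths(SHARP_S.upper())
```

Under these assumptions the Turkish case keeps one code point but grows from one byte to two, while the sharp S case grows from one code point to two.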

A lot of knowledge is embedded in tolower() and toupper() these days.
Back in the '70s and '80s they were just syntactic sugar for adding and
subtracting 32.
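
The "adding and subtracting 32" idea can be sketched as follows. This is an illustration of the arithmetic only, written in Python with hypothetical function names; it is not taken from the source of any particular C library, and the range checks are an assumption of this sketch.

```python
# ASCII-only case conversion: upper- and lowercase letters sit exactly
# 32 code positions apart ("A" is 65, "a" is 97).

def ascii_toupper(c):
    """Uppercase one ASCII letter by subtracting 32; return anything else unchanged."""
    if "a" <= c <= "z":
        return chr(ord(c) - 32)
    return c


def ascii_tolower(c):
    """Lowercase one ASCII letter by adding 32; return anything else unchanged."""
    if "A" <= c <= "Z":
        return chr(ord(c) + 32)
    return c
```

A conversion like this never changes the length of the string and knows nothing about locale, which is why it cannot handle either the sharp S or the Turkish dotted I.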

Life is more interesting now.

Regards,
Branden

