On Sat, Jan 10, 2015 at 11:19:20PM +0100, Dmitrij D. Czarkoff wrote:
> FRIGN said:
> > On Sat, 10 Jan 2015 02:52:09 +0100
> > "Dmitrij D. Czarkoff" wrote:
> >
> > > > +#define UPPER "A-Z"
> > > > +#define LOWER "a-z"
> > > > +#define PUNCT "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~"
> > >
> > > These d
On Sat, Jan 10, 2015, at 19:11, Ian D. Scott wrote:
> On Sat, Jan 10, 2015 at 06:56:45PM -0500, random...@fastmail.us wrote:
> Actually, ẞ, capital of ß, was added in Unicode 5.1. There are probably
> other letters with this issue, however.
My main point was that you've got to be careful that
On Sat, Jan 10, 2015 at 06:56:45PM -0500, random...@fastmail.us wrote:
> On Sat, Jan 10, 2015, at 16:47, Markus Wichmann wrote:
> > You wanted to be Unicode compatible, right? Because in that case I
> > expect [:alpha:] to be the class of all characters in General Category L
> > (that is, Lu, Ll, L
On Sat, Jan 10, 2015, at 16:47, Markus Wichmann wrote:
> You wanted to be Unicode compatible, right? Because in that case I
> expect [:alpha:] to be the class of all characters in General Category L
> (that is, Lu, Ll, Lt, Lm, or Lo). That includes a few more characters
> than just A-Z and a-z. And
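
To make the objection above concrete, here is a minimal sketch in C of what the quoted "A-Z" and "a-z" ranges accept when applied to code points. The function names are hypothetical and not taken from the patch. The sample code points (Ö is U+00D6, ß is U+00DF, ẞ is U+1E9E) are all in General Category L, yet none falls inside either range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers: the "A-Z" and "a-z" ranges from the quoted
 * #defines, applied to a Unicode code point instead of a byte. */
bool range_upper(uint32_t cp)
{
	assert(cp <= 0x10FFFF); /* outside the Unicode code space otherwise */
	return cp >= 'A' && cp <= 'Z';
}

bool range_lower(uint32_t cp)
{
	assert(cp <= 0x10FFFF);
	return cp >= 'a' && cp <= 'z';
}

/* What an [:alpha:] built only from those two ranges would accept. */
bool range_alpha(uint32_t cp)
{
	return range_upper(cp) || range_lower(cp);
}
```

With these definitions, range_alpha(0x00D6), range_alpha(0x00DF) and range_alpha(0x1E9E) all return false, while range_alpha('Q') returns true. A class meant to cover General Category L would need a table derived from the Unicode Character Database rather than two ASCII ranges.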
On Fri, Jan 9, 2015, at 18:39, FRIGN wrote:
> C3B6 is 'ö' and makes sense to allow specifying it as \50102 (in the pure
> UTF-8-sense of course, nothing to do with collating).
Why would someone want to use the decimal value of the UTF-8 bytes,
rather than the Unicode codepoint?
Why are you using
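
For reference, the two numbers being contrasted here can be sketched in C as follows. This is an illustration only, with hypothetical helper names, and not code from the patch under discussion. It encodes a code point as UTF-8 and then reads the resulting bytes as one big-endian integer. The encoder does not reject surrogates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one code point as UTF-8 into out[4]
 * and return the number of bytes written. Surrogates are not rejected. */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
	assert(cp <= 0x10FFFF);
	if (cp < 0x80) {
		out[0] = (unsigned char)cp;
		return 1;
	}
	if (cp < 0x800) {
		out[0] = (unsigned char)(0xC0 | (cp >> 6));
		out[1] = (unsigned char)(0x80 | (cp & 0x3F));
		return 2;
	}
	if (cp < 0x10000) {
		out[0] = (unsigned char)(0xE0 | (cp >> 12));
		out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
		out[2] = (unsigned char)(0x80 | (cp & 0x3F));
		return 3;
	}
	out[0] = (unsigned char)(0xF0 | (cp >> 18));
	out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
	out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
	out[3] = (unsigned char)(0x80 | (cp & 0x3F));
	return 4;
}

/* Hypothetical helper: the UTF-8 bytes of cp read as a single
 * big-endian integer, which is the value the \50102 notation uses. */
uint32_t utf8_bytes_as_int(uint32_t cp)
{
	unsigned char buf[4];
	size_t n = utf8_encode(cp, buf);
	uint32_t v = 0;

	for (size_t i = 0; i < n; i++)
		v = (v << 8) | buf[i];
	return v;
}
```

For 'ö' the code point is U+00F6, which is 246 in decimal. utf8_bytes_as_int(0xF6) yields 0xC3B6, which is 50102 in decimal. The two notations agree only for ASCII, where the single UTF-8 byte equals the code point.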
FRIGN said:
> On Sat, 10 Jan 2015 02:52:09 +0100
> "Dmitrij D. Czarkoff" wrote:
>
> > > +#define UPPER "A-Z"
> > > +#define LOWER "a-z"
> > > +#define PUNCT "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~"
> >
> > These definitions hugely misrepresent corresponding character classes.
>
> I interpreted the
On Sat, Jan 10, 2015 at 08:51:03PM +0100, FRIGN wrote:
> On Sat, 10 Jan 2015 02:52:09 +0100
> "Dmitrij D. Czarkoff" wrote:
>
> > > +#define UPPER "A-Z"
> > > +#define LOWER "a-z"
> > > +#define PUNCT "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~"
> >
> > These definitions hugely misrepresent corresponding
On Sat, 10 Jan 2015 02:52:09 +0100
"Dmitrij D. Czarkoff" wrote:
> > +#define UPPER "A-Z"
> > +#define LOWER "a-z"
> > +#define PUNCT "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~"
>
> These definitions hugely misrepresent corresponding character classes.
I interpreted the character classes by default for