Question for the experts. Let's take the following example:
----->8------------->8--------------------
#include <stdio.h>
#include <string.h>
#include <wchar.h>
#define period 0x2e
#define question 0x3f
#define exclam 0x21
#define ellipsis L'\u2026'
const wchar_t p[] = { period, question, exclam, ellipsis };
int
main()
{
	const wchar_t s[] = L". Hello.";

	printf("%ls\n", s);
	printf("%lu\n", wcsspn(s, p));
	return 0;
}
-------------8<-----------8<----------------
Now compile and run it, first with clang (cc), then with GCC (egcc):
$ cc -Wall example.c -o example && ./example
. Hello.
8
$ egcc -Wall example.c -o example && ./example
. Hello.
1
As you can see, the program built with GCC does what is expected. To get
the desired result with clang, the set has to be written as a string literal.
Change the declaration of p[] above to:
const wchar_t p[] = L".?!…";
                         ^ This is a UTF-8 ellipsis.
And now:
$ cc -Wall example.c -o example && ./example
. Hello.
1
An array containing only ASCII characters, or only non-ASCII ones, also works.
Is this a bug in clang's wcsspn(), or am I wrong in assuming that the
array can be declared the way I did?
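
One thing that may be relevant here: wcsspn() expects its second argument
to be a null-terminated wide string, and p[] as declared in the example has
no terminating L'\0', whereas a string literal gets one automatically. If
that is the cause, the result would depend on whatever happens to follow the
array in memory. A minimal sketch under that assumption, with the terminator
added explicitly (the names punct and leading_punct are made up for
illustration, and 0x2026 assumes wchar_t holds Unicode code points):

```c
#include <stddef.h>
#include <wchar.h>

/*
 * Same four characters as p[] in the example, plus an explicit
 * terminator. 0x2026 is the code point of the ellipsis; this assumes
 * wchar_t stores Unicode code points on the target platform.
 */
static const wchar_t punct[] = { 0x2e, 0x3f, 0x21, 0x2026, L'\0' };

/*
 * Hypothetical helper: length of the leading run of s made up only of
 * characters found in punct[].
 */
size_t
leading_punct(const wchar_t *s)
{
	return wcsspn(s, punct);
}
```

With this, leading_punct(L". Hello.") should return 1 regardless of the
compiler, since the scan of the accept set stops at the terminator.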
--
Walter