On 07/27/2010 06:28 PM, Pádraig Brady wrote:
> On 27/07/10 19:14, Paolo Bonzini wrote:
>> On 07/27/2010 06:06 PM, Pádraig Brady wrote:
>>> I would suggest a new function due to the
>>> way I see this function called most often.
>>> I.e., repeatedly with the same character.
>> Is this really a bottleneck? i.e., what does u8_uctomb_aux look like in
>> the profile when you do a million u8_strchr calls on an empty string?
> Well it would be a bit faster,
> but mainly a bit easier to use.
> I.e., one could do stuff like:
>   while ((f=u8_str_u8_chr (s, "–", 3)));
OK, that's a different use case that makes more sense. I thought you
were referring to something like

  char c[6];
  size_t size = u8_uctomb_aux (c, uc, sizeof c);
  ...
  while ((f=u8_str_u8_chr (s, c, size)));

This one instead is less likely to be useful.
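
For illustration only, here is a rough sketch in plain C (no libunistring)
of what a function taking a pre-encoded character might look like. The names
sketch_u8_str_u8_chr and sketch_count are made up for this example and are
not part of any library; the sketch assumes S is valid, NUL-terminated UTF-8
and that C holds exactly one complete encoded character.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the proposed u8_str_u8_chr: return a pointer
   to the first occurrence in the NUL-terminated UTF-8 string S of the
   character whose encoding is the SIZE bytes at C, or NULL if none.
   Assumes S is valid UTF-8, so a byte-wise match of a whole encoded
   character can only start on a character boundary.  */
const uint8_t *
sketch_u8_str_u8_chr (const uint8_t *s, const uint8_t *c, size_t size)
{
  assert (size >= 1);
  const uint8_t *end = s + strlen ((const char *) s);

  while ((size_t) (end - s) >= size)
    {
      /* Look for the lead byte only where a full match could still fit.  */
      const uint8_t *p = memchr (s, c[0], (size_t) (end - s) - size + 1);
      if (p == NULL)
        return NULL;
      if (memcmp (p, c, size) == 0)
        return p;
      s = p + 1;
    }
  return NULL;
}

/* Hypothetical helper: count occurrences of the single encoded character
   C (itself a NUL-terminated string) in S, using the loop shape from
   the discussion above.  */
size_t
sketch_count (const char *s, const char *c)
{
  size_t size = strlen (c);
  size_t n = 0;
  const uint8_t *f = (const uint8_t *) s;

  while ((f = sketch_u8_str_u8_chr (f, (const uint8_t *) c, size)) != NULL)
    {
      n++;
      f += size;   /* step past the match before searching again */
    }
  return n;
}
```

The caller passes the encoded bytes and their length directly, so no
per-call conversion from a code point is needed.
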
However, note that in C1X you could do

  while ((f=u8_strchr (s, u'–')));

BTW, there's an interesting difference between char32_t and ucs4_t, in
that the former has "the same size, signedness, and alignment as
uint_least32_t", while libunistring uses uint32_t to define the latter.
I wonder if libunistring should be changed to:
1) detect _Char32_t (or uchar.h and char32_t) and use it if available,
2) use uint_least32_t if not available.
It would be a no-op everywhere except possibly for some C++ programs,
and it wouldn't affect binary compatibility.
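
To make the type question concrete, here is a small C1X sketch. It assumes
a compiler and libc that provide <uchar.h>; sketch_ucs4_t and the two helper
functions are names invented for this example, not libunistring's, and a
real change would need a configure-time check instead of an unconditional
include.

```c
#include <assert.h>   /* static_assert */
#include <stdint.h>
#include <uchar.h>    /* C1X: char16_t, char32_t */

/* Hypothetical replacement definition: use char32_t where available.
   A fallback to uint_least32_t would go in the branch where <uchar.h>
   is missing (not shown here).  */
typedef char32_t sketch_ucs4_t;

/* In C, char32_t has the same size and signedness as uint_least32_t.  */
static_assert (sizeof (char32_t) == sizeof (uint_least32_t),
               "char32_t and uint_least32_t differ in size");
static_assert ((char32_t) -1 > 0, "char32_t is not unsigned");

/* Returns 1 if char32_t and uint_least32_t are the same type, which is
   what C1X specifies for C.  */
int
char32_is_uint_least32 (void)
{
  return _Generic ((char32_t) 0, uint_least32_t: 1, default: 0);
}

/* Returns 1 if uint32_t and uint_least32_t happen to be the same type on
   this platform.  This is common but not guaranteed: the two may be
   distinct types of identical size.  */
int
uint32_is_uint_least32 (void)
{
  return _Generic ((uint32_t) 0, uint_least32_t: 1, default: 0);
}
```

Where the second helper returns 0, the two types would still have the same
size and representation, so the difference would only show up in places
that care about type identity, such as C++ overloading.
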
Paolo