On 07/27/2010 06:28 PM, Pádraig Brady wrote:
> On 27/07/10 19:14, Paolo Bonzini wrote:
>> On 07/27/2010 06:06 PM, Pádraig Brady wrote:
>>>
>>> I would suggest a new function due to the
>>> way I see this function called most often.
>>> I.E. repeatedly with the same character.
>>
>> Is this really a bottleneck?  i.e., what does u8_uctomb_aux look like in
>> the profile when you do a million u8_strchr calls on an empty string?
>
> Well it would be a bit faster,
> but mainly a bit easier to use.
> I.E. one could do stuff like:
>
>    while ((f=u8_str_u8_chr (s, "–", 3)));

OK, that's a different use case, and it makes more sense. I thought you were referring to something like:

  char c[6];
  size_t size = u8_uctomb_aux (c, uc, sizeof c);
  ...
  while ((f=u8_str_u8_chr (s, c, size)));

That pattern, by contrast, is less likely to be useful.
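To make the two call patterns concrete, here is a minimal, self-contained sketch in plain C. It does not use libunistring: sketch_uctomb, sketch_u8_str_u8_chr and sketch_count are hypothetical names invented for illustration, and the real u8_str_u8_chr was only a proposal in this thread. The point is only that the character is encoded once and the search loop then reuses the encoded bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not libunistring API: encode one code point as
   UTF-8 into buf (room for 4 bytes).  Returns the number of bytes
   written, or 0 if uc is a surrogate or above U+10FFFF.  */
static size_t
sketch_uctomb (uint8_t *buf, uint32_t uc)
{
  if (uc < 0x80)
    {
      buf[0] = (uint8_t) uc;
      return 1;
    }
  if (uc < 0x800)
    {
      buf[0] = (uint8_t) (0xC0 | (uc >> 6));
      buf[1] = (uint8_t) (0x80 | (uc & 0x3F));
      return 2;
    }
  if (uc < 0x10000)
    {
      if (uc >= 0xD800 && uc <= 0xDFFF)
        return 0;
      buf[0] = (uint8_t) (0xE0 | (uc >> 12));
      buf[1] = (uint8_t) (0x80 | ((uc >> 6) & 0x3F));
      buf[2] = (uint8_t) (0x80 | (uc & 0x3F));
      return 3;
    }
  if (uc < 0x110000)
    {
      buf[0] = (uint8_t) (0xF0 | (uc >> 18));
      buf[1] = (uint8_t) (0x80 | ((uc >> 12) & 0x3F));
      buf[2] = (uint8_t) (0x80 | ((uc >> 6) & 0x3F));
      buf[3] = (uint8_t) (0x80 | (uc & 0x3F));
      return 4;
    }
  return 0;
}

/* Hypothetical stand-in for the proposed u8_str_u8_chr: return a pointer
   to the first occurrence of the already-encoded character c (size
   bytes) in the NUL-terminated UTF-8 string s, or NULL.  */
static const uint8_t *
sketch_u8_str_u8_chr (const uint8_t *s, const uint8_t *c, size_t size)
{
  assert (size > 0);
  for (; *s != 0; s++)
    {
      size_t i = 0;
      /* A mismatch at the terminating NUL stops the comparison, so this
         never reads past the end of s.  */
      while (i < size && s[i] == c[i])
        i++;
      if (i == size)
        return s;
    }
  return NULL;
}

/* Count occurrences of uc in s, encoding uc only once.  */
static size_t
sketch_count (const uint8_t *s, uint32_t uc)
{
  uint8_t c[4];
  size_t size = sketch_uctomb (c, uc);
  size_t n = 0;

  if (size == 0)
    return 0;
  for (const uint8_t *f = s;
       (f = sketch_u8_str_u8_chr (f, c, size)) != NULL;
       f += size)
    n++;
  return n;
}
```

A plain byte comparison is enough here because a complete UTF-8 sequence cannot match in the middle of another character in a well-formed string.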

However, note that in C1X you could do

  while ((f=u8_strchr (s, u'–')));
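A small sketch of what the literal forms give you, assuming a C11 compiler. As far as I know, u'...' has type char16_t and U'...' has type char32_t; for a BMP character such as U+2013 both convert to the same 32-bit value when passed to a ucs4_t parameter. sketch_take_ucs4 is a hypothetical stand-in for such a parameter, not a libunistring function, and the \u escapes are used only so that the example does not depend on the source file encoding.

```c
#include <assert.h>
#include <stdint.h>

/* In C11 <uchar.h>, char16_t and char32_t are typedefs for
   uint_least16_t and uint_least32_t, so the literals have these sizes.  */
static_assert (sizeof u'\u2013' == sizeof (uint_least16_t),
               "u'...' literal is char16_t-sized");
static_assert (sizeof U'\u2013' == sizeof (uint_least32_t),
               "U'...' literal is char32_t-sized");

/* Hypothetical stand-in for a function with a ucs4_t parameter, such as
   the second argument of u8_strchr: it returns what it received.  */
static uint32_t
sketch_take_ucs4 (uint32_t uc)
{
  return uc;
}
```

For characters outside the BMP only the U'...' form can hold the code point in a single literal.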

BTW, there's an interesting difference between char32_t and ucs4_t, in that the former has "the same size, signedness, and alignment as uint_least32_t", while libunistring uses uint32_t to define the latter. I wonder if libunistring should be changed to:

1) detect _Char32_t (or uchar.h and char32_t) and use it if available,

2) use uint_least32_t if not available.

It would be a no-op everywhere except possibly for some C++ programs, and it wouldn't affect binary compatibility.
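Roughly what I have in mind, as a sketch only. SKETCH_HAVE_UCHAR_H stands for whatever macro a configure check would define, and sketch_ucs4_t stands for ucs4_t; both names are made up here. In C the two branches name the same underlying type, whereas in C++ char32_t is a distinct type, which is where overload resolution could differ.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* SKETCH_HAVE_UCHAR_H is a hypothetical macro that a configure-time
   check for <uchar.h> and char32_t would define.  */
#ifdef SKETCH_HAVE_UCHAR_H
# include <uchar.h>
typedef char32_t sketch_ucs4_t;
#else
typedef uint_least32_t sketch_ucs4_t;
#endif

/* Whichever branch is taken, the type has the size and signedness of
   uint_least32_t and can hold every code point up to U+10FFFF.  */
static_assert (sizeof (sketch_ucs4_t) == sizeof (uint_least32_t),
               "same size as uint_least32_t");
static_assert ((sketch_ucs4_t) -1 > 0, "unsigned");
static_assert (sizeof (sketch_ucs4_t) * CHAR_BIT >= 32,
               "at least 32 bits wide");
```

On any platform that provides uint32_t at all, uint_least32_t has the same size, which is why the object layout would not change.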

Paolo
