Re: [PATCH v2 0/5] Speed up uNN_chr and uNN_strchr with Boyer-Moore algorithm

Paolo Bonzini Tue, 27 Jul 2010 09:39:24 -0700

On 07/27/2010 06:28 PM, Pádraig Brady wrote:

On 27/07/10 19:14, Paolo Bonzini wrote:

On 07/27/2010 06:06 PM, Pádraig Brady wrote:


I would suggest a new function due to the
way I see this function called most often.
I.E. repeatedly with the same character.


Is this really a bottleneck?  i.e., what does u8_uctomb_aux look like in
the profile when you do a million u8_strchr calls on an empty string?


Well it would be a bit faster,
but mainly a bit easier to use.
I.E. one could do stuff like:

   while ((f=u8_str_u8_chr (s, "–", 3)));

Ok, that's a different use case that makes more sense. I thought you referred to something like


  char c[6];
  size_t size = u8_uctomb_aux (c, uc, sizeof c);
  ...
  while ((f=u8_str_u8_chr (s, c, size)));

This one instead is less likely to be useful.
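
For illustration, the encode-once, search-many pattern can be sketched without libunistring at all. The names encode_utf8 and count_char below are hypothetical stand-ins (for u8_uctomb_aux and for a loop around the proposed u8_str_u8_chr), not part of any library, and the sketch assumes the input is valid NUL-terminated UTF-8:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for u8_uctomb_aux: encode uc as UTF-8 into buf
   (at least 4 bytes) and return the number of bytes written, or 0 if
   uc is a surrogate or lies outside the Unicode range.  */
static size_t
encode_utf8 (uint8_t *buf, uint_least32_t uc)
{
  if (uc < 0x80)
    {
      buf[0] = uc;
      return 1;
    }
  if (uc < 0x800)
    {
      buf[0] = 0xC0 | (uc >> 6);
      buf[1] = 0x80 | (uc & 0x3F);
      return 2;
    }
  if (uc < 0x10000)
    {
      if (uc >= 0xD800 && uc <= 0xDFFF)
        return 0;
      buf[0] = 0xE0 | (uc >> 12);
      buf[1] = 0x80 | ((uc >> 6) & 0x3F);
      buf[2] = 0x80 | (uc & 0x3F);
      return 3;
    }
  if (uc < 0x110000)
    {
      buf[0] = 0xF0 | (uc >> 18);
      buf[1] = 0x80 | ((uc >> 12) & 0x3F);
      buf[2] = 0x80 | ((uc >> 6) & 0x3F);
      buf[3] = 0x80 | (uc & 0x3F);
      return 4;
    }
  return 0;
}

/* Count occurrences of uc in the UTF-8 string s.  The character is
   encoded once, then every search step is a plain byte search.  */
static size_t
count_char (const char *s, uint_least32_t uc)
{
  uint8_t enc[5];
  size_t n, count = 0;

  assert (s != NULL);
  n = encode_utf8 (enc, uc);
  if (n == 0 || uc == 0)
    return 0;
  enc[n] = '\0';
  for (const char *f = s; (f = strstr (f, (const char *) enc)) != NULL; f += n)
    count++;
  return count;
}
```

A byte-level search is enough here because a complete UTF-8 sequence can only match at a character boundary in valid UTF-8 text. Note that U+2013 encodes to the three bytes E2 80 93, which is where the length 3 in the earlier example comes from.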

However, note that in C1X you could do

  while ((f=u8_strchr (s, u'–')));

BTW, there's an interesting difference between char32_t and ucs4_t, in that the former has "the same size, signedness, and alignment as uint_least32_t", while libunistring uses uint32_t to define the latter. I wonder if libunistring should be changed to:


1) detect _Char32_t (or uchar.h and char32_t) and use it if available,

2) use uint_least32_t if not available.

It would be a no-op everywhere except possibly for some C++ programs, and it wouldn't affect binary compatibility.
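
A rough sketch of what the two steps might look like, not libunistring's actual header. HAVE_UCHAR_H is a hypothetical configure-time macro and ucs4_sketch_t is a placeholder name chosen to avoid clashing with the real ucs4_t; the static_assert checks assume a C11 compiler:

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* CHAR_BIT */
#include <stdint.h>

#ifdef HAVE_UCHAR_H   /* hypothetical macro, set by a configure check */
# include <uchar.h>
/* 1) char32_t is available: use it directly.  */
typedef char32_t ucs4_sketch_t;
#else
/* 2) Fall back to the type char32_t is specified to match in C.  */
typedef uint_least32_t ucs4_sketch_t;
#endif

/* Either way the type must be unsigned and at least 32 bits wide.  */
static_assert (sizeof (ucs4_sketch_t) * CHAR_BIT >= 32,
               "ucs4_sketch_t must hold at least 32 bits");
static_assert ((ucs4_sketch_t) -1 > 0,
               "ucs4_sketch_t must be unsigned");
```

In C both branches name the same underlying type, so the choice is invisible to callers; in C++ char32_t is a distinct type, which is the only place overload resolution or name mangling could notice the difference.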


Paolo
