https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88545

--- Comment #11 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to Jonathan Wakely from comment #9)
> Patch posted: https://gcc.gnu.org/pipermail/gcc-patches/2024-June/653731.html
> 
> Rerunning benchmarks with this patch would be very welcome.

OK, I have tested the patches on AArch64 and also compared against our
vectorized cases.

I tested 5 needle positions (where the element it's searching for is found):

1
0
100
1000
10000
-1

where 0 is never find, and -1 means fine last:

Results are:

+--------+-----------+---------+---------+
| NEEDLE | scalar 1x | vect    | memchr  |
+--------+-----------+---------+---------+
| 1      | -11.67%   | -14.95% | -12.92% |
| 0      | 137.48%   | -82.31% | -83.36% |
| 100    | 3.75%     | 17.06%  | 8.02%   |
| 1000   | -10.34%   | -10.83% | 0.29%   |
| 10000  | -3.25%    | -4.97%  | -5.19%  |
| -1     | 10.28%    | -31.17% | -33.93% |
+--------+-----------+---------+---------+

So it looks like, staying with scalar the unrolling still has a positive
effect, but calling memchr has an effect on longer searches, shorter ones. 

the vector loop as currently vectorized has about a 10% unneeded overhead which
we will be working on this year.  But otherwise it's also a significant win for
longer searches.

So perhaps an idea is to use memchr for bytes, for everything else remove the
unrolled code and let the vectorizer take care of it, and if that fails let the
RTL or tree unroller do it?

for completeness, my benchmark was:

#include <algorithm>
#include <string>
#include <fstream>
#include <sstream>
#include <stdio.h>

#ifndef NEEDLE
#define NEEDLE 50
#endif

#ifndef ITERS
#define ITERS 1000
#endif

__attribute__((noipa))
bool
doIt (const char* s, char v, size_t len)
{
  const char* l = s + len;
  const char* r = std::find (s, l, v);
  return (r != l);
}

int main ()
{
  std::ifstream t("find.data");
  std::stringstream buffer;
  buffer << t.rdbuf();
  std::string content = buffer.str();
  if (NEEDLE > 0)
    content[NEEDLE-1] = '|';
  else if (NEEDLE < 0)
    content[content.length()-1] = '|';

  bool found = false;
  for (int i = 0; i < ITERS; i++)
     found = found | doIt (content.c_str (), '|', content.length ());

  return found;
}

Reply via email to