Collin Funk wrote:
> While looking at the regex code for glibc I had the feeling that it
> would be great to clean things up a bit. E.g., it has many overflow
> checks like this:
> 
>     /* Avoid overflow.  */
>     if (__glibc_unlikely (MIN (IDX_MAX, SIZE_MAX / sizeof (re_dfastate_t *))
>                           <= match_last))
>       return REG_ESPACE;
> 
>     sifted_states = re_malloc (re_dfastate_t *, match_last + 1);
> 
> That is because the code predates reallocarray. Many variables could
> also have their scope restricted, which I know you will agree with.
> 
> Maybe some small janitorial work would make it easier to implement the
> POSIX.1-2024 features, but perhaps I am being too optimistic.
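
For reference, the kind of cleanup meant here could look roughly like the
sketch below. It is only an illustration: state_t is a hypothetical stand-in
for re_dfastate_t, alloc_state_array is an invented helper name, and the
index is assumed to be a size_t (the real code uses Idx and also checks
against IDX_MAX, which reallocarray does not cover).

```c
#define _GNU_SOURCE  /* exposes reallocarray on older glibc releases */
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for re_dfastate_t; the real type is defined in
   glibc's regex internals.  Only pointers to it are used here.  */
typedef struct state state_t;

/* Hypothetical helper: allocate room for MATCH_LAST + 1 state pointers.
   Return NULL with errno set to ENOMEM on overflow or allocation failure.  */
static state_t **
alloc_state_array (size_t match_last)
{
  /* reallocarray checks the multiplication, but not the "+ 1",
     so that one case still needs an explicit test.  */
  if (match_last == SIZE_MAX)
    {
      errno = ENOMEM;
      return NULL;
    }

  /* Fails with ENOMEM if (match_last + 1) * sizeof (state_t *)
     would overflow size_t.  */
  return reallocarray (NULL, match_last + 1, sizeof (state_t *));
}
```

A caller would then map a NULL result to REG_ESPACE, as the existing code
does after its manual check.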

If you want to do janitorial work on the regex code, that is certainly one
way to get familiar with it. Whether it gives you enough understanding to
implement the "repetition modifier '?'" (leftmost shortest possible match),
I don't know.
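
To make the difference concrete, here is a small sketch that shows what the
existing POSIX interface does today, namely leftmost longest matching. The
helper name match_span is invented for this illustration; regcomp, regexec
and regfree are the standard <regex.h> calls.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: compile PATTERN as a POSIX extended regular
   expression, match it against TEXT, and store the offsets of the overall
   match in *SO and *EO.  Return 0 on a match, nonzero otherwise.  */
static int
match_span (const char *pattern, const char *text, int *so, int *eo)
{
  regex_t re;
  regmatch_t m[1];

  int rc = regcomp (&re, pattern, REG_EXTENDED);
  if (rc != 0)
    return rc;

  rc = regexec (&re, text, 1, m, 0);
  if (rc == 0)
    {
      *so = (int) m[0].rm_so;
      *eo = (int) m[0].rm_eo;
    }

  regfree (&re);
  return rc;
}
```

With the pattern "a+" and the text "xaaay", this reports the span from
offset 1 to offset 4, i.e. all three 'a' characters. As I understand the
POSIX.1-2024 wording, "a+?" would be expected to stop at offset 2 instead;
that is the part that is not implemented yet.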

My plans are to
  - provide a more extensive test suite,
  - add some benchmarks,
  - add the minrx matcher (https://github.com/mikehaertel/minrx) as an
    alternative to the 'regex' module,
  - compare the two implementations of 'regex' based on this test suite
    and on the benchmarks.

Bruno



