Re: [PATCH, fortran, v3] Use Levenshtein spelling suggestions in Fortran FE

David Malcolm Mon, 07 Mar 2016 06:58:42 -0800

On Sat, 2016-03-05 at 23:46 +0100, Bernhard Reutner-Fischer wrote:
[...]

> diff --git a/gcc/fortran/misc.c b/gcc/fortran/misc.c
> index 405bae0..72ed311 100644
> --- a/gcc/fortran/misc.c
> +++ b/gcc/fortran/misc.c
[...]


> @@ -274,3 +275,41 @@ get_c_kind(const char *c_kind_name, CInteropKind_t kinds_table[])
>  
>    return ISOCBINDING_INVALID;
>  }
> +
> +
> +/* For a given name TYPO, determine the best candidate from CANDIDATES
> +   perusing Levenshtein distance.  Frees CANDIDATES before returning.  */
> +
> +const char *
> +gfc_closest_fuzzy_match (const char *typo, char **candidates)
> +{
> +  /* Determine closest match.  */
> +  const char *best = NULL;
> +  char **cand = candidates;
> +  edit_distance_t best_distance = MAX_EDIT_DISTANCE;
> +
> +  while (cand && *cand)
> +    {
> +      edit_distance_t dist = levenshtein_distance (typo, *cand);
> +      if (dist < best_distance)
> +     {
> +        best_distance = dist;
> +        best = *cand;
> +     }
> +      cand++;
> +    }
> +  /* If more than half of the letters were misspelled, the suggestion is
> +     likely to be meaningless.  */
> +  if (best)
> +    {
> +      unsigned int cutoff = MAX (strlen (typo), strlen (best)) / 2;
> +
> +      if (best_distance > cutoff)
> +     {
> +       XDELETEVEC (candidates);
> +       return NULL;
> +     }
> +      XDELETEVEC (candidates);
> +    }
> +  return best;
> +}

FWIW, there are two overloaded variants of levenshtein_distance in
gcc/spellcheck.h, the first of which takes a pair of strlen values;
your patch uses the second one:

extern edit_distance_t
levenshtein_distance (const char *s, int len_s,
                      const char *t, int len_t);

extern edit_distance_t
levenshtein_distance (const char *s, const char *t);

So one minor tweak you may want to consider here is to calculate
  strlen (typo)
once at the top of gfc_closest_fuzzy_match, and then pass it in to the
4-arg variant of levenshtein_distance, which would avoid recalculating
strlen (typo) for every candidate.
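
A rough, self-contained sketch of that tweak follows. It is not the
patch itself: closest_match_sketch is a hypothetical name, the
edit_distance_t and MAX_EDIT_DISTANCE definitions and the body of
levenshtein_distance are stand-ins written so the example compiles
outside of GCC, and unlike the patch it does not free CANDIDATES.

```cpp
#include <algorithm>
#include <climits>
#include <cstring>
#include <vector>

// Stand-ins for the declarations in gcc/spellcheck.h (assumptions, so
// that this sketch builds on its own; not GCC's actual definitions).
typedef unsigned int edit_distance_t;
const edit_distance_t MAX_EDIT_DISTANCE = UINT_MAX;

// Illustrative 4-argument Levenshtein distance using the classic
// two-row dynamic programming scheme.
static edit_distance_t
levenshtein_distance (const char *s, int len_s, const char *t, int len_t)
{
  std::vector<edit_distance_t> prev (len_t + 1), cur (len_t + 1);
  for (int j = 0; j <= len_t; j++)
    prev[j] = j;
  for (int i = 1; i <= len_s; i++)
    {
      cur[0] = i;
      for (int j = 1; j <= len_t; j++)
        {
          edit_distance_t subst = prev[j - 1] + (s[i - 1] != t[j - 1]);
          cur[j] = std::min (subst, std::min (prev[j], cur[j - 1]) + 1);
        }
      prev.swap (cur);
    }
  return prev[len_t];
}

// Hypothetical variant of the loop in the patch: strlen (typo) is
// computed once, before the loop, and passed to the 4-arg overload.
static const char *
closest_match_sketch (const char *typo, const char *const *candidates)
{
  const char *best = NULL;
  edit_distance_t best_distance = MAX_EDIT_DISTANCE;
  const size_t typo_len = strlen (typo);
  size_t best_len = 0;

  for (const char *const *cand = candidates; cand && *cand; cand++)
    {
      const size_t cand_len = strlen (*cand);
      edit_distance_t dist
        = levenshtein_distance (typo, static_cast<int> (typo_len),
                                *cand, static_cast<int> (cand_len));
      if (dist < best_distance)
        {
          best_distance = dist;
          best = *cand;
          best_len = cand_len;
        }
    }

  // Same cutoff as in the patch: reject the suggestion if more than
  // half of the letters would have to change.
  if (best && best_distance > std::max (typo_len, best_len) / 2)
    return NULL;
  return best;
}
```

The length of each candidate still has to be computed once per
iteration, but the sketch keeps the best candidate's length around, so
the cutoff check needs no further strlen calls.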

I can't comment on the rest of the patch (I'm not a Fortran expert),
though it seems sane to me.

Hope this is constructive
Dave