On Sat, 2016-03-05 at 23:46 +0100, Bernhard Reutner-Fischer wrote:

[...]

> diff --git a/gcc/fortran/misc.c b/gcc/fortran/misc.c
> index 405bae0..72ed311 100644
> --- a/gcc/fortran/misc.c
> +++ b/gcc/fortran/misc.c

[...]

> @@ -274,3 +275,41 @@ get_c_kind(const char *c_kind_name, CInteropKind_t kinds_table[])
>  
>    return ISOCBINDING_INVALID;
>  }
> +
> +
> +/* For a given name TYPO, determine the best candidate from CANDIDATES
> +   perusing Levenshtein distance.  Frees CANDIDATES before returning.  */
> +
> +const char *
> +gfc_closest_fuzzy_match (const char *typo, char **candidates)
> +{
> +  /* Determine closest match.  */
> +  const char *best = NULL;
> +  char **cand = candidates;
> +  edit_distance_t best_distance = MAX_EDIT_DISTANCE;
> +
> +  while (cand && *cand)
> +    {
> +      edit_distance_t dist = levenshtein_distance (typo, *cand);
> +      if (dist < best_distance)
> +        {
> +          best_distance = dist;
> +          best = *cand;
> +        }
> +      cand++;
> +    }
> +  /* If more than half of the letters were misspelled, the suggestion is
> +     likely to be meaningless.  */
> +  if (best)
> +    {
> +      unsigned int cutoff = MAX (strlen (typo), strlen (best)) / 2;
> +
> +      if (best_distance > cutoff)
> +        {
> +          XDELETEVEC (candidates);
> +          return NULL;
> +        }
> +      XDELETEVEC (candidates);
> +    }
> +  return best;
> +}

FWIW, there are two overloaded variants of levenshtein_distance in
gcc/spellcheck.h, the first of which takes a pair of strlen values;
your patch uses the second one:

  extern edit_distance_t
  levenshtein_distance (const char *s, int len_s,
                        const char *t, int len_t);

  extern edit_distance_t
  levenshtein_distance (const char *s, const char *t);

So one minor tweak you may want to consider here is to calculate
strlen (typo) once at the top of gfc_closest_fuzzy_match, and then
pass it in to the 4-arg variant of levenshtein_distance, which would
avoid recalculating strlen (typo) for every candidate.

I can't comment on the rest of the patch (I'm not a Fortran expert),
though it seems sane to me.

Hope this is constructive
Dave
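
P.S. For illustration, the strlen tweak suggested above might look roughly like the following. This is only a standalone sketch, not a tested patch: the edit_distance_t and MAX_EDIT_DISTANCE definitions and the 4-argument levenshtein_distance body are stand-ins written for this example (not the code from gcc/spellcheck.c), the function name closest_fuzzy_match_sketch is hypothetical, and freeing of CANDIDATES is left to the caller so the sketch builds outside the GCC tree.

```cpp
#include <algorithm>
#include <climits>
#include <cstring>
#include <vector>

/* Stand-ins for the GCC definitions (assumptions for this sketch).  */
typedef unsigned int edit_distance_t;
const edit_distance_t MAX_EDIT_DISTANCE = UINT_MAX;

/* Stand-in with the same signature as the 4-arg variant declared in
   gcc/spellcheck.h; a plain two-row dynamic-programming Levenshtein.  */
static edit_distance_t
levenshtein_distance (const char *s, int len_s, const char *t, int len_t)
{
  std::vector<edit_distance_t> prev (len_t + 1), cur (len_t + 1);
  for (int j = 0; j <= len_t; j++)
    prev[j] = j;
  for (int i = 1; i <= len_s; i++)
    {
      cur[0] = i;
      for (int j = 1; j <= len_t; j++)
        {
          /* Substitution costs 1 only if the characters differ.  */
          edit_distance_t subst = prev[j - 1] + (s[i - 1] != t[j - 1]);
          /* Deletion or insertion always costs 1.  */
          edit_distance_t indel = std::min (prev[j], cur[j - 1]) + 1;
          cur[j] = std::min (subst, indel);
        }
      prev.swap (cur);
    }
  return prev[len_t];
}

/* Hypothetical variant of gfc_closest_fuzzy_match: strlen (typo) is
   computed once, and the length of the best candidate is remembered
   so the cutoff check needs no further strlen calls.  Unlike the
   patch, this does not free CANDIDATES.  */
static const char *
closest_fuzzy_match_sketch (const char *typo, const char *const *candidates)
{
  const int typo_len = static_cast<int> (std::strlen (typo));
  const char *best = NULL;
  int best_len = 0;
  edit_distance_t best_distance = MAX_EDIT_DISTANCE;

  for (const char *const *cand = candidates; cand && *cand; cand++)
    {
      const int cand_len = static_cast<int> (std::strlen (*cand));
      edit_distance_t dist
        = levenshtein_distance (typo, typo_len, *cand, cand_len);
      if (dist < best_distance)
        {
          best_distance = dist;
          best = *cand;
          best_len = cand_len;
        }
    }

  /* Same cutoff as in the patch: reject the suggestion if more than
     half of the letters were misspelled.  */
  if (best)
    {
      unsigned int cutoff = std::max (typo_len, best_len) / 2;
      if (best_distance > cutoff)
        return NULL;
    }
  return best;
}
```

With a NULL-terminated list such as {"integer", "real", "logical", NULL}, the sketch should pick "integer" for the typo "intger" and return NULL for something unrelated like "xyzzy". In the real patch the only changes needed would be the typo_len local and the switch to the 4-arg call; the rest is scaffolding for the example.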