Re: msgmerge speedup: fstrcmp and diffseq improvements

Bruno Haible Mon, 22 Sep 2008 03:35:23 -0700

Hello Ralf,

Ralf Wildenhues wrote:
> BTW, I now found a newer paper that has some more interesting stuff, if
> a bit overkill for msgmerge: <http://fastss.csg.uzh.ch/ifi-2007.02.pdf>.
> Also, there is this good overview of algorithms:
> <ftp://ftp.dcc.uchile.cl/pub/users/gnavarro/survasm.ps.gz>.


Thanks for these pointers. The first one explains how Google's "did you mean"
feature works, but is hardly applicable here, because for a string of length
100, msgmerge is looking for similar strings with possible 30 differences
(not just 2 or 3 differences). The second one is very interesting, especially
the filtering algorithms look fancy - but also hard to put in place here
because they are good for ɑ < 0.1, whereas here ɑ can be 0.3. The algorithm
that appears to be most useful for our range of parameters is BPM (bit-
parallelized Myers). But I wonder whether it can combine with the dynamic
optimizations that we have already put in place - namely, that the lower_bound
is much higher for the later messages being compared than for the first ones
in the list.

Bruno

Re: msgmerge speedup: fstrcmp and diffseq improvements

Reply via email to