There's no really easy way to avoid O(n^2) comparison when you want to
check everything against everything else in some set. One efficiency
that halves the size of the job (but does not reduce big-O) is to
check only against the later items:
(defn pairs-off-one [strs]
(let [istrs (map-indexed vector strs)]
(for [[n1 s1] istrs
[n2 s2] istrs
:when (and
(< n1 n2)
(= (reduce + (map #(if (= %1 %2) 0 1) s1 s2)) 1))]
[s1 s2])))
=> (pairs-off-one ["abc" "abd" "aed" "axf" "zqr" "zbc" "aqd"])
(["abc" "abd"] ["abc" "zbc"] ["abd" "aed"] ["abd" "aqd"] ["aed" "aqd"])
I don't think you'll get much better than that even with things like
radix or finger trees. If you were looking for strings with a long
common prefix radix trees would help enormously, but the differences
can be anywhere in your strings. On the brighter side, if the strings
are of length 3 and usually confined to printable US-ASCII characters
you have only about 96^6 comparisons to do in the absolute worst case.
:)
--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to [email protected]
Note that posts from new members are moderated - please be patient with your
first post.
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en