tmountain wrote:
> Very cool. I actually cleaned up the code a little bit more this
> morning trying to speed things up a bit. It's still not as fast as I'd
> like, but I'm not up to speed on Clojure optimization either, so I
> could be missing something.
>
There are two things that I noticed in your code:
- you use nth on seq (linear access),
- you append elements to seqs.
It would be better to use vectors instead of seqs:
- random access,
- when you conj an element to a vector it is appended.
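
To make the difference concrete, here is a small sketch using only clojure.core. The names (words-seq, words-vec and so on) are made up for illustration and are not part of the markov code:

```clojure
;; Illustrative names only; not part of the markov code.
(def words-seq (list "the" "quick" "brown"))
(def words-vec (vec words-seq))

;; nth works on both, but on a seq it walks element by element,
;; while on a vector it is an (effectively) constant-time lookup.
(def third-from-seq (nth words-seq 2))
(def third-from-vec (nth words-vec 2))

;; conj adds where it is cheapest for the collection:
;; at the front of a list, at the end of a vector.
(def seq-plus (conj words-seq "fox"))
(def vec-plus (conj words-vec "fox"))

;; Appending to a seq needs concat, which returns a new lazy seq.
(def seq-appended (concat words-seq ["fox"]))
```

Here seq-plus starts with "fox" whereas vec-plus ends with it, which is why a vector suits data that grows at the end.
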
Below is the "vectorized" version; on my box it runs twice as fast as
your original code.
(I also removed the in-loop building of the string because it was unnecessary.)

(ns markov
  (:use clojure.contrib.str-utils))

(defn rand-elt
  "Return a random element of this vector or seq"
  [s]
  (nth s (rand-int (count s))))

(defn clean
  "Clean the given txt of symbols disruptive to markov chains"
  [txt]
  (let [new-txt (re-gsub #"[:;,^\"()]" "" txt)
        new-txt (re-gsub #"'(?!(d|t|ve|m|ll|s|de|re))" "" new-txt)]
    new-txt))

(defn chain-lengths
  "Return a set of lengths for each element in the collection"
  [markov-chain]
  (let [markov-keys (map keys markov-chain)]
    (set (for [x markov-keys] (count x)))))

(defn max-chain-length
  "Return the length of the longest chain"
  [markov-chain]
  (apply max (chain-lengths markov-chain)))

(defn chain
  "Take a list of words and build a markov chain out of them.
  The length is the size of the key in number of words."
  ([words]
   (chain words 3))
  ([words length]
   (let [words (concat (repeat length nil) words)
         suffixes (take-while #(seq (drop length %)) (iterate rest words))]
     (reduce (fn [markov-chain suffix]
               ;; key = the first `length` words, value = the word that follows
               (merge-with into markov-chain
                           {(vec (take length suffix)) [(nth suffix length)]}))
             {} suffixes))))

(defn split-sentence
  "Convert a string to a collection on common boundaries"
  [text]
  (filter seq (re-split #"[,.!?()\d]+\s*" text)))

(defn file-chain
  "Create a markov chain from the contents of a given file"
  ([file]
   (file-chain file 3))
  ([file length]
   (let [sentences (split-sentence (slurp file))]
     ;; pass length through so the two-argument arity actually honors it
     (reduce #(merge-with into %1 (chain (re-split #"\s+" %2) length)) {}
             sentences))))

(defn construct-sentence
  "Build a sentence from a markov chain structure. Given a
  Markov chain (any size key), Seed (to start the sentence) and
  Proc (a function for choosing the next word), returns a sentence
  composed until it reaches the end of a chain (an end of sentence)."
  ([markov-chain]
   (construct-sentence markov-chain nil rand-elt))
  ([markov-chain seed]
   (construct-sentence markov-chain seed rand-elt))
  ([markov-chain seed proc]
   (let [seed (or seed (rand-elt (keys markov-chain)))
         next-key #(concat (rest %) [(proc (markov-chain %))])
         logorrhea (map first (iterate next-key seed))
         sentence (take-while identity (drop-while nil? logorrhea))]
     (str-join " " sentence))))
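
As an aside on how the chain map is accumulated: merge-with into merges maps and, when a key is already present, pours the new vector of followers into the existing one. A tiny sketch with made-up data (the keys, words and step-N names below are purely illustrative):

```clojure
;; Made-up data for illustration; each step merges one {key [follower]} map.
(def step-1 (merge-with into {} {[nil nil "the"] ["cat"]}))
(def step-2 (merge-with into step-1 {[nil nil "the"] ["dog"]}))
(def step-3 (merge-with into step-2 {[nil "the" "dog"] ["barks"]}))

;; The followers of a key end up in a single vector, so picking
;; the next word is just a random index into that vector.
(def followers (step-3 [nil nil "the"]))
(def pick (nth followers (rand-int (count followers))))
```

Because the followers are kept in vectors, both the append done by into and the random pick are cheap.
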
hth,
Christophe
--
Professional: http://cgrand.net/ (fr)
On Clojure: http://clj-me.blogspot.com/ (en)