What do you think of this style of coding?
- The rationale is to separate the functions that deal with system
"boundaries" from the "core algorithmic functions".
So you should have at least two functions: one that handles the
input/output formats, and one that does not and deals only with
Clojure/Java constructs.
- Don't expose "too early" functions that are only there to simplify
the algorithm: you can already use defn-, but you can also embed
them in the main function by using let and inner functions.
- I also tried to write the "core algorithmic function" in as
"functional" a style as I could.
Do you think the functional version is more or less "obfuscated"?
Here is the "core function" (it takes a string as input and returns
the sorted sequence of ["word" 2] vectors):
(defn topwords
  "Takes a string as input, and returns a sequence of
  [word nb-of-word-occurrences] pairs, most frequent first."
  [text]
  (let [;; splits on the platform line separator, so the input is
        ;; expected to hold one word per line
        words    (let [ls (System/getProperty "line.separator")]
                   #(.split % ls))
        freqs    (partial reduce #(merge-with + %1 {%2 1}) {})
        by-count (partial sort-by (comp - val))]
    (-> text words freqs by-count)))
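
To illustrate the separation described above, here is a minimal sketch
of what the "boundary" side could look like next to a pure core. The
names word-freqs, format-freqs and topwords-file are hypothetical, the
whitespace split is an assumption, and the "word : count" output format
is borrowed from the original question quoted below.

```clojure
(require '[clojure.string :as string])

;; Core: plain data in, plain data out. No files, no output formatting.
;; Splitting on whitespace and lower-casing are assumptions.
(defn word-freqs
  "Returns a sequence of [word count] pairs, most frequent first."
  [text]
  (->> (string/split (string/lower-case text) #"\s+")
       (remove empty?)
       (reduce #(merge-with + %1 {%2 1}) {})
       (sort-by (comp - val))))

;; Boundary: the only place that knows the output format.
(defn format-freqs
  "Renders [word count] pairs as one \"word : count\" line each."
  [pairs]
  (apply str (map (fn [[word n]] (format "%s : %d\n" word n)) pairs)))

;; Boundary: the only place that touches the file system.
(defn topwords-file
  "Reads in-filepath, writes the ranking to out-filepath."
  [in-filepath out-filepath]
  (spit out-filepath (format-freqs (word-freqs (slurp in-filepath)))))
```

With this split, word-freqs and format-freqs can be tried at the REPL
on plain strings and vectors, and only topwords-file needs real files.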
HTH,
--
Laurent
On Dec 26, 4:37 pm, lpetit <[email protected]> wrote:
> Instead of #(- (val %)), one could also use the compose function :
> (comp - val)
>
> My 0,02 EURO,
>
> --
> Laurent
>
> On Dec 25, 4:58 pm, Mibu <[email protected]> wrote:
>
> > My version:
>
> > (defn top-words [input-filename result-filename]
> > (spit result-filename
> > (apply str
> > (map #(format "%s : %d\n" (first %) (second %))
> > (sort-by #(-(val %))
> > (reduce #(conj %1 { %2 (inc (%1 %2 0)) }) {}
> > (map #(.toLowerCase %)
> > (re-seq #"\w+"
> > (slurp
> > input-filename)))))))))
>
> > Mibu
>
> > On Dec 25, 2:16 pm, Piotr 'Qertoip' Włodarek <[email protected]>
> > wrote:
>
> > > Given the input text file, the program should write to disk a ranking
> > > of words sorted by frequency, like:
>
> > > the : 52483
> > > and : 32558
> > > of : 23477
> > > a : 22486
> > > to : 21993
>
> > > My first implementation:
>
> > > (defn topwords [in-filepath, out-filepath]
> > > (def words (.split (.toLowerCase (slurp in-filepath)) "\\s+"))
>
> > > (spit out-filepath
> > > (apply str
> > > (concat
> > > (map (fn [pair] (format "%20s : %5d \r\n" (key pair)
> > > (val pair)))
> > > (sort-by #( -(val %) )
> > > (reduce
> > > (fn [counted-words word]
> > > ( assoc counted-words
> > > word
> > > (inc (get counted-words
> > > word 0)) ))
> > > {}
> > > words)))
> > > ["\r\n"]))))
>
> > > Somehow I feel it's far from optimal. Could you please advise and
> > > improve? What is the best, idiomatic implementation of this simple
> > > problem?
>
> > > regards,
> > > Piotrek