Thus said Sasha Pachev on Fri, 10 Mar 2006 14:20:07 MST:
> * write a program in the language of your choice that will output a
> list of unique words and their counts found in an arbitrary file
> (given as an argument to your program) that are also present in the
> English dictionary. Write two versions of the program: one that
> requires the minimum development time on your part, and one that
> executes as fast as possible. Benchmark both versions.
Here's my quick and dirty one-liner in shell:
for I in `strings filename.txt`; do
  blah=`echo $I | sed -e 's/[^a-zA-Z]//g' -e '/^$/d'`
  if grep -q "^$blah$" /usr/share/dict/words; then echo $blah; fi
done | awk '{ a[$1]++ } END { for (b in a) printf("%-15s %6d\n", b, a[b]); }'
Make sure you change ``filename.txt'' and ``/usr/share/dict/words''
appropriately.
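
If you want to try it without touching real files, here is a rough sketch
of the same loop wrapped in a function and run against throwaway data.
The function name count_words, the sample text and the three-word
dictionary are all made up for this example. It reads the sample with cat
rather than strings (the sample is plain text), skips tokens that contain
no letters, and uses grep -qx so that the dictionary match itself is not
printed.

```shell
# Hypothetical wrapper around the loop above; count_words is an
# illustrative name, not part of the original one-liner.
count_words() {
    # $1 = file to scan, $2 = dictionary with one word per line
    for I in `cat "$1"`; do
        # keep letters only, as in the original sed expression
        blah=`echo "$I" | sed -e 's/[^a-zA-Z]//g'`
        # skip empty tokens; -q keeps grep quiet, -x matches whole lines
        if [ -n "$blah" ] && grep -qx "$blah" "$2"; then echo "$blah"; fi
    done | awk '{ a[$1]++ } END { for (b in a) printf("%-15s %6d\n", b, a[b]); }'
}

# Throwaway sample data, invented for this example
tmp=`mktemp -d`
printf 'the cat saw the dog.\nxyzzy the end\n' > "$tmp/sample.txt"
printf 'cat\ndog\nthe\n' > "$tmp/dict"

# awk's "for (b in a)" order is unspecified, so sort for stable output
count_words "$tmp/sample.txt" "$tmp/dict" | sort
rm -rf "$tmp"
```

With that sample, "cat" and "dog" should each be counted once and "the"
three times; "saw", "xyzzy" and "end" are dropped because they are not in
the toy dictionary.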
I'll get to the optimized version when I have more time. :-)
Andy