[Tutor] Simple counter to determine frequencies of words in a document

2010-11-20 Thread Josep M. Fontana
Hi, I'm trying to do something that should be very simple. I want to generate a list of the words that appear in a document according to their frequencies. So, the list generated by the script should be something like this: the : 3 book: 2 was : 2 read: 1 by: 1 [...] This would be obtained from

Re: [Tutor] Simple counter to determine frequencies of words in a document

2010-11-20 Thread Alan Gauld
"Josep M. Fontana" wrote The code I started writing to achieve this result can be seen below. You will see that first I'm trying to create a dictionary that contains the word as the key with the frequency as its value. Later on I will transform the dictionary into a text file with the desire

Re: [Tutor] JOB AD PROJECT

2010-11-20 Thread Alan Gauld
wrote I have done some extensive reading on Python. I want to design a classifieds site for jobs. Have you done any extensive programming yet? Reading alone will be of limited benefit and jumping into a fairly complex web project before you have the basics mastered will be a painful experien

Re: [Tutor] Simple counter to determine frequencies of words in a document

2010-11-20 Thread Peter Otten
Alan Gauld wrote: > The loop is a bit clunky. it would be clearer just to iterate over > a_list: > > for item in a_list: >words[item] = a_list.count(item) This is a very inefficient approach because you repeat counting the number of occurrences of a word that appears N times N times: >>> w
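A minimal sketch of the point being made here: calling list.count() inside the loop rescans the whole list for every item, whereas a dictionary can be filled in a single pass. The function name count_words and the sample sentence are made up for illustration and are not taken from the thread.

```python
def count_words(words):
    """Count every word in one pass over the list (no list.count calls)."""
    counts = {}
    for word in words:
        # dict.get returns 0 the first time a word is seen
        counts[word] = counts.get(word, 0) + 1
    # Most frequent first, as in the output the original poster asked for
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sample text, chosen only to exercise the function
sample = "the book was read by the man the book was heavy".split()
for word, freq in count_words(sample):
    print("%s: %d" % (word, freq))
```

Each word is looked at once, so the running time grows with the length of the list rather than with its square.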

Re: [Tutor] Simple counter to determine frequencies of words in a document

2010-11-20 Thread Steven D'Aprano
Josep M. Fontana wrote: def countWords(a_list): words = {} for i in range(len(a_list)): item = a_list[i] count = a_list.count(item) words[item] = count return sorted(words.items(), key=lambda item: item[1], reverse=True) with open('output.txt', 'a') as token_f

[Tutor] code quest

2010-11-20 Thread Kirk Bailey
OK, I need to create or find a function that will return a list of DIRECTORIES (only) which are under 'the current directory'. Anyone got some clue on this? Please advise.

Re: [Tutor] code quest

2010-11-20 Thread Emile van Sebille
On 11/20/2010 11:03 AM Kirk Bailey said... OK, I need to create or find a function that will return a list of DIRECTORIES (only) which are under 'the current directory'. Anyone got some clue on this? Please advise. Use os.walk Emile Help on function walk in module os: walk(top, topdown=T

Re: [Tutor] Simple counter to determine frequencies of words in a document

2010-11-20 Thread Josep M. Fontana
Thanks Alan, Peter and Steve, Instead of answering each one of you independently let me try to use my response to Steve's message as the basis for an answer to all of you. It turns out that matters of efficiency appear to be VERY important in this case. The example in my message was a very short

[Tutor] List help

2010-11-20 Thread george wu
x=0 y=0 w=raw_input("Input: ") w=list(w) for x in range(len(w)): a=w[x] t=0 print a if a==2 or a==4 or a==6 or a==8 or a==10: t=a/2 print "hi" When I run this program, it doesn't print "hi". Can you please tell me why?

Re: [Tutor] code quest

2010-11-20 Thread Alan Gauld
"Kirk Bailey" wrote OK, I need to create or find a function that will return a list of DIRECTORIES (only) which are under 'the current directory'. Anyone got some clue on this? Please advise. You can use os.walk() to get a list of directories and files and then just throw away the files...
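The os.walk() suggestion could be sketched roughly as follows. The question does not say whether "under the current directory" means only the top level or every nested level, so both readings are shown; the two function names are hypothetical and chosen just for this example.

```python
import os


def immediate_subdirectories(top="."):
    """Return the names of the directories directly inside `top`."""
    # os.walk yields (dirpath, dirnames, filenames) tuples;
    # the first tuple describes `top` itself, so we stop there.
    for _dirpath, dirnames, _filenames in os.walk(top):
        return sorted(dirnames)
    return []  # os.walk yielded nothing, e.g. `top` does not exist


def all_subdirectories(top="."):
    """Return the paths of every directory below `top`, at any depth."""
    found = []
    for dirpath, dirnames, _filenames in os.walk(top):
        for name in dirnames:
            found.append(os.path.join(dirpath, name))
    return sorted(found)


print(immediate_subdirectories("."))
```

In both functions the file names returned by os.walk() are simply ignored, which is the "throw away the files" step.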

Re: [Tutor] Simple counter to determine frequencies of words in a document

2010-11-20 Thread ALAN GAULD
If the file is big use Peter's method, but 45 minutes still seems very long so it may be there's a hidden bug in there somewhere. However... > When I look at the current processes running on my computer, I see the > Python process taking 100% of the CPU. Since my computer has a > multi-core pr

Re: [Tutor] Simple counter to determine frequencies of words in a document

2010-11-20 Thread Martin A. Brown
Good evening, : It turns out that matters of efficiency appear to be VERY : important in this case. The example in my message was a very : short string but the file that I'm trying to process is pretty : big (20MB of text). Efficiency is best addressed first and foremost, not by hardware,

Re: [Tutor] List help

2010-11-20 Thread Emile van Sebille
On 11/20/2010 11:06 AM george wu said... x=0 y=0 w=raw_input("Input: ") w=list(w) for x in range(len(w)): a=w[x] t=0 print a if a==2 or a==4 or a==6 or a==8 or a==10: t=a/2 print "hi" When I run this program, it doesn't print "hi". Can you please tell me why? When you're
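The reply above is cut off, so the following is only a guess at the likely explanation, not a reconstruction of it: raw_input() returns a string, list(w) splits it into one-character strings, and a string such as "2" never compares equal to the integer 2, so the if branch is never taken. A minimal sketch of one possible fix, written as a function so it can be run without keyboard input (the name halve_even_digits is hypothetical):

```python
def halve_even_digits(text):
    """Return half of every even digit character found in `text`."""
    halves = []
    for ch in text:
        # Each ch is a one-character string, so test it against
        # digit characters and convert it before doing arithmetic.
        if ch in "2468":
            halves.append(int(ch) // 2)
    return halves


print("2" == 2)                       # a string never equals an int
print(halve_even_digits("12a48"))
```

A single character can never equal 10 either, so that part of the original test could not succeed even after conversion.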

Re: [Tutor] Simple counter to determine frequencies of words in a document

2010-11-20 Thread Alan Gauld
"Martin A. Brown" wrote * Somebody will be certain to point out a language or languages that provide some sort of facility to abstract the use of multiple processors without the explicit use of threads. ISTR Occam did that? Occam being the purpose designed language for the transputer,

Re: [Tutor] Simple counter to determine frequencies of words in a document

2010-11-20 Thread col speed
--> snip >However, even with countWords2, which is supposed to overcome this >problem, it feels as if I've entered an infinite loop. >Josep M. Just my twopenneth, I'm a noob and I'm not going to try such a big file on my old machine, but: 1. Maybe create a *set* from the wordlist, loop through

Re: [Tutor] Simple counter to determine frequencies of words in a document

2010-11-20 Thread Alan Gauld
"col speed" wrote Just my twopenneth, I'm a noob and I'm not going to try such a big file on my old machine, but: 1. Maybe create a *set* from the wordlist, loop through that, so you call "count" on wordlist only once. OR This would be an improvement but still involves traversing the en
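To make the trade-off concrete, here is a rough sketch of the set-based idea next to a single-pass count. The function name count_with_set and the sample list are assumptions made for illustration; collections.Counter is part of the standard library (Python 2.7 and later).

```python
from collections import Counter


def count_with_set(words):
    """Call list.count once per distinct word instead of once per item."""
    counts = {}
    for word in set(words):
        # Still a full scan of `words` for every distinct word
        counts[word] = words.count(word)
    return counts


# Hypothetical sample data
words = "a b a c b a".split()

# Counter tallies the list in a single pass; both give the same totals
print(count_with_set(words) == dict(Counter(words)))
```

With U distinct words in a list of N items, the set version does roughly U full scans of the list, while the single-pass tally does one.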

Re: [Tutor] JOB AD PROJECT

2010-11-20 Thread delegbede
Hi People, I am afraid only Alan has said something to me. Is it that solution would never come or u guys are waiting to gimme the best? Please help.