Josep M. Fontana wrote:

def countWords(a_list):
    words = {}
    for i in range(len(a_list)):
        item = a_list[i]
        count = a_list.count(item)
        words[item] = count
    return sorted(words.items(), key=lambda item: item[1], reverse=True)
with open('output.txt', 'a') as token_freqs:
    with open('input.txt', 'r') as out_tokens:
        token_list = countWords(out_tokens.read())
        token_freqs.write(token_list)


When you run that code, are you SURE that it merely results in the output file being blank? When I run it, I get an obvious error:

Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
TypeError: argument 1 must be string or read-only character buffer, not list

Don't you get this error too?


The first problem is that file.write() doesn't take a list as its argument; it requires a string. You feed it a list of (word, frequency) pairs. You need to decide how you want to format the output.
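To make that concrete, here is a minimal sketch. The sample pairs are made up, io.StringIO stands in for a real file so the snippet runs without touching the disk, and the "word count" line format is just one possible choice.

```python
import io

# Hypothetical sample data, standing in for what countWords returns.
pairs = [("spam", 3), ("eggs", 1)]

out = io.StringIO()  # in-memory stand-in for a file opened for writing

error_message = None
try:
    out.write(pairs)  # handing write() the list itself raises TypeError
except TypeError as err:
    error_message = str(err)

# Pick a format (here one "word count" pair per line), build a string,
# and write that string instead.
text = "".join("%s %s\n" % (word, count) for word, count in pairs)
out.write(text)
result = out.getvalue()
```

Any other layout (tab-separated, count first, and so on) works the same way, as long as what reaches write() is a string.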

The second problem is that you don't actually generate word frequencies, you generate character frequencies. When you read a file, you get a string, not a list of words. Iterating over a string gives you one character at a time, as if it were a list of letters:

>>> for item in "hello":
...     print(item)
...
h
e
l
l
o
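The difference shows up as soon as you compare the string with the result of splitting it on whitespace. This is a small sketch using a made-up line of text in place of the file contents:

```python
text = "the cat saw the dog\n"  # hypothetical file contents

chars = list(text)    # what iterating over the string gives you
words = text.split()  # split on any run of whitespace

# Counting on the string counts characters; counting on the list counts words.
letter_count = text.count("t")   # occurrences of the character 't'
word_count = words.count("the")  # occurrences of the word 'the'
```

Note that split() with no arguments also discards the trailing newline, so no empty "word" ends up in the list.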


Your countWords function itself is reasonable, apart from some stylistic issues and some inefficiencies. The inefficiencies are unnoticeable for small numbers of words, but will become extremely costly for large lists of words. Ignoring that, here's my suggested code. You might like to compare what I have written with what you have, and see if you can tell why I've written it the way I have.


def countWords(wordlist):
    word_table = {}
    for word in wordlist:
        count = wordlist.count(word)
        word_table[word] = count
    return sorted(
      word_table.items(), key=lambda item: item[1], reverse=True
      )

def getWords(filename):
    with open(filename, 'r') as f:
        words = f.read().split()
    return words

def writeTable(filename, table):
    with open(filename, 'w') as f:
        for word, count in table:
            f.write("%s %s\n" % (word, count))


words = getWords('input.txt')
table = countWords(words)
writeTable('output.txt', table)



For bonus points, you might want to think about why countWords will be so inefficient for large word lists, although you probably won't see any problems until you're dealing with thousands or tens of thousands of words.
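One possible answer, offered as a sketch and not as the only fix: wordlist.count(word) scans the entire list every time it is called, and it is called once per word, so the total work grows roughly with the square of the list length. A single pass that updates a running tally avoids the repeated scans. The function name below is made up for illustration.

```python
def count_words_single_pass(wordlist):
    """Count each word in one pass over the list (hypothetical helper)."""
    word_table = {}
    for word in wordlist:
        # Look up the tally so far (0 if the word is new) and add one.
        word_table[word] = word_table.get(word, 0) + 1
    # Same output shape as countWords: (word, count) pairs, most frequent first.
    return sorted(word_table.items(), key=lambda item: item[1], reverse=True)


sample = ["a", "b", "a", "c", "a", "b"]
table = count_words_single_pass(sample)
```

The standard library's collections.Counter does the same kind of tallying, if you would rather not write the loop yourself.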


--
Steven
_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
