Alan Gauld wrote: > The loop is a bit clunky. it would be clearer just to iterate over > a_list: > > for item in a_list: > words[item] = a_list.count(item)
This is a very inefficient approach because you repeat counting the number of occurrences of a word that appears N times N times: >>> words = {} >>> a_list = "in the garden on the bank behind the tree".split() >>> for word in a_list: ... print "counting", word ... words[word] = a_list.count(word) ... counting in counting the # <-- 1 counting garden counting on counting the # <-- 2 counting bank counting behind counting the # <-- 3 counting tree >>> words {'on': 1, 'garden': 1, 'tree': 1, 'behind': 1, 'in': 1, 'the': 3, 'bank': 1} To avoid the duplicate effort you can check if the word was already counted: >>> words2 = {} >>> for word in a_list: ... if word not in words2: ... print "counting", word ... words2[word] = a_list.count(word) ... counting in counting the counting garden counting on counting bank counting behind counting tree >>> words == words2 True Inside the count() method you are still implicitly iterating over the entire list once for every distinct word. I would instead prefer counting manually while iterating over the list once. This has the advantage that it will even work if you don't keep the whole sequence of words in a list in memory (e. g. if you read them from a file one line at a time): >>> words3 = {} >>> for word in a_list: ... if word in words3: ... words3[word] += 1 ... else: ... words3[word] = 1 ... >>> words3 == words True Finally there's collections.defaultdict or, in Python 2.7, collections.Counter when you are more interested in the result than the way to achieve it. Peter _______________________________________________ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor