> A related question is where's the trade-off between using ``in'' with a > list, and a dictionary? I presume that using it with small hashes will > be faster than dictionries since it doesn't have to calculate the > hashes.
Hi Bill, Scanning for an elements in a list is a "linear" operation, in the sense that the time it takes to search is proportional to the size of the list. (Big list == big search time.) Dictionaries have a lookup time that will probably be a constant-time operation, regardless of the size of the collection. (Big dictionary == probably doesn't matter. *grin*) This doesn't mean that dictionaries are always faster than lists: as you know, calculating hash values can take time. But the cost of hashing is usually negligible, since many of Python primitive data types (like strings) automatically cache their hash values too! There's always a possibility, however remote, that the hash function is hideous on the kind of data that you're using. However, Python's Dictionary implementation has been battle-tested and tuned well, so it's probably not a mistake to use dictionaries if they seem like the appropriate data structure. A good book in algorithms will almost always cover the performance characteristics of those two strategies; there are also a lot of good resources on the web about them. NIST has two articles on those two: http://www.nist.gov/dads/HTML/linearSearch.html http://www.nist.gov/dads/HTML/hashtab.html and the excellent Wikipedia has nice readable articles: http://en.wikipedia.org/wiki/Linear_search http://en.wikipedia.org/wiki/Hashtable http://en.wikipedia.org/wiki/Category:Search_algorithms Good luck to you! _______________________________________________ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor