On Sat, 12 Jun 2010 02:18:52 am Hugo Arts wrote:
> On 11 jun 2010, at 17:49, Steven D'Aprano <st...@pearwood.info> wrote:
> > On Sat, 12 Jun 2010 12:58:19 am Alan Gauld wrote:
> >> Have you looked at the count method of lists?
> >>
> >> Something like:
> >>
> >> counts = set((item, mylist.count(item)) for item in mylist if
> >> mylist.count(item) > 1)
> >
> > That's a Shlemiel the Painter algorithm.
> >
> > http://www.joelonsoftware.com/articles/fog0000000319.html
>
> It is, but it's also very elegant and simple to understand.
Your idea of elegant and simple is not the same as mine. To me, I see
unnecessary work and unneeded complexity. A generator expression for a
beginner having trouble with the basics? Calling mylist.count(item)
*twice* for every item?! How is that possibly elegant? (A linear-time
alternative is sketched in the P.S. at the end of this message.)

> And it
> works fine on small inputs.

Everything is fast for small n.

[pedantic]
Not everything. Some things are inherently slow, for example:

http://en.wikipedia.org/wiki/Busy_beaver

A five-state Busy Beaver machine takes 47,176,870 steps to return, and
a six-state Busy Beaver takes 10**21132 steps to return.

A slightly less extreme example is Ackermann's Function:

http://en.wikipedia.org/wiki/Ackermann's_function

where (for example) A(4,2) has over 19,000 digits. Calculating the
number of recursive steps needed to calculate that value is left as an
exercise.
[/pedantic]

These are extreme examples, but the point is that there are tasks
which are hard even for small N. "Find N needles in a haystack"
remains hard even for N=1. (And before you suggest using a magnet,
it's a *plastic* needle.)

[...]

> I guess what I'm trying to say is: using code that performs bad in
> situations that won't be encountered anyway is not inherently bad.

The problem is that situations that won't be encountered often *are*
encountered, long after the coder who introduced the poorly-performing
algorithm has moved on.

Here's a real-life example, from Python itself: last September, on the
Python-Dev mailing list, Chris Withers reported a problem downloading
a file using Python's httplib module. For a file that took wget or
Internet Explorer approximately 2 seconds to download, it took Python
up to thirty MINUTES -- that's nearly a thousand times slower.

Eventually Chris, with the help of others on the list, debugged the
problem down to a Shlemiel algorithm in the httplib code. Somebody
found a comment in the _read_chunked method that said:

    # XXX This accumulates chunks by repeated string concatenation,
    # which is not efficient as the number or size of chunks gets big.

So somebody used an algorithm which they KNEW was inefficient and
slow, and it had been there for years, affecting who knows how many
people, until eventually somebody was sufficiently annoyed to do
something about it. And the sad thing is that avoiding repeated string
concatenation is easy, and there was no need for it in the first place
(a sketch of the fix is in the P.S. below).

> Otherwise, we'd all still be writing everything in C.

You've missed the point. It's not the language: you can write poorly
performing code in any language, and an O(N) algorithm in a slow
language will probably out-perform an O(N**2) or O(2**N) algorithm in
a fast language.

-- 
Steven D'Aprano
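P.S. For anyone following along, here is one way to collect the
duplicate counts in a single pass. Treat it as a sketch rather than
the one true answer -- the function and variable names are mine -- but
it does the same job as the count() one-liner in O(N) rather than
O(N**2):

    from collections import defaultdict

    def duplicates(items):
        # Tally every item in one pass, then keep the ones seen more
        # than once. Two O(N) passes in total, instead of an O(N)
        # scan (or two!) per element.
        tally = defaultdict(int)
        for item in items:
            tally[item] += 1
        return set((item, n) for item, n in tally.items() if n > 1)

    mylist = [1, 2, 2, 3, 3, 3]
    print(duplicates(mylist))  # set([(2, 2), (3, 3)]); order may vary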
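While we're at it, here is Ackermann's Function transcribed directly
from its recursive definition, if you want a feel for how fast it
explodes. Again, just a sketch for experimenting with small values:

    def ackermann(m, n):
        # A(0, n) = n + 1
        # A(m, 0) = A(m - 1, 1)
        # A(m, n) = A(m - 1, A(m, n - 1))
        if m == 0:
            return n + 1
        if n == 0:
            return ackermann(m - 1, 1)
        return ackermann(m - 1, ackermann(m, n - 1))

    print(ackermann(3, 3))  # 61, after a few thousand recursive calls
    # Don't bother with ackermann(4, 2): the *answer* has 19,729
    # digits, and the naive recursion blows the stack long before
    # it gets there.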
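And here is the repeated-concatenation pattern next to its cure. This
is not the actual httplib code -- the function names are invented for
illustration -- but it shows how little effort the fix takes:

    def slurp_slowly(chunks):
        # Shlemiel version: every concatenation copies all the data
        # accumulated so far, so the total work is O(N**2) in the
        # size of the data.
        data = ''
        for chunk in chunks:
            data = data + chunk
        return data

    def slurp_quickly(chunks):
        # Collect the pieces in a list and join once at the end: O(N).
        pieces = []
        for chunk in chunks:
            pieces.append(chunk)
        return ''.join(pieces)

(CPython can sometimes optimize the slow version behind your back, but
nothing in the language guarantees it.)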