Prasad, Ramit wrote:
>> But more importantly, some years ago (Python 2.4, about 8 years ago?) the
>> Python developers found a really neat trick that they can do to optimize
>> string concatenation so it doesn't need to repeatedly copy characters over
>> and over and over again. I won't go into details, but the thing is, this
>> trick works well enough that repeated concatenation is about as fast as the
>> join method MOST of the time.
>
> I would like to learn a bit more about the trick if you have a
> reference handy. I have no intention of using it, but it sounds
> interesting and might teach me more about Python internals.
In a nutshell, CPython identifies cases like:

    mystr = mystr + otherstr
    mystr += otherstr

where nothing else refers to the string object bound to mystr, and if
possible, resizes that string in place and appends otherstr, rather than
copying both to a new string object.
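To see the effect for yourself, here is a rough sketch you can time. The
function names are made up for illustration, and the numbers depend on your
Python version and platform, so treat it as an experiment rather than a
benchmark. The "aliased" variant keeps a second reference to the string, which
should be enough to stop CPython from resizing it in place.

```python
import timeit


def build_by_concat(pieces):
    # Only `result` refers to the growing string, so CPython may be able
    # to resize it in place instead of copying it on every +=.
    result = ""
    for piece in pieces:
        result += piece
    return result


def build_by_concat_aliased(pieces):
    # `alias` is a second reference to the same string object, so the
    # string is shared at the moment of += and has to be copied instead.
    result = ""
    for piece in pieces:
        alias = result
        result += piece
    return result


def build_by_join(pieces):
    # The portable idiom: does not rely on any CPython-specific behaviour.
    return "".join(pieces)


if __name__ == "__main__":
    pieces = ["x" * 10] * 20000
    for func in (build_by_concat, build_by_concat_aliased, build_by_join):
        seconds = timeit.timeit(lambda: func(pieces), number=5)
        print(f"{func.__name__:26s}{seconds:.4f}s")
```

All three functions return the same string; only the running time should
differ, and by how much is exactly the part that varies between platforms.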
The "if possible" hides a lot of technical detail, which is why the
optimization can fail on some platforms while working on others. See this
painful discussion trying to debug httplib slowness:
http://mail.python.org/pipermail/python-dev/2009-August/091125.html
After many dead-ends and red herrings, somebody spotted the problem:
http://mail.python.org/pipermail/python-dev/2009-September/091582.html
ending with GvR admitting that it was an embarrassment that repeated string
concatenation had survived in the standard library for so long. The author
even knew it was slow because he put a comment warning about it!
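For comparison, here is a minimal sketch of the portable way to accumulate
data read in chunks. This is not the actual httplib code; read_all and its
chunk_size parameter are hypothetical names used only for illustration.

```python
import io


def read_all(stream, chunk_size=4096):
    # Collect the chunks in a list and join once at the end, instead of
    # doing `data += chunk` inside the loop.
    chunks = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty read means end of stream
            break
        chunks.append(chunk)
    return b"".join(chunks)


data = read_all(io.BytesIO(b"spam" * 1000), chunk_size=64)
print(len(data))  # 4000
```

The join version does one final copy no matter what the platform's memory
allocator does, so it does not depend on the optimization kicking in.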
Here is the middle of the 2004 discussion about adding the optimization:
http://mail.python.org/pipermail/python-dev/2004-August/046695.html
which talks about the possibility of other implementations doing something
similar. You can find the beginning of the discussion yourself :)
And here is a good description of the optimization itself:
http://utcc.utoronto.ca/~cks/space/blog/python/ExaminingStringConcatOpt
--
Steven
_______________________________________________
Tutor maillist - Tutor@python.org