On Tue, Feb 12, 2013 at 10:03 PM, Maciej Fijalkowski <fij...@gmail.com> wrote:
> Hi
>
> We recently encountered a performance issue in stdlib for pypy. It
> turned out that someone committed a performance "fix" that uses += for
> strings instead of "".join() that was there before.

Can someone show the actual diff for this?

I'm preparing a talk about outdated patterns in Python for DjangoCon EU,
prompted by this question and by the obsessive avoidance of string
concatenation. But all the tests I've done show that ''.join() is still
faster or as fast, except when you are joining very few strings (for
example two), in which case concatenation is faster or as fast. That
holds under both PyPy and CPython. So I'd like to know in which case
''.join() is faster on PyPy and += is faster on CPython.

Code with times:

    x = 100000
    s1 = 'X' * x
    s2 = 'X' * x
    for i in xrange(500):  # range() on Python 3
        s1 += s2

    Python 3.3: 0.049 seconds
    PyPy 1.9: 24.217 seconds

PyPy is indeed much, much slower than CPython here. But let's look at
the join case:

    x = 100000
    s1 = 'X' * x
    s2 = 'X' * x
    for i in xrange(500):  # range() on Python 3
        s1 = ''.join((s1, s2))

    Python 3.3: 18.969 seconds
    PyPy 1.9: 62.539 seconds

Here PyPy needs about 2.6 times as long as with +=, and CPython needs
387 times as long. Both are slower.

The best case is of course to build a long list of strings and join
them once:

    x = 100000
    s1 = 'X' * x
    s2 = 'X' * x
    l = [s1]
    for i in xrange(500):  # range() on Python 3
        l.append(s2)
    s1 = ''.join(l)

    Python 3.3: 0.052 seconds
    PyPy 1.9: 0.117 seconds

That's not always feasible, though.

//Lennart
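
P.S. For anyone who wants to rerun this comparison, here is a rough sketch of a harness built on the standard timeit module. It is not the script that produced the numbers above: the function names are made up for illustration, it uses range() so that it also runs on Python 3, and the sizes are scaled down so it finishes quickly. Absolute times will differ by machine and interpreter.

```python
import timeit


def concat_in_place(chunk, repeats):
    """Grow a string with += (hypothetical helper name)."""
    s = chunk
    for _ in range(repeats):
        s += chunk
    return s


def join_pairwise(chunk, repeats):
    """Grow a string by joining two strings at a time."""
    s = chunk
    for _ in range(repeats):
        s = ''.join((s, chunk))
    return s


def join_once(chunk, repeats):
    """Collect the pieces in a list and join them a single time."""
    parts = [chunk]
    for _ in range(repeats):
        parts.append(chunk)
    return ''.join(parts)


if __name__ == '__main__':
    # Assumed sizes, much smaller than the 100000 x 500 used in the mail.
    chunk = 'X' * 1000
    repeats = 200
    for func in (concat_in_place, join_pairwise, join_once):
        seconds = timeit.timeit(lambda: func(chunk, repeats), number=5)
        print('%-16s %.4f seconds' % (func.__name__, seconds))
```

All three functions return the same string, so only the time taken should differ between them.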