Re: unicode() vs. s.decode()

Steven D'Aprano Thu, 06 Aug 2009 12:23:54 -0700

On Thu, 06 Aug 2009 20:05:52 +0200, Thorsten Kampe wrote:

> > That is significant! So the winner is:
> > 
> > unicode('äöüÄÖÜß','utf-8')
> 
> Unless you are planning to write a loop that decodes "äöüÄÖÜß" one
> million times, these benchmarks are meaningless.


What if you're writing a loop which takes one million different lines of 
text and decodes them once each?


>>> import timeit
>>> setup = 'L = ["abc"*(n%100) for n in xrange(1000000)]'
>>> t1 = timeit.Timer('for line in L: line.decode("utf-8")', setup)
>>> t2 = timeit.Timer('for line in L: unicode(line, "utf-8")', setup)
>>> t1.timeit(number=1)
5.6751680374145508
>>> t2.timeit(number=1)
2.6822888851165771


That's better than a factor of two. Seems like a pretty meaningful
difference to me.
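
For anyone who wants to repeat the measurement, here is a minimal sketch of
the same comparison as a standalone script. It is not the exact session
above: the helper names (make_lines, decode_with_method,
decode_with_constructor, best_time) are made up for illustration, the list
is shortened to 10000 lines, and it takes the best of three runs instead of
a single one. On Python 3, where unicode() no longer exists, it assumes
str(line, "utf-8") is the closest equivalent. The timings will depend on
your machine and Python version.

```python
import timeit

try:
    to_text = unicode  # Python 2: the built-in compared in this thread
except NameError:
    to_text = str      # Python 3 (assumption): str(bytes, encoding) stands in

def make_lines(count):
    # Same shape of data as the original setup: "abc" repeated 0..99 times.
    return [b"abc" * (n % 100) for n in range(count)]

def decode_with_method(lines):
    # Decode each line once using the method call.
    return [line.decode("utf-8") for line in lines]

def decode_with_constructor(lines):
    # Decode each line once using the constructor call.
    return [to_text(line, "utf-8") for line in lines]

def best_time(func, lines, repeat=3):
    # Run the whole loop once per repetition and keep the fastest run.
    timer = timeit.Timer(lambda: func(lines))
    return min(timer.repeat(repeat=repeat, number=1))

lines = make_lines(10000)
method_time = best_time(decode_with_method, lines)
constructor_time = best_time(decode_with_constructor, lines)
print("line.decode(...):  %.4f s" % method_time)
print("constructor(...):  %.4f s" % constructor_time)
```

Both functions return the same list of decoded strings, so whatever gap
shows up is purely the cost of the call itself.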



-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list
