[issue12784] Concatenation of strings returns the wrong string
New submission from Kåre Krig:

When I concatenate two strings, with the one on the right-hand side being large, the resulting string is almost correct but has a few characters substituted. The following code (with (...) added on the print statement for 3.1) prints False on both Python 2.6.5 and 3.1. The file I read is a 20 MB text file.

    inbuff = open('top.test.in')
    full_file = inbuff.readlines()
    inbuff.close()
    data_string = ''.join(full_file)
    buff_A = ' ' + data_string
    buff_B = ' ' + data_string
    print buff_A == buff_B

I have only been able to test this on one computer, running SUSE. The RAM seems fine, as it passed 15 hours of memtest.

Python versions are:

    Python 2.6.5 (r265:79063, May 6 2011, 17:25:59)
    [GCC 4.5.0 20100604 [gcc-4_5-branch revision 160292]] on linux2

    Python 3.1 (r31:73572, Jul 5 2010, 13:31:53)
    [GCC 4.5.0 20100604 [gcc-4_5-branch revision 160292]] on linux2

--
components: None
messages: 142445
nosy: Kåre.Krig
priority: normal
severity: normal
status: open
title: Concatenation of strings returns the wrong string
type: behavior
versions: Python 2.6, Python 3.1

___
Python tracker <http://bugs.python.org/issue12784>
___
___
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue12784] Concatenation of strings returns the wrong string
Kåre Krig added the comment:

I tried it again with another file. This time I used the dictionary from www.math.sjsu.edu/~foster/dictionary.txt (~3 MB).

hash(buff_A) == hash(buff_B) returns False, just like the direct comparison.

I ran the program on dictionary.txt and printed buff_A and buff_B to two different files. When running diff on those files, the reported differences were:

    149668c149668
    < intraisland
    ---
    > intrqisland
    150052c150052
    < invernacular
    ---
    > ynvernacular
    230933c230933
    < perwitsky
    ---
    > perwitski

That was my first run. Immediately running the same script again and repeating the diff produced another set of differences:

    253803c253803
    < recrown
    ---
    > recrow~
    254213c254213
    < redisseise
    ---
    > bedisseise
    254656c254656
    < reflectors
    ---
    > beflectors
    255083c255083
    < regrating
    ---
    > regratinw

Note how the ASCII codes for the faulty characters only differ by one bit, and only the 5th least significant bit. This is consistent with my previous tests.
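The single-bit observation can be checked directly from the diff output. The following is a minimal sketch, not part of the original report: the word pairs are copied from the two diffs above, and the names PAIRS and flipped_bits are made up for illustration. It XORs the character codes at each differing position, so a single flipped bit shows up as a mask with exactly one bit set.

```python
# Word pairs copied from the diff output in the report: (expected, observed).
PAIRS = [
    ("intraisland", "intrqisland"),
    ("invernacular", "ynvernacular"),
    ("perwitsky", "perwitski"),
    ("recrown", "recrow~"),
    ("redisseise", "bedisseise"),
    ("reflectors", "beflectors"),
    ("regrating", "regratinw"),
]


def flipped_bits(expected, observed):
    """Return (index, xor_mask) for every position where the two words differ.

    Hypothetical helper for illustration; assumes both words have equal length.
    """
    return [
        (i, ord(a) ^ ord(b))
        for i, (a, b) in enumerate(zip(expected, observed))
        if a != b
    ]


for expected, observed in PAIRS:
    for index, mask in flipped_bits(expected, observed):
        # A mask of 0x10 means only the 5th least significant bit differs.
        print("%-13s -> %-13s index %2d  xor mask 0x%02x"
              % (expected, observed, index, mask))
```

Every pair yields the mask 0x10, in both directions (bit set and bit cleared), which matches the description of a single flipped bit in the 5th least significant position.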
[issue12784] Concatenation of strings returns the wrong string
Kåre Krig added the comment:

I managed to get access to another two systems to test this on: one running Ubuntu and Python 2.7.1, and the other SUSE and Python 2.6. I could not reproduce the bug on either of those systems.

This all points to the issue not really being a bug in Python but something on my system. The fact that I could predictably produce this bug using <20 MB of data, then pass 15 hours of memtest86+, and finally produce the bug again makes me think it's not the RAM itself, but there are of course layers between Python and the RAM that might be broken.
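To separate the file from the concatenation when testing on a suspect machine, the same check can be run on data generated in memory. This is only a sketch under stated assumptions: the function names, the repeated-alphabet test data, and the trial count are all hypothetical and do not come from the report. On a healthy system it is expected to report no mismatches.

```python
def mismatched_offsets(data, limit=20):
    """Build ' ' + data twice and return up to `limit` offsets where the copies differ.

    Hypothetical helper; returns an empty list when the two copies compare equal.
    """
    buff_a = ' ' + data
    buff_b = ' ' + data
    if buff_a == buff_b:
        return []
    offsets = []
    for i in range(len(buff_a)):
        if buff_a[i] != buff_b[i]:
            offsets.append(i)
            if len(offsets) == limit:
                break
    return offsets


def run_trials(size=20 * 1024 * 1024, trials=10):
    """Repeat the comparison on roughly `size` characters of generated text.

    Returns a dict mapping trial number to the differing offsets found.
    """
    line = 'abcdefghijklmnopqrstuvwxyz\n'
    data = line * (size // len(line))
    failures = {}
    for trial in range(trials):
        offsets = mismatched_offsets(data)
        if offsets:
            failures[trial] = offsets
    return failures


if __name__ == '__main__':
    failures = run_trials()
    print('trials with mismatches: %d' % len(failures))
    for trial, offsets in sorted(failures.items()):
        print('trial %d: first differing offsets %r' % (trial, offsets))
```

If mismatches appear here as well, the input file and its decoding are ruled out as the cause, and the reported offsets show whether the faults cluster at particular positions or move between runs.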