[issue12784] Concatenation of strings returns the wrong string

2011-08-19 Thread Kåre Krig

New submission from Kåre Krig :

When I concatenate two strings and the right-hand operand is large, the 
resulting string is almost correct but has a few characters substituted. 

The following code (with parentheses added to the print statement for 3.1) 
prints False on both Python 2.6.5 and 3.1. The input is a 20 MB text file.


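# Read the whole file into a single string.
inbuff = open('top.test.in')
full_file = inbuff.readlines()
inbuff.close()
data_string = ''.join(full_file)

# Build the same concatenation twice; the two results should be equal.
buff_A = ' ' + data_string
buff_B = ' ' + data_string
print buff_A == buff_B  # prints False on the affected machine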

I have only been able to test this on one computer, running SUSE. The RAM 
seems fine, as it passed 15 hours of memtest. 

Python versions are:

Python 2.6.5 (r265:79063, May  6 2011, 17:25:59) 
[GCC 4.5.0 20100604 [gcc-4_5-branch revision 160292]] on linux2

Python 3.1 (r31:73572, Jul  5 2010, 13:31:53) 
[GCC 4.5.0 20100604 [gcc-4_5-branch revision 160292]] on linux2

--
components: None
messages: 142445
nosy: Kåre.Krig
priority: normal
severity: normal
status: open
title: Concatenation of strings returns the wrong string
type: behavior
versions: Python 2.6, Python 3.1

___
Python tracker 
<http://bugs.python.org/issue12784>
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue12784] Concatenation of strings returns the wrong string

2011-08-19 Thread Kåre Krig

Kåre Krig  added the comment:

I tried it again with another file. This time I used the dictionary from 
www.math.sjsu.edu/~foster/dictionary.txt (~3 MB).

hash(buff_A) == hash(buff_B) returns False, just like the direct comparison. I 
ran the program on dictionary.txt and wrote buff_A and buff_B to two separate 
files. Running diff on those files reported these differences:

149668c149668
< intraisland
---
> intrqisland
150052c150052
< invernacular
---
> ynvernacular
230933c230933
< perwitsky
---
> perwitski


That was the first run. Immediately running the same script again and 
repeating the diff produced a different set of differences:

253803c253803
< recrown
---
> recrow~
254213c254213
< redisseise
---
> bedisseise
254656c254656
< reflectors
---
> beflectors
255083c255083
< regrating
---
> regratinw


Note that the ASCII codes of each faulty character and its original differ by 
exactly one bit, and always the same one: the 5th least significant bit 
(0x10). This is consistent with my previous tests.
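
The single-bit pattern can be checked directly in Python. This is only a 
minimal sketch: the word pairs are copied from the diff output above, and the 
helper name flipped_bits is made up for illustration.

```python
# Word pairs copied from the diff output above: (expected, corrupted).
PAIRS = [
    ("intraisland", "intrqisland"),
    ("invernacular", "ynvernacular"),
    ("perwitsky", "perwitski"),
    ("recrown", "recrow~"),
    ("redisseise", "bedisseise"),
    ("reflectors", "beflectors"),
    ("regrating", "regratinw"),
]


def flipped_bits(expected, corrupted):
    """Return (index, xor_mask) for every position where the strings differ."""
    return [
        (i, ord(a) ^ ord(b))
        for i, (a, b) in enumerate(zip(expected, corrupted))
        if a != b
    ]


for expected, corrupted in PAIRS:
    for index, mask in flipped_bits(expected, corrupted):
        # A mask of 0x10 means only the 5th least significant bit changed.
        print("%-13s %-13s index %2d  xor 0x%02x"
              % (expected, corrupted, index, mask))
```

Every pair differs in a single character, and the XOR of the two character 
codes is 0x10 each time.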




[issue12784] Concatenation of strings returns the wrong string

2011-08-19 Thread Kåre Krig

Kåre Krig  added the comment:

I managed to get access to two more systems to test this on, one running 
Ubuntu and Python 2.7.1 and the other SUSE and Python 2.6. I could not 
reproduce the bug on either of them.

This all suggests that the issue is not a bug in Python but something on my 
system.

I could reliably produce this bug using <20 MB of data, then pass 15 hours of 
memtest86+, and then produce the bug again. That makes me think it is not the 
RAM itself, but there are of course layers between Python and the RAM that 
might be broken.
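
To rule out the input file as a factor, the comparison can be repeated on 
generated data. This is a rough sketch, written for Python 3, and it rests on 
an assumption: that a synthetic buffer of about 20 MB is a fair stand-in for 
the text file. The function name count_mismatches and its default sizes are 
hypothetical. On healthy hardware it should report zero mismatching rounds.

```python
def count_mismatches(size=20 * 1024 * 1024, rounds=5):
    """Repeat the concatenation from the report and count unequal results."""
    # Synthetic stand-in for the file contents (assumption: any large
    # text buffer exercises the same copy path as the original file).
    line = "abcdefghijklmnopqrstuvwxyz\n"
    data_string = line * (size // len(line))
    mismatches = 0
    for _ in range(rounds):
        # Same two concatenations as in the original report.
        buff_a = " " + data_string
        buff_b = " " + data_string
        if buff_a != buff_b:
            mismatches += 1
    return mismatches


if __name__ == "__main__":
    print("mismatching rounds:", count_mismatches())
```

A nonzero count on one machine and zero on others would point at that 
machine rather than at Python.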
