Steven D'Aprano <steve+pyt...@pearwood.info> added the comment:

I'm afraid I cannot reproduce the problem.

>>> s = '000𐤀'  # \U00010900
>>> s
'000𐤀'
>>> s[0]
'0'
>>> s[1]
'0'
>>> s[2]
'0'
>>> s[3]
'𐤀'
>>> list(s)
['0', '0', '0', '𐤀']


That is using Python 3.9 in the xfce4-terminal. Which xterm are you using?

I am very confident that it is a bug in some external software, possibly the 
xterm, possibly the browser or other application where you copied the 
PHOENICIAN LETTER ALF character from in the first place. It looks like it is 
related to mishandling of the Right-To-Left character:

>>> import unicodedata
>>> unicodedata.bidirectional(s[3])
'R'
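
To see why the digits are prone to being shuffled on screen, it may help to 
look at the bidirectional class of every character in the string, not just the 
last one. This is only a sketch using the standard unicodedata module; the 
variable name "classes" is mine, and the string is built from an escape so that 
no RTL character has to survive a copy/paste:

```python
import unicodedata

# Build the string from an escape sequence rather than pasting the character.
s = '000\U00010900'

# Bidirectional class of each character, in logical (memory) order.
classes = [unicodedata.bidirectional(ch) for ch in s]
print(classes)  # ['EN', 'EN', 'EN', 'R']
```

The digits are "European Number" rather than strongly left-to-right, so a 
display layer is free to lay them out relative to the neighbouring 'R' 
character. That affects only how the string is drawn, not how Python stores it.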


Using Firefox, when I try to select the text s = '000...' in Max's initial 
message with the mouse, the selection highlighting jumps around; see the 
attached screenshot (selection.png). Depending on how I copy the text, 
sometimes I get '000 ALF' and sometimes '0 ALF 00', which hints that something 
is getting confused by the RTL character: possibly the browser, possibly the 
copy/paste clipboard, possibly the terminal. But regardless, I cannot replicate 
the behaviour you show, where list(s) is different from indexing the characters 
one by one.

It is very common for applications to mishandle mixed RTL and LTR characters, 
and that can have all sorts of odd display and copy/paste issues.
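
One way to take the display layer out of the picture is to print the string in 
an ASCII-only form and compare code points. The following is a rough sketch, 
not a definitive diagnostic: the helper name code_points and the "reordered" 
string are hypothetical, the latter standing in for what a confused copy/paste 
might have produced.

```python
def code_points(text):
    """Return the code points of text, in logical order, as 'U+XXXX' strings."""
    return [f'U+{ord(ch):04X}' for ch in text]

# The intended string, built from an escape so nothing is pasted.
expected = '000\U00010900'

# Hypothetical result of a paste that moved the RTL character.
reordered = '0' + '\U00010900' + '00'

for label, text in [('expected', expected), ('reordered', reordered)]:
    # ascii() escapes every non-ASCII character, so the terminal cannot
    # rearrange anything when it draws the output.
    print(label, ascii(text), code_points(text))
    # Indexing and iteration always walk the same logical order.
    assert [text[i] for i in range(len(text))] == list(text)
```

If ascii(s) on the affected machine shows the escape somewhere other than at 
the end, the string was already reordered before Python ever saw it.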

----------
nosy: +steven.daprano
Added file: https://bugs.python.org/file50260/selection.png

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue45105>
_______________________________________