On Thu, Jan 29, 2015 at 02:42:31AM -0500, Robert Simmons wrote:
> On Thu, Jan 29, 2015 at 2:29 AM, Roland Smith <[email protected]> wrote:
> > On Thu, Jan 29, 2015 at 01:38:21AM -0500, Robert Simmons wrote:
> >> I'm having a unicode problem on FreeBSD lang/python34 that does not
> >> appear on MacOS X. I've condensed the problem to one single line to
> >> enter in the interpreter:
> >>
> >> FreeBSD:
> >> Python 3.4.2 (default, Jan 28 2015, 22:23:57)
> >> [GCC 4.2.1 Compatible FreeBSD Clang 3.4.1 (tags/RELEASE_34/dot1-final
> >> 208032)] on freebsd10
> >> Type "help", "copyright", "credits" or "license" for more information.
> >> >>> b'\xc3\xa2'.decode('utf-8')
> >> '\xe2'
> >>
> >> MacOS X:
> >> Python 3.4.2 (default, Oct 19 2014, 17:55:38)
> >> [GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.54)] on darwin
> >> Type "help", "copyright", "credits" or "license" for more information.
> >> >>> b'\xc3\xa2'.decode('utf-8')
> >> 'â'
> >>
> >> Why is Python on FreeBSD incorrectly decoding this?
> >
> > Works fine here (FreeBSD 10.1-STABLE #0 r276653 amd64):
> >
> > Python 3.4.2 (default, Nov 4 2014, 19:34:48)
> > [GCC 4.2.1 Compatible FreeBSD Clang 3.4.1 (tags/RELEASE_34/dot1-final
> > 208032)] on freebsd10
> > Type "help", "copyright", "credits" or "license" for more information.
> > >>> b'\xc3\xa2'.decode('utf-8')
> > 'â'
(please don't top-post)
> What is the output from print(sys.stdout.encoding) on your system?
Python 3.4.2 (default, Nov 4 2014, 19:34:48)
[GCC 4.2.1 Compatible FreeBSD Clang 3.4.1 (tags/RELEASE_34/dot1-final
208032)] on freebsd10
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> print(sys.stdout.encoding)
UTF-8
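
As a rough illustration of why the two transcripts differ, here is a minimal sketch. It assumes the interactive prompt falls back to backslash escapes when the repr cannot be encoded in sys.stdout.encoding; the helper name shown_as is made up for this example. In both cases the decoded string is the same single code point, U+00E2, so only the display differs.

```python
import sys

decoded = b'\xc3\xa2'.decode('utf-8')

# The decode itself is platform independent: one code point, U+00E2.
assert decoded == '\xe2'
assert len(decoded) == 1


def shown_as(text, encoding):
    """Approximate what the prompt displays for repr(text) when stdout
    uses `encoding` (hypothetical helper, for illustration only)."""
    # Characters the encoding cannot represent become backslash escapes.
    raw = repr(text).encode(encoding, errors='backslashreplace')
    return raw.decode(encoding)


print(sys.stdout.encoding)          # the encoding this interpreter picked
print(shown_as(decoded, 'ascii'))   # prints '\xe2'
```

With 'utf-8' instead of 'ascii', shown_as returns the repr unchanged, which would match the transcripts where the accented character appears.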
> And, can you explain how to change that on mine so that it is UTF-8?
> Mine is a default fresh install, btw.
In /etc/login.conf, I set LC_ALL=en_US.UTF-8:
default:\
	:passwd_format=sha512:\
	:copyright=/etc/COPYRIGHT:\
	:welcome=/etc/motd:\
	:setenv=MAIL=/var/mail/$,BLOCKSIZE=K,LC_ALL=en_US.UTF-8:\
	:path=/sbin /bin /usr/sbin /usr/bin /usr/games /usr/local/sbin /usr/local/bin
And I use a unicode aware X terminal (rxvt-unicode).
In case you're not using X11, the new vt(4) device uses UTF-8, but the old
sc(4) doesn't support it at all, AFAIK.
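
To check whether the login.conf change has taken effect, a small script along these lines may help. It is only a sketch: locale_report is a hypothetical helper, and it assumes the capability database was rebuilt with cap_mkdb /etc/login.conf and that you logged in again after editing the file.

```python
import locale
import os
import sys


def locale_report(environ=os.environ):
    """Collect the settings that influence Python's terminal encoding
    (hypothetical helper, for illustration only)."""
    return {
        'LC_ALL': environ.get('LC_ALL'),
        'LC_CTYPE': environ.get('LC_CTYPE'),
        'LANG': environ.get('LANG'),
        # Encoding derived from the locale, without changing the locale.
        'preferred': locale.getpreferredencoding(False),
        # Encoding Python actually chose for standard output.
        'stdout': sys.stdout.encoding,
    }


for name, value in locale_report().items():
    print('{:10} {}'.format(name, value))
```

If LC_ALL shows en_US.UTF-8 but the stdout entry is still not UTF-8, the terminal or the way the interpreter was started is the more likely culprit.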
Roland
--
R.F.Smith http://rsmith.home.xs4all.nl/
[plain text _non-HTML_ PGP/GnuPG encrypted/signed email much appreciated]
pgp: 5753 3324 1661 B0FE 8D93 FCED 40F6 D5DC A38A 33E0 (keyID: A38A33E0)
