Re: Why do Perl programmers make more money than Python programmers
On 6 May, 09:49, Fábio Santos wrote:
> On 6 May 2013 08:34, "Chris Angelico" wrote:
>
> > Well you see, it was 70 bytes back in the Python 2 days (I'll defer to
> > Steven for data points earlier than that), but with Python 3, there
> > were two versions: one was 140 bytes representing 70 characters, the
> > other 280 bytes representing 70 characters. In Python 3.3, they were
> > merged, and a trivial amount of overhead added, so now it's 80 bytes
> > representing 70 characters. But you have an absolute guarantee that
> > it's correct now.
>
> > Of course, the entire code can be represented as a single int now. You
> > used to have to use a long.
>
> > ChrisA
> > --
>
> Thanks. You have made my day.
>
> I may raise the average pay of a Python programmer in Portugal. I have asked
> for a raise back in December, and was told that it wouldn't happen before
> this year. I have done well. I think I deserve better pay than a
> supermarket employee now. I am sure that my efforts were appreciated and I
> will be rewarded. I am being sarcastic.
>
> The above paragraph wouldn't be true if I programmed in perl, c++ or lisp.
-
1) The memory gain, for many of us (usually non-ASCII users),
simply becomes irrelevant.
>>> import sys
>>> sys.getsizeof('maçã')
41
>>> sys.getsizeof('abcd')
29
2) More critically, Py 3.3 simply becomes non-Unicode-compliant
(e.g. for European languages or "ascii" typographers!)
>>> import timeit
>>> timeit.timeit("'abcd'*1000 + 'a'")
2.186670111428325
>>> timeit.timeit("'abcd'*1000 + '€'")
2.9951699820528432
>>> timeit.timeit("'abcd'*1000 + 'œ'")
3.0036780444886233
>>> timeit.timeit("'abcd'*1000 + 'ẞ'")
3.004992278824048
>>> timeit.timeit("'maçã'*1000 + 'œ'")
3.231025618708202
>>> timeit.timeit("'maçã'*1000 + '€'")
3.215894398100758
>>> timeit.timeit("'maçã'*1000 + 'œ'")
3.224407974255655
>>> timeit.timeit("'maçã'*1000 + '’'")
3.2206342273566406
>>> timeit.timeit("'abcd'*1000 + '’'")
2.991440344906
3) Python is "proud" to cover the whole Unicode range;
unfortunately, it "breaks" the BMP range.
A small GvR example (ASCII) from the bug list,
but with non-ASCII characters.
# Py 3.2, all chars
>>> timeit.repeat("a = 'hundred'; 'x' in a")
[0.09087790617297742, 0.07456871885972305, 0.07449940353376405]
>>> timeit.repeat("a = 'maçãé€ẞ'; 'x' in a")
[0.10088136800095526, 0.07488497003487282, 0.07497594640028638]
# Py 3.3 ascii and non ascii chars
>>> timeit.repeat("a = 'hundred'; 'x' in a")
[0.11426985953005442, 0.10040049292649655, 0.09920834808588097]
>>> timeit.repeat("a = 'maçãé€ẞ'; 'é' in a")
[0.2345595188256766, 0.21637172864154763, 0.2179096624382737]
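The storage side of points 1) and 3) can be checked directly. This is only a minimal sketch, assuming CPython 3.3 or later (where PEP 393 applies); the helper name bytes_per_char is made up for the illustration, and the numbers are an implementation detail, not a language guarantee.

```python
import sys

def bytes_per_char(ch, n=1000):
    """Estimate the storage used per character (hypothetical helper).

    Two strings made of the same character are compared, so the fixed
    object header cancels out and only the per-character cost remains.
    """
    short = ch * n
    longer = ch * (2 * n)
    return (sys.getsizeof(longer) - sys.getsizeof(short)) / n

samples = [
    ("ASCII", "a"),
    ("Latin-1", "\xe9"),          # é
    ("BMP", "\u20ac"),            # €
    ("non-BMP", "\U0001d11e"),    # musical symbol G clef
]
for label, ch in samples:
    print(label, bytes_per_char(ch))
```

A single character outside Latin-1 is enough to move the whole string to the wider storage, which is what the timings above are probing.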
There are plenty of good reasons to use Python. There are
also plenty of good reasons to not use (or now to drop)
Python and to realize that if you wish to process text
seriously, you are better served by using "corporate
products" or tools using Unicode properly.
jmf
--
http://mail.python.org/mailman/listinfo/python-list
Re: Unicode humor
On 8 May, 15:19, Roy Smith wrote:
> Apropos to any of the myriad unicode threads that have been going on
> recently:
>
> http://xkcd.com/1209/
--
This reflects a lack of understanding of Unicode.
jmf
Re: PDF generator decision
On 14 May, 17:05, Christian Jurk wrote:
> Hi folks,
>
> This question may have been asked several times already, but the development of
> relevant software continues day by day. For some time now I've been using
> xhtml2pdf [1] to generate PDF documents from HTML templates (which are
> rendered through my Django-based web application). This has been working for
> some time now but I'm constantly adding new templates and they are not
> looking like I want them to (sometimes bold text is bold, sometimes not, layout
> issues, etc). I'd like to use something other than xhtml2pdf.
>
> So far I'd like to ask which is the (probably) best way to create PDFs in
> Python (3)? It is important for me that I am able to specify not only
> background graphics, paragraphs, tables and so on but also to specify page
> headers/footers. The reason is that I have a bunch of documents to be
> generated (including Invoice templates, Quotes - stuff like that).
>
> Any advice is welcome. Thanks.
>
> [1] https://github.com/chrisglass/xhtml2pdf
-
1) Use Python to collect your data (db, pictures, texts, ...) and/or
to create the material (text, graphics, ...) that will be the
contents (source) of your PDFs.
2) Put this source in a .tex file (a plain text file).
3) Let it compile with a TeX engine.
-
I cannot think of anything more versatile and basically simple
(writing a text file).
-
Do not forget you are the only one who knows the content and the
layout of your document(s).
jmf
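The three steps above can be sketched roughly as follows. Everything here is an assumption made for illustration: the template, the helper names (tex_escape, build_tex, compile_pdf), the invoice data, and the presence of a TeX distribution providing pdflatex on the PATH.

```python
import subprocess
from string import Template

# Hypothetical minimal template; a real invoice would need much more layout.
TEX_TEMPLATE = Template(r"""\documentclass{article}
\begin{document}
\section*{Invoice $number}
\begin{tabular}{lr}
$rows
\end{tabular}
\end{document}
""")

def tex_escape(text):
    """Escape a few TeX special characters (deliberately not exhaustive)."""
    for ch in "&%$#_":
        text = text.replace(ch, "\\" + ch)
    return text

def build_tex(number, items):
    """Steps 1 and 2: turn Python data into the text of a .tex file."""
    rows = "\n".join(
        "{} & {:.2f} \\\\".format(tex_escape(name), price)
        for name, price in items
    )
    return TEX_TEMPLATE.substitute(number=number, rows=rows)

def compile_pdf(tex_path):
    """Step 3: let a TeX engine compile the file (assumes pdflatex exists)."""
    subprocess.check_call(["pdflatex", "-interaction=nonstopmode", tex_path])

source = build_tex("2013-001", [("R&D work", 100), ("Hosting", 12.5)])
print(source)
```

Writing `source` to a file and passing its path to compile_pdf() would complete the chain; page headers and footers would be handled on the TeX side.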
Re: Diacretical incensitive search
The handling of diacriticals is especially a nice case
study. One can use it to toy with some specific features of
Unicode, normalisation, decomposition, ...
... and also to show how Unicode can be badly implemented.
First and quick example that came to my mind (Py325 and Py332):
# Py 3.2.5
>>> timeit.repeat("ud.normalize('NFKC', ud.normalize('NFKD', 'ᶑḗḖḕḹ'))",
...               "import unicodedata as ud")
[2.929404406789672, 2.923327801150208, 2.923659417064755]
# Py 3.3.2
>>> timeit.repeat("ud.normalize('NFKC', ud.normalize('NFKD', 'ᶑḗḖḕḹ'))",
...               "import unicodedata as ud")
[3.8437222586746884, 3.829490737203514, 3.819266963414293]
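For the search itself, a minimal sketch of a diacritic-insensitive comparison built on the same normalisation calls. The helper names strip_marks and contains_insensitive are made up here, and the approach only covers characters that have a canonical or compatibility decomposition (a letter such as 'ø' is left untouched).

```python
import unicodedata

def strip_marks(text):
    """Decompose the text (NFKD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))

def contains_insensitive(haystack, needle):
    """Substring test ignoring diacritics and case (hypothetical helper)."""
    return strip_marks(needle).casefold() in strip_marks(haystack).casefold()

print(strip_marks("ma\xe7\xe3"))                                  # maçã
print(contains_insensitive("Un \xe9l\xe9phant rose", "ELEPHANT"))
```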
jmf
Re: Harmonic distortion of a input signal
Nonsense.
The discrete FFT algorithm is valid only if the number of data
points you transform corresponds to a power of 2 (2**n).
Keywords to the problem: apodization, zero filling, convolution
product, ...
e.g. http://en.wikipedia.org/wiki/Convolution
jmf
Re: Harmonic distortion of a input signal
On 20 May, 19:56, Christian Gollwitzer wrote:
> Oops, I thought we were posting to comp.dsp. Nevertheless, I think
> numpy.fft does mixed-radix (can't check it now)
>
> On 20.05.13 19:50, Christian Gollwitzer wrote:
>
> > On 20.05.13 19:23, jmfauth wrote:
> >> Nonsense.
>
> > Ditto.
>
> >> The discrete fft algorithm is valid only if the number of data
> >> points you transform does correspond to a power of 2 (2**n).
>
> > Where did you get this? The DFT is defined for any integer point number
> > the same way.
>
> > Just if you want to get it fast, you need to worry about the length. For
> > powers of two, there is the classic Cooley-Tukey. But there do exist FFT
> > algorithms for any other length. For example, there is the Winograd
> > transform for a set of small numbers, there is "mixed-radix" to reduce
> > any length which can be factored, and there is finally Bluestein which
> > works for any size, even for a prime. All of the aforementioned
> > algorithms are O(n log n) and are implemented in typical FFT packages. All
> > of them should result (up to rounding differences) in the same thing as
> > the naive DFT sum. Therefore, today
>
> >> Keywords to the problem: apodization, zero filling, convolution
> >> product, ...
>
> > Not for a periodic signal of integer length.
>
> >> eg.http://en.wikipedia.org/wiki/Convolution
>
> > How long do you read this group?
>
> > Christian
--
Forget what I wrote.
I understand what I wanted to say; it was badly formulated.
jmf
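The point that the DFT is defined for any length can be checked with the naive sum. This is only an O(n**2) sketch in pure Python for illustration, not how an FFT package computes it; the function names dft and idft and the sample signal are made up here.

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform, valid for any length n."""
    n = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * cmath.pi * m * k / n) for k in range(n))
        for m in range(n)
    ]

def idft(X):
    """Naive inverse transform (same sum, opposite sign, divided by n)."""
    n = len(X)
    return [
        sum(X[m] * cmath.exp(2j * cmath.pi * m * k / n) for m in range(n)) / n
        for k in range(n)
    ]

signal = [1.0, 2.0, 0.0, -1.0, 3.0]      # length 5, not a power of two
restored = idft(dft(signal))
print([round(v.real, 9) for v in restored])
```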
Re: How to get an integer from a sequence of bytes
On 30 May, 20:42, Ian Kelly wrote:
> On Thu, May 30, 2013 at 12:26 PM, Mok-Kong Shen
>
> wrote:
> > Am 27.05.2013 17:30, schrieb Ned Batchelder:
>
> >> On 5/27/2013 10:45 AM, Mok-Kong Shen wrote:
>
> >>> From an int one can use to_bytes to get its individual bytes,
> >>> but how can one reconstruct the int from the sequence of bytes?
>
> >> The next thing in the docs after int.to_bytes is int.from_bytes:
> >>http://docs.python.org/3.3/library/stdtypes.html#int.from_bytes
>
> > I am sorry to have overlooked that. But one thing I yet wonder is why
> > there is no direct possibilty of converting a byte to an int in [0,255],
> > i.e. with a constrct int(b), where b is a byte.
>
> The bytes object can be viewed as a sequence of ints. So if b is a
> bytes object of non-zero length, then b[0] is an int in range(0, 256).
Well, Python now "speaks" only "integer"; the rest is
convenience, and it is all quite coherent.
>>> bin(255)
'0b11111111'
>>> oct(255)
'0o377'
>>> 255
255
>>> hex(255)
'0xff'
>>>
>>> int('0b11111111', 2)
255
>>> int('0o377', 8)
255
>>> int('255')
255
>>> int('0xff', 16)
255
>>>
>>> 0b11111111
255
>>> 0o377
255
>>> 255
255
>>> 0xff
255
>>>
>>> type(0b11111111)
<class 'int'>
>>> type(0o377)
<class 'int'>
>>> type(255)
<class 'int'>
>>> type(0xff)
<class 'int'>
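For the original question (bytes to int and back), a short sketch using int.from_bytes and int.to_bytes as mentioned in the quoted replies. The sample bytes are arbitrary, and Python 3.2 or later is assumed.

```python
data = bytes([0x01, 0x00, 0xFF])

# Indexing or iterating a bytes object already yields ints in range(0, 256).
first = data[0]
as_ints = list(data)

# Several bytes combined into one int; the byte order must be given.
big = int.from_bytes(data, "big")        # 0x0100FF
little = int.from_bytes(data, "little")  # 0xFF0001

# And back again, with an explicit length.
back = big.to_bytes(3, "big")

print(first, as_ints, big, little, back)
```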
jmf
Re: python b'...' notation
On 31 May, 00:19, alcyon wrote:
> On Wednesday, May 29, 2013 3:19:42 PM UTC-7, Cameron Simpson wrote:
> > On 29May2013 13:14, Ian Kelly wrote:
>
> > | On Wed, May 29, 2013 at 12:33 PM, alcyon wrote:
>
> > | > This notation displays hex values except when they are 'printable', in
> > which case it displays that printable character. How do I get it to force
> > hex for all bytes? Thanks, Steve
>
> > |
>
> > | Is this what you want?
>
> > |
>
> > | >>> ''.join('%02x' % x for x in b'hello world')
>
> > | '68656c6c6f20776f726c64'
>
> > Not to forget binascii.hexlify.
>
> > --
>
> > Cameron Simpson
>
> > Every particle continues in its state of rest or uniform motion in a
> > straight
>
> > line except insofar as it doesn't. - Sir Arther Eddington
>
> Thanks for the binascii.hexlify tip. I was able to make it work but I did
> have to write a function to get it exactly the string I wanted. I wanted,
> for example, to display as <0x0A 0x00> or to
> display as <0x21 0xFF 0x28 0xC0>.
>>> a = b'!\xff(\xc0\n\x00'
>>> z = ['0x{:02X}'.format(c) for c in a]
>>> z
['0x21', '0xFF', '0x28', '0xC0', '0x0A', '0x00']
>>> s = ' '.join(z)
>>> s
'0x21 0xFF 0x28 0xC0 0x0A 0x00'
>>>
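Wrapped as a function, and compared with binascii.hexlify from the quoted tip. The helper name hex_dump and the angle-bracket wrapping are only a guess at the display format wanted in the quoted message.

```python
import binascii

def hex_dump(data):
    """Format every byte as 0xNN, printable or not (hypothetical helper)."""
    return "<" + " ".join("0x{:02X}".format(byte) for byte in data) + ">"

print(hex_dump(b"!\xff(\xc0"))
print(hex_dump(b"\n\x00"))
print(binascii.hexlify(b"\n\x00"))   # compact form, no separators
```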
jmf
Re: PyWart: The problem with "print"
On 2 June, 20:09, Rick Johnson wrote:
>
> I never purposely inject ANY superfluous cycles in my code except in
> the case of testing or development. To me it's about professionalism.
> Let's consider a thought exercise shall we?
>
The flexible string representation is the perfect example
of this lack of professionalism.
Wrong by design: a lack of understanding of the mathematical logic,
of the coding of characters, of Unicode and of the usage of
characters (everything is tied together).
How is it possible to arrive at such a situation?
The answer is far beyond my understanding (although
I have my opinion on the subject).
jmf
Re: Changing filenames from Greeklish => Greek (subprocess complain)
On 5 June, 19:43, Νικόλαος Κούρας wrote:
> On Wednesday, 5 June 2013 8:56:36 AM UTC+3, user Steven D'Aprano
> wrote:
>
> Somehow, I don't know how because I didn't see it happen, you have one or
> more files in that directory where the file name as bytes is invalid when
> decoded as UTF-8, but your system is set to use UTF-8. So to fix this you
> need to rename the file using some tool that doesn't care quite so much
> about encodings. Use the bash command line to rename each file in turn
> until the problem goes away.
>
> But renaming via shell access like 'mv 'Euxi tou Ihsou.mp3' 'Ευχή του
> Ιησου.mp3' leads to that unknown encoding of this bytestream
> '\305\365\367\336\ \364\357\365\ \311\347\363\357\375.mp3'
>
> But please tell me Steven what linux tool you think it can encode the weird
> filename to proper 'Ευχή του Ιησου.mp3' utf-8?
>
> or we can write a script as i suggested to decode back the bytestream using
> all sorts of available decode charsets boiling down to the original greek
> letters.
---
See http://bugs.python.org/issue13643, msg149949,
author: Antoine Pitrou (pitrou).
Quote:
So, you're complaining about something which works, kind of:
$ touch héhé
$ LANG=C python3 -c "import os; print(os.listdir())"
['h\udcc3\udca9h\udcc3\udca9']
> This makes robustly working with non-ascii filenames on different
> platforms needlessly annoying, given no modern nix should have problems
> just using UTF-8 in these cases.
So why don't these supposedly "modern" systems at least set the
appropriate environment variables for Python to infer the proper
character encoding?
(since these "modern" systems don't have a well-defined encoding...)
Answer: because they are not modern at all, they are antiquated,
inadapted and obsolete pieces of software designed and written by
clueless Anglo-American people. Please report bugs against these
systems. The culprit is not Python, it's the Unix crap and the utterly
clueless attitude of its maintainers ("filesystems are just bytes",
yeah, whatever...).
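What the listdir() output in the quote shows is the "surrogateescape" error handler at work: undecodable bytes are smuggled through as lone surrogates and can be turned back into the original bytes. A minimal sketch; the last part assumes that the octal bytestream quoted earlier in the thread is ISO-8859-7, which is a guess based on the byte values.

```python
# The file name as UTF-8 bytes, decoded by a Python that believes in ASCII.
raw = "h\u00e9h\u00e9".encode("utf-8")
seen = raw.decode("ascii", "surrogateescape")
print(ascii(seen))

# Nothing is lost: the same handler restores the original bytes.
restored = seen.encode("ascii", "surrogateescape")

# Assumption: the start of the quoted bytestream is ISO-8859-7 (Greek).
legacy = b"\305\365\367\336"
name = legacy.decode("iso8859-7")
utf8_name = name.encode("utf-8")
print(ascii(name), utf8_name)
```

os.fsencode() and os.fsdecode() apply the same mechanism using the file system encoding.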
jmf
Re: Changing filenames from Greeklish => Greek (subprocess complain)
-
A coding scheme works with three sets: a *unique* set
of CHARACTERS, a *unique* set of CODE POINTS and a *unique*
set of ENCODED CODE POINTS, Unicode or not.
The relation between the set of characters and the set of
code points is a *human* table, created with a sheet of paper
and a pencil: a deliberate choice of characters with integers
as "labels".
The relation between the set of code points and the
set of encoded code points is a "mathematical" operation.
In the case of an "8-bit" coding scheme, like iso-XXX,
this operation is a no-op; the relation is an identity.
In short: set of code points == set of encoded code points.
In the case of Unicode, the Unicode Consortium endorses
three such mathematical operations called UTF-8, UTF-16 and
UTF-32, where UTF means Unicode Transformation Format, a
confusing wording meaning at the same time the process
and the result of the process.
This Unicode Transformation does not produce bytes; it produces
words/chunks/tokens of *bits* with lengths 8, 16, 32, called
Unicode Transformation Units (hence the names UTF-8,
-16, -32). At this level, only a structure has been defined
(there is no computing).
Very important: a healthy coding scheme works conceptually
only with this *unique* set of encoded code points, not
with bytes, characters or code points.
The last step, the machine implementation: it is up
to the processor, the compiler, the language to implement
all these Unicode Transformation Units, with of course their
related specificities: char, w_char, int, long, endianness,
rune (Go language), ...
Not too over-simplified, not too over-complicated, and
enough to understand one, if not THE, design mistake
of the flexible string representation.
jmf
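The three sets can be looked at from Python. A small sketch: ord() gives the code point, and the length of each encoded form, divided by the unit size, gives the number of units per character. The helper name code_units and the sample characters are chosen here only for illustration.

```python
def code_units(ch):
    """Number of 8-, 16- and 32-bit units used by UTF-8, UTF-16, UTF-32."""
    return (
        len(ch.encode("utf-8")),
        len(ch.encode("utf-16-be")) // 2,
        len(ch.encode("utf-32-be")) // 4,
    )

for ch in "a\xe9\u20ac\U0001d11e":       # a, é, €, G clef
    print("U+{:04X}".format(ord(ch)), code_units(ch))

# For an 8-bit scheme the relation is an identity: code point == encoded value.
latin1_identity = "\xe9".encode("latin-1") == bytes([ord("\xe9")])
print(latin1_identity)
```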
Re: A few questiosn about encoding
--
UTF-8, Unicode (consortium): 1 to 4 *Unicode Transformation Units*
UTF-8, ISO 10646: 1 to 6 *Unicode Transformation Units*
(still current, unless really freshly modified)
jmf
Re: Python equivalent to the "A" or "a" output conversions in C
On Jun 19, 9:54 pm, "Edward C. Jones" wrote:
> On 06/19/2012 12:41 PM, Hemanth H.M wrote:
>
> > >>> float.hex(x)
> > '0x1.5p+3'
>
> Some days I don't ask the brightest questions. Suppose x was a numpy
> floating scalar (types numpy.float16, numpy.float32, numpy.float64, or
> numpy.float128). Is there an easy way to write x in
> binary or hex?
I'm not aware of a builtin function.
Maybe the struct module ("Interpret bytes as packed binary data")
can help.
jmf
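A sketch of the struct idea with plain Python floats. The helper names are made up, bytes.hex() needs Python 3.5 or later, and applying this to a numpy scalar would first need a conversion such as float(x), which is an assumption on my side, not something tested here.

```python
import struct

def float_bits_hex(x, fmt=">d"):
    """Hex of the packed IEEE 754 bytes; '>d' is a big-endian double."""
    return struct.pack(fmt, x).hex()

def float_bits_bin(x, fmt=">d"):
    """The same bits written in binary, zero-padded to the full width."""
    packed = struct.pack(fmt, x)
    return format(int.from_bytes(packed, "big"), "0{}b".format(8 * len(packed)))

print(float_bits_hex(1.5))          # 64-bit double
print(float_bits_hex(1.5, ">f"))    # 32-bit single
print(float_bits_bin(1.5, ">f"))
```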
Re: Py3.3 unicode literal and input()
On Jun 20, 1:21 am, Steven D'Aprano wrote:
> On Mon, 18 Jun 2012 07:00:01 -0700, jmfauth wrote:
> > On 18 June, 12:11, Steven D'Aprano wrote:
> >> On Mon, 18 Jun 2012 02:30:50 -0700, jmfauth wrote:
> >> > On 18 June, 10:28, Benjamin Kaplan wrote:
> >> >> The u prefix is only there to
> >> >> make it easier to port a codebase from Python 2 to Python 3. It
> >> >> doesn't actually do anything.
>
> >> > It does. I shew it!
>
> >> Incorrect. You are assuming that Python 3 input eval's the input like
> >> Python 2 does. That is wrong. All you show is that the one-character
> >> string "a" is not equal to the four-character string "u'a'", which is
> >> hardly a surprise. You wouldn't expect the string "3" to equal the
> >> string "int('3')" would you?
>
> >> --
> >> Steven
>
> > A string is a string, a "piece of text", period.
>
> > I do not see why a unicode literal and an (well, I do not know how the
> > call it) a "normal class " should behave differently in code source
> > or as an answer to an input().
>
> They do not. As you showed earlier, in Python 3.3 the literal strings
> u'a' and 'a' have the same meaning: both create a one-character string
> containing the Unicode letter LOWERCASE-A.
>
> Note carefully that the quotation marks are not part of the string. They
> are delimiters. Python 3.3 allows you to create a string by using
> delimiters:
>
> ' '
> " "
> u' '
> u" "
>
> plus triple-quoted versions of the same. The delimiter is not part of the
> string. They are only there to mark the start and end of the string in
> source code so that Python can tell the difference between the string "a"
> and the variable named "a".
>
> Note carefully that quotation marks can exist inside strings:
>
> my_string = "This string has 'quotation marks'."
>
> The " at the start and end of the string literal are delimiters, not part
> of the string, but the internal ' characters *are* part of the string.
>
> When you read data from a file, or from the keyboard using input(),
> Python takes the data and returns a string. You don't need to enter
> delimiters, because there is no confusion between a string (all data you
> read) and other programming tokens.
>
> For example:
>
> py> s = input("Enter a string: ")
> Enter a string: 42
> py> print(s, type(s))
> 42 <class 'str'>
>
> Because what I type is automatically a string, I don't need to enclose it
> in quotation marks to distinguish it from the integer 42.
>
> py> s = input("Enter a string: ")
> Enter a string: This string has 'quotation marks'.
> py> print(s, type(s))
> This string has 'quotation marks'. <class 'str'>
>
> What you type is exactly what you get, no more, no less.
>
> If you type 42, you get the two character string "42" and not the int 42.
>
> If you type [1, 2, 3], then you get the nine character string "[1, 2, 3]"
> and not a list containing integers 1, 2 and 3.
>
> If you type 3**0.5 then you get the six character string "3**0.5" and not
> the float 1.7320508075688772.
>
> If you type u'a' then you get the four character string "u'a'" and not
> the single character 'a'.
>
> There is nothing new going on here. The behaviour of input() in Python 3,
> and raw_input() in Python 2, has not changed.
>
> > Should a user write two derived functions?
>
> > input_for_entering_text()
> > and
> > input_if_you_are_entering_a_text_as_litteral()
>
> If you, the programmer, want to force the user to write input in Python
> syntax, then yes, you have to write a function to do so. input() is very
> simple: it just reads strings exactly as typed. It is up to you to
> process those strings however you wish.
>
> --
> Steven
Python 3.3.0a4 (v3.3.0a4:7c51388a3aa7+, May 31 2012, 20:15:21)
[MSC v. 1600 32 bit (Intel)] on win32
>>> --- running smidzero.py...
...smidzero has been executed
>>> --- input(':')
:éléphant
'éléphant'
>>> --- input(':')
:u'éléphant'
'éléphant'
>>> --- input(':')
:u'\u00e9l\xe9phant'
'éléphant'
>>> --- input(':')
:u'\U000000e9léphant'
'éléphant'
>>> --- input(':')
:\U000000e9léphant
'éléphant'
>>> ---
>>> --- # this is expected
>>> --- input(':')
:b'éléphant'
"b'éléphant'"
>>> --- len(input(':'))
:b'éléphant'
11
---
Good news on the ru''/ur'' front:
http://bugs.python.org/issue15096
---
Finally, I'm just wondering if this unicode_literal
reintroduction is not a bad idea.
b'these_are_bytes'
u'this_is_a_unicode_string'
I have written all my Py2 code in a "unicode mode" since ... Py2.3 (?).
jmf
Re: Py3.3 unicode literal and input()
On Jun 20, 11:22 am, Christian Heimes wrote:
> On 18.06.2012 20:45, Terry Reedy wrote:
>
> > The simultaneous reintroduction of 'ur', but with a different meaning
> > than in 2.7, *was* a problem and it should be removed in the next release.
>
> FYI: http://hg.python.org/cpython/rev/8e47e9af826e
>
> Christian
I saw this, not the latest version.
Anyway, thanks for the info.
jmf
Re: Py3.3 unicode literal and input()
Mea culpa. I did not have my head on my shoulders.
Input is working fine; it returns "text" correctly.
However, and this is something different, I'm a little
bit surprised that input() does not handle escaped characters (\u, \U).
Workaround: encode() and decode() as "raw-unicode-escape".
jmf
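The workaround can be sketched as follows. The helper name interpret_u_escapes is made up; the raw string stands in for what input() would return if the user typed the backslashes literally. This codec only handles \u and \U, not \x.

```python
def interpret_u_escapes(typed):
    """Turn literally typed \\uXXXX / \\UXXXXXXXX sequences into characters."""
    return typed.encode("raw_unicode_escape").decode("raw_unicode_escape")

typed = r"\u00e9l\u00e9phant"        # 18 characters, as typed by the user
text = interpret_u_escapes(typed)
print(len(typed), len(text), ascii(text))
```

Text that contains no escape sequences passes through unchanged, so the function could be applied to every answer returned by input().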
Re: changing sys.path
On 1 Feb, 17:15, Andrea Crotti wrote:
> So suppose I want to modify the sys.path on the fly before running some code
> which imports from one of the modules added.
>
> at run time I do
> sys.path.extend(paths_to_add)
>
> but it still doesn't work and I get an import error.
>
> If I take these paths and add them to site-packages/my_paths.pth
> everything works, but at run-time the paths which I actually see before
> importing are exactly the same.
>
> So there is something I guess that depends on the order, but what can I
> reset/reload to make these paths available (I thought I didn't need
> anything in theory)?
>>> import mod
Traceback (most recent call last):
File "", line 1, in
ImportError: No module named mod
>>> import sys
>>> sys.path.append(r'd:\\jm\\junk')
>>> import mod
>>> mod
>>> mod.hello()
fct hello in mod.py
sys.path? Probably the most ingenious Python idea.
jmf
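The same experiment as a self-contained script, a sketch only: the directory is a temporary one and the module name jmf_demo_mod is invented for the example. The call to importlib.invalidate_caches() (Python 3.3+) is there because the module file is created while the interpreter is already running.

```python
import importlib
import os
import sys
import tempfile

# Create a throwaway directory containing one small module.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "jmf_demo_mod.py"), "w") as f:
    f.write("def hello():\n    return 'fct hello in mod.py'\n")

# Extend sys.path at run time, *before* the import statement is executed.
sys.path.append(module_dir)
importlib.invalidate_caches()

import jmf_demo_mod

print(jmf_demo_mod.hello())
```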
Re: changing sys.path
On 2 Feb, 11:03, Andrea Crotti wrote:
> On 02/02/2012 12:51 AM, Steven D'Aprano wrote:
>
>
>
> > On Wed, 01 Feb 2012 17:47:22 +, Andrea Crotti wrote:
>
> >> Yes they are exactly the same, because in that file I just write exactly
> >> the same list,
> >> but when modifying it at run-time it doesn't work, while if at the
> >> application start
> >> there is this file everything works correctly...
>
> >> That's what really puzzles me.. What could that be then?
>
> > Are you using IDLE or WingIDE or some other IDE which may not be
> > honouring sys.path? If so, that's a BAD bug in the IDE.
> > Are you changing the working directory manually, by calling os.chdir? If
> > so, that could be interfering with the import somehow. It shouldn't, but
> > you never know...
>
> > Are you adding absolute paths or relative paths?
>
> No, no and absolute paths..
>
>
>
> > You say that you get an ImportError, but that covers a lot of things
> > going wrong. Here's a story. Could it be correct? I can't tell because
> > you haven't posted the traceback.
>
> > When you set site-packages/my_paths.pth you get a sys path that looks
> > like ['a', 'b', 'fe', 'fi', 'fo', 'fum']. You then call "import spam"
> > which locates b/spam.py and everything works.
>
> > But when you call sys.path.extend(['a', 'b']) you get a path that looks
> > like ['fe', 'fi', 'fo', 'fum', 'a', 'b']. Calling "import spam" locates
> > some left over junk file, fi/spam.py or fi/spam.pyc, which doesn't
> > import, and you get an ImportError.
>
> And no the problem is not that I already checked inspecting at run-time..
> This is the traceback and it might be related to the fact that it runs
> from the
> .exe wrapper generated by setuptools:
>
> Traceback (most recent call last):
> File "c:\python25\scripts\dev_main-script.py", line 8, in
> load_entry_point('psi.devsonly==0.1', 'console_scripts', 'dev_main')()
> File "h:\git_projs\psi\psi.devsonly\psi\devsonly\bin\dev_main.py",
> line 152, in main
> Develer(ns).full_run()
> File "h:\git_projs\psi\psi.devsonly\psi\devsonly\bin\dev_main.py",
> line 86, in full_run
> run(project_name, test_only=self.ns.test_only)
> File "h:\git_projs\psi\psi.devsonly\psi\devsonly\environment.py",
> line 277, in run
> from psi.devsonly.run import Runner
> File "h:\git_projs\psi\psi.devsonly\psi\devsonly\run.py", line 7, in
>
> from psi.workbench.api import Workbench, set_new_dev_main
> ImportError: No module named workbench.api
>
> Another thing which might matter is that I'm launching Envisage
> applications, which
> heavily rely on the use of entry points, so I guess that if something is
> not in the path
> the entry point is not loaded automatically (but it can be forced I
> guess somehow).
>
> I solved in another way now, since I also need to keep a dev_main.pth in
> site-packages
> to make Eclipse happy, just respawning the same process on ImportError works
> already perfectly..
There is something strange here. I cannot figure
out how correct code could fail because of sys.path.
It seems to me the lib you are using is somehow not
able to recognize its own structure ("its own sys.path").
An idea: are you sure you are modifying sys.path at
the right place, that is, at the right time, before
Python processes the imports?
I'm using this sys.path tweaking at run time very often,
e.g. to test or to run different versions of the same lib
residing in different dirs, and this in *any* dir and
independently of *any* .pth file.
jmf
Re: Python usage numbers
There is so much to say on the subject, I do not know
where to start. Some points.
Today, Sunday, 12 February 2012, 90%, if not more, of the
Python applications supposed to work with text that I'm
toying with are simply not working. Two reasons:
1) Most of the devs understand nothing, or not enough, of the
field of character coding.
2) In GUI applications, most of the devs understand
nothing, or not enough, of keyboard key/char handling.
---
I have known Python since version 1.5.2 or 1.5.6 (?). Among the
applications I wrote, my fun is in writing GUI interactive
interpreters with Python 2 or 3, tkinter, Tkinter, wxPython,
PySide, PyQt4 on Windows.
Believe it or not, my interactive interpreters are the only
ones where I can enter text and where text is displayed
correctly. IDLE, wxPython/PyShell, DrPython, ... all are
failing. (I do not count console applications.)
Python popularity? I have no popularity-meter. What I know:
I cannot type French text in IDLE on Windows. It has been like
this for ~ten years and I never saw any complaint about
it. (The problem is bad programming.)
Ditto for PyShell in wxPython. I have lost count of the number
of corrections I proposed. In one version, it took me 18 months
until I finally decided to propose a correction. During this
time, I never heard of the problem. (Now, it is broken again.)
---
Is there a way to fix the current situation?
- Yes, and *very easily*.
Will it be fixed?
- No, because there is no willingness to solve it.
---
Roy Smith's quote: "... that we'll all just be
using UTF-32, ..."
Considering PEP 393, Python is not taking this road.
---
How many devs know that one cannot write text in French with
the iso-8859-1 coding? (see PEP 393)
How can one explain that corporations like MS or Apple, with their
cp1252 or mac-roman codings, managed to know this?
Ditto for foundries (Adobe, LinoType, ...)
---
Python is 20 years old. It was developed with ASCII in
mind. Before Python was born, all this stuff was already a
non-problem with Windows and VB.
Going a step further back, before Windows was born, this was a
non-problem at the DOS level (e.g. TurboPascal), 30 years ago!
Design mistake.
---
Python 2 introduced the unicode type. Very nice.
Problem: the introduction of the automatic coercion
ascii-"unicode", which somehow breaks everything.
Very bad design mistake. (In my mind, the biggest one.)
---
One day, I came across, on the web, a very old discussion
about Python related to the introduction of unicode in
Python 2. Something like:
Python core dev (it was VS or AP): "... let's go with ucs-4
and we have no problem in the future ...".
Look at the situation today.
---
And so on.
---
Conclusion.
A Windows programmer is better served by downloading
VB.NET Express. A Windows end user is better served by an
application developed with VB.NET Express.
I find it somehow funny that Python is able to produce this:
>>> (1.1).hex()
'0x1.199999999999ap+0'
>>>
and on the other side, Python and Python applications are not
able to deal correctly with text entering and text
displaying. Probably the two most important tasks a
"computer" has to do!
jmf
PS I'm not a computer scientist, only a computer user.
Re: Python usage numbers
On 13 Feb, 04:09, Terry Reedy wrote:
>
> * The new internal unicode scheme for 3.3 is pretty much a mixture of
> the 3 storage formats (I am of course, skipping some details) by using
> the widest one needed for each string. The advantage is avoiding
> problems with each of the three. The disadvantage is greater internal
> complexity, but that should be hidden from users. They will not need to
> care about the internals. They will be able to forget about 'narrow'
> versus 'wide' builds and the possible requirement to code differently
> for each. There will only be one scheme that works the same on all
> platforms. Most apps should require less space and about the same time.
>
> --
Python 2 was built for ASCII users. Now, Python 3(.3)
is *optimized* for ASCII users.
And the rest of the crowd? Not so sure French users
(among others) who cannot write their texts with
iso-8859-1/latin1 will be very happy.
No doubt, it will work. Is this however the correct
approach?
jmf
Re: format a measurement result and its error in "scientific" way
On 16 Feb, 01:18, Daniel Fetchinson wrote:
> Hi folks, often times in science one expresses a value (say
> 1.03789291) and its error (say 0.00089) in a short way by parentheses
> like so: 1.0379(9)
>
Before swallowing any Python solution, you should
realize that the values (value, error) you are using are
nonsense:
1.03789291 +/- 0.00089
You express "more precision" in the value than
in the error.
---
As an example, in a 1.234(5) notation, the "()" is usually
used to indicate the accuracy of the digit in "()".
E.g. 1.345(7)
Typographically, the "()" is sometimes replaced by
a bold digit or a subscripted digit.
jmf
Re: format a measurement result and its error in "scientific" way
On 17 Feb, 11:03, Daniel Fetchinson wrote:
> >> Hi folks, often times in science one expresses a value (say
> >> 1.03789291) and its error (say 0.00089) in a short way by parentheses
> >> like so: 1.0379(9)
>
> > Before swallowing any Python solution, you should
> > realize, the values (value, error) you are using are
> > a non sense :
>
> > 1.03789291 +/- 0.00089
>
> > You express "more precision" in the value than
> > in the error.
>
> My impression is that you didn't understand the original problem:
> given an arbitrary value to arbitrary digits and an arbitrary error,
> find the relevant number of digits for the value that makes sense for
> the given error. So what you call "non sense" is part of the problem
> to be solved.
>
I do not know where these numbers (value, error) are
coming from. But when the value and the error do not have
the same "precision", there is already something
wrong somewhere.
And this, *prior* to any representation of these
values/numbers.
jmf
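For reference, the formatting asked for in the quoted question can be sketched as below. This is only one possible reading of the problem: the function name format_with_error is made up, the error is reduced to a single significant digit, and only errors strictly between 0 and 1 are handled.

```python
import math

def format_with_error(value, error):
    """Return e.g. '1.0379(9)' for value=1.03789291, error=0.00089."""
    if not 0 < error < 1:
        raise ValueError("this sketch only handles 0 < error < 1")
    # Decimal position of the first significant digit of the error.
    exponent = math.floor(math.log10(error))
    digit = round(error / 10 ** exponent)
    if digit == 10:
        # E.g. 0.00096 rounds up to 0.001: one decimal fewer, digit 1.
        exponent += 1
        digit = 1
    decimals = -exponent
    return "{:.{}f}({})".format(value, decimals, digit)

print(format_with_error(1.03789291, 0.00089))
print(format_with_error(3.14159, 0.002))
```

The value is simply rounded to the last digit that the error can justify, which is the point made in the quoted reply.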
Re: distutils bdist_wininst failure on Linux
On 23 Feb, 15:06, Steven D'Aprano wrote:
> Following instructions here:
>
> http://docs.python.org/py3k/distutils/builtdist.html#creating-windows...
>
> I am trying to create a Windows installer for a pure-module distribution
> using Python 3.2. I get a "LookupError: unknown encoding: mbcs"
>
> Here is the full output of distutils and the traceback:
>
> [steve@ando pyprimes]$ python3.2 setup.py bdist_wininst
> running bdist_wininst
> running build
> running build_py
> creating build/lib
> copying src/pyprimes.py -> build/lib
> installing to build/bdist.linux-i686/wininst
> running install_lib
> creating build/bdist.linux-i686/wininst
> creating build/bdist.linux-i686/wininst/PURELIB
> copying build/lib/pyprimes.py -> build/bdist.linux-i686/wininst/PURELIB
> running install_egg_info
> Writing build/bdist.linux-i686/wininst/PURELIB/pyprimes-0.1.1a-py3.2.egg-info
> creating '/tmp/tmp3utw4_.zip' and adding '.' to it
> adding 'PURELIB/pyprimes.py'
> adding 'PURELIB/pyprimes-0.1.1a-py3.2.egg-info'
> creating dist
> Warning: Can't read registry to find the necessary compiler setting
> Make sure that Python modules winreg, win32api or win32con are installed.
> Traceback (most recent call last):
> File "setup.py", line 60, in
> "License :: OSI Approved :: MIT License",
> File "/usr/local/lib/python3.2/distutils/core.py", line 148, in setup
> dist.run_commands()
> File "/usr/local/lib/python3.2/distutils/dist.py", line 917, in run_commands
> self.run_command(cmd)
> File "/usr/local/lib/python3.2/distutils/dist.py", line 936, in run_command
> cmd_obj.run()
> File "/usr/local/lib/python3.2/distutils/command/bdist_wininst.py", line
> 179, in run
> self.create_exe(arcname, fullname, self.bitmap)
> File "/usr/local/lib/python3.2/distutils/command/bdist_wininst.py", line
> 262, in create_exe
> cfgdata = cfgdata.encode("mbcs")
> LookupError: unknown encoding: mbcs
>
> How do I fix this, and is it a bug in distutils?
>
> --
> Steven
Because the 'mbcs' codec is missing in your Linux, :-)
>>> 'abcéœ€'.encode('cp1252')
b'abc\xe9\x9c\x80'
>>> 'abcéœ€'.encode('missing')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
LookupError: unknown encoding: missing
jmf
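A small sketch of how one might check for a codec before relying on it. The function name `codec_available` is an assumption made for this illustration; `codecs.lookup` is the standard library call that raises LookupError for an unknown encoding.

```python
import codecs


def codec_available(name):
    """Return True if Python knows a codec called `name` on this platform."""
    try:
        codecs.lookup(name)
    except LookupError:
        return False
    return True


# 'mbcs' is a Windows-only codec, so the result depends on the platform.
for name in ('utf-8', 'cp1252', 'mbcs', 'missing'):
    print(name, codec_available(name))
```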
Re: Python math is off by .000000000000045
>>> (2.0).hex()
'0x1.0000000000000p+1'
>>> (4.0).hex()
'0x1.0000000000000p+2'
>>> (1.5).hex()
'0x1.8000000000000p+0'
>>> (1.1).hex()
'0x1.199999999999ap+0'
>>>
jmf
Re: Python math is off by .000000000000045
On 25 fév, 23:51, Steven D'Aprano wrote:
> On Sat, 25 Feb 2012 13:25:37 -0800, jmfauth wrote:
> > >>> (2.0).hex()
> > '0x1.0000000000000p+1'
> > >>> (4.0).hex()
> > '0x1.0000000000000p+2'
> > >>> (1.5).hex()
> > '0x1.8000000000000p+0'
> > >>> (1.1).hex()
> > '0x1.199999999999ap+0'
> >
> > jmf
>
> What's your point? I'm afraid my crystal ball is out of order and I have
> no idea whether you have a question or are just demonstrating your
> mastery of copy and paste from the Python interactive interpreter.
>
It should be enough to indicate the right direction to
casually interested readers.
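To spell out the direction hinted at above, a short sketch: 2.0 and 1.5 have a finite binary expansion, while 1.1 does not and is stored as the nearest double. The helper name `exact_views` is hypothetical.

```python
from decimal import Decimal


def exact_views(x):
    """Return the hexadecimal form and the exact decimal value of a float."""
    return x.hex(), str(Decimal(x))


for x in (2.0, 1.5, 1.1):
    print(x, *exact_views(x))
```

The exact decimal expansion of the stored 1.1 is slightly larger than 1.1, which is where the small discrepancies in float arithmetic come from.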
On u'Unicode string literals' (Py3)
For those who do not know: The u'' string literal trick has never worked in Python 2. >>> sys.version '2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)]' >>> print u'Un oeuf à zéro EURO uro' Un uf à zéro uro >>> jmf -- http://mail.python.org/mailman/listinfo/python-list
Re: On u'Unicode string literals' reintroduction (Py3)
On 29 fév, 14:45, jmfauth wrote:
> For those who do not know:
> The u'' string literal trick has never worked in Python 2.
>
> >>> sys.version
> '2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)]'
> >>> print u'Un oeuf à zéro EURO uro'
> Un uf à zéro uro
>
> jmf
Sorry, I just wanted to show a small example. It seems
Google has "changed" it again.
You should read (2nd attempt)
u'Un œuf à zéro €'
with the *correctly* typed glyphs 'LATIN SMALL LIGATURE OE'
in œuf and 'EURO SIGN' in '€uro'.
jmf
Re: Difference between str.isdigit() and str.isdecimal() in Python 3
On 16 mai, 17:48, Marco wrote:
> Hi all, because
>
> "There should be one-- and preferably only one --obvious way to do it",
>
> there should be a difference between the two methods in the subject, but
> I can't find it:
>
> >>> '123'.isdecimal(), '123'.isdigit()
> (True, True)
> >>> print('\u0660123')
> ٠123
> >>> '\u0660123'.isdigit(), '\u0660123'.isdecimal()
> (True, True)
> >>> print('\u216B')
> Ⅻ
> >>> '\u216B'.isdecimal(), '\u216B'.isdigit()
> (False, False)
>
> Can anyone give me some help?
> Regards, Marco
It seems to me that it is correct, and the reason lies in this:
>>> import unicodedata as ud
>>> ud.category('\u216b')
'Nl'
>>> ud.category('1')
'Nd'
>>>
>>> # Note
>>> ud.numeric('\u216b')
12.0
jmf
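The three predicates do differ, but only on characters outside the 'Nd' category. A small sketch follows; the helper name `classify` and the choice of sample characters are assumptions made for this illustration.

```python
import unicodedata


def classify(ch):
    """Return (category, isdecimal, isdigit, isnumeric) for one character."""
    return (unicodedata.category(ch),
            ch.isdecimal(), ch.isdigit(), ch.isnumeric())


# '1', ARABIC-INDIC DIGIT ZERO, SUPERSCRIPT TWO, ROMAN NUMERAL TWELVE
for ch in ('1', '\u0660', '\u00b2', '\u216b'):
    print('U+{:04X}'.format(ord(ch)), classify(ch))
```

SUPERSCRIPT TWO is a digit but not a decimal, and the Roman numeral is numeric only, which is the distinction the original question was looking for.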
Re: str.isnumeric and Cuneiforms
On 17 mai, 21:32, Marco wrote:
> Is it normal the str.isnumeric() returns False for these Cuneiforms?
>
> '\U00012456'
> '\U00012457'
> '\U00012432'
> '\U00012433'
>
> They are all in the Nl category.
Indeed they are, but Unicode (ver. 5.0.0) does not assign numeric
values to these code points. Do not ask me why.
jmf
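A sketch of how to look the numeric value up without an exception. The helper name `numeric_or_none` is hypothetical; `unicodedata.numeric` accepts a default as its second argument. What it returns for the cuneiform signs depends on the Unicode database bundled with the running Python, so no output is claimed for them here.

```python
import unicodedata


def numeric_or_none(ch):
    """Return the Unicode numeric value of ch, or None if none is assigned."""
    return unicodedata.numeric(ch, None)


for ch in ('\u216b', '\u00bd', 'a', '\U00012456'):
    print('U+{:04X}'.format(ord(ch)), unicodedata.category(ch),
          numeric_or_none(ch))
```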
Re: str.isnumeric and Cuneiforms
On 18 mai, 17:08, Marco Buttu wrote:
> On 05/17/2012 09:32 PM, Marco wrote:
> > Is it normal the str.isnumeric() returns False for these Cuneiforms?
> >
> > '\U00012456'
> > '\U00012457'
> > '\U00012432'
> > '\U00012433'
> >
> > They are all in the Nl category.
> >
> > Marco
>
> It's ok, I found that they don't have a number assigned in
> the ftp://ftp.unicode.org/Public/UNIDATA/UnicodeData.txt database.
> --
> Marco
Good. I was about to send this information. I have all this
(not updated) stuff locally on my hd.
Re: str.isnumeric and Cuneiforms
On 18 mai, 17:08, Marco Buttu wrote:
> On 05/17/2012 09:32 PM, Marco wrote:
> > Is it normal the str.isnumeric() returns False for these Cuneiforms?
> >
> > '\U00012456'
> > '\U00012457'
> > '\U00012432'
> > '\U00012433'
> >
> > They are all in the Nl category.
> >
> > Marco
>
> It's ok, I found that they don't have a number assigned in
> the ftp://ftp.unicode.org/Public/UNIDATA/UnicodeData.txt database.
> --
> Marco
Unofficial but really practical:
http://www.fileformat.info/info/unicode/index.htm
Re: python3 raw strings and \u escapes
On 30 mai, 13:54, Thomas Rachel wrote: > Am 30.05.2012 08:52 schrieb [email protected]: > > > > > This breaks a lot of my code because in python 2 > > re.split (ur'[\u3000]', u'A\u3000A') ==> [u'A', u'A'] > > but in python 3 (the result of running 2to3), > > re.split (r'[\u3000]', 'A\u3000A' ) ==> ['A\u3000A'] > > > I can remove the "r" prefix from the regex string but then > > if I have other regex backslash symbols in it, I have to > > double all the other backslashes -- the very thing that > > the r-prefix was invented to avoid. > > > Or I can leave the "r" prefix and replace something like > > r'[ \u3000]' with r'[ ]'. But that is confusing because > > one can't distinguish between the space character and > > the ideographic space character. It also a problem if a > > reader of the code doesn't have a font that can display > > the character. > > > Was there a reason for dropping the lexical processing of > > \u escapes in strings in python3 (other than to add another > > annoyance in a long list of python3 annoyances?) > > Probably it is more consequent. Alas, it makes the whole stuff > incompatible to Py2. > > But if you think about it: why allow for \u if \r, \n etc. are > disallowed as well? > > > And is there no choice for me but to choose between the two > > poor choices I mention above to deal with this problem? > > There is a 3rd one: use r'[ ' + '\u3000' + ']'. Not very nice to read, > but should do the trick... > > Thomas I suggest to take the problem differently. Python 3 succeeded to put order in the missmatch of the "coding of the characters" Python 2 was proposing. In your case, the >>> import unicodedata as ud >>> ud.name('\u3000') 'IDEOGRAPHIC SPACE' "character" (in fact a unicode code point), is just a "character" as a >>> ud.name('a') 'LATIN SMALL LETTER A' The code point / unicode logic, Python 3 proposes and follows, becomes just straightforward. 
>>> s = 'a\u3000é\u3000€'
>>> s.split('\u3000')
['a', 'é', '€']
>>>
>>> import re
>>> re.split('\u3000', s)
['a', 'é', '€']
The backslash, used as a "real backslash", remains what it
really was in Python 2. Note the absence of r'...' .
>>> s = 'a\\b\\c'
>>> print(s)
a\b\c
>>> s.split('\\')
['a', 'b', 'c']
>>> re.split('\\\\', s)
['a', 'b', 'c']
>>> hex(ord('\\'))
'0x5c'
>>> re.split('\u005c\u005c', s)
['a', 'b', 'c']
jmf
Re: python3 raw strings and \u escapes
On 30 mai, 08:52, "[email protected]" wrote: > In python2, "\u" escapes are processed in raw unicode > strings. That is, ur'\u3000' is a string of length 1 > consisting of the IDEOGRAPHIC SPACE unicode character. > > In python3, "\u" escapes are not processed in raw strings. > r'\u3000' is a string of length 6 consisting of a backslash, > 'u', '3' and three '0' characters. > > This breaks a lot of my code because in python 2 > re.split (ur'[\u3000]', u'A\u3000A') ==> [u'A', u'A'] > but in python 3 (the result of running 2to3), > re.split (r'[\u3000]', 'A\u3000A' ) ==> ['A\u3000A'] > > I can remove the "r" prefix from the regex string but then > if I have other regex backslash symbols in it, I have to > double all the other backslashes -- the very thing that > the r-prefix was invented to avoid. > > Or I can leave the "r" prefix and replace something like > r'[ \u3000]' with r'[ ]'. But that is confusing because > one can't distinguish between the space character and > the ideographic space character. It also a problem if a > reader of the code doesn't have a font that can display > the character. > > Was there a reason for dropping the lexical processing of > \u escapes in strings in python3 (other than to add another > annoyance in a long list of python3 annoyances?) > > And is there no choice for me but to choose between the two > poor choices I mention above to deal with this problem? I suggest to take the problem differently. Python 3 succeeded to put order in the missmatch of the "coding of the characters" Python 2 was proposing. The 'IDEOGRAPHIC SPACE' and 'REVERSE SOLIDUS' (backslash) "characters" (in fact unicode code points) are just (normal) "characters". The backslash, used as an escaping command, keeps its function. Note the absence of r'...' 
>>> s = 'a\u3000é\u3000€'
>>> s.split('\u3000')
['a', 'é', '€']
>>>
>>> import re
>>> re.split('\u3000', s)
['a', 'é', '€']
>>> s = 'a\\b\\c'
>>> print(s)
a\b\c
>>> s.split('\\')
['a', 'b', 'c']
>>> re.split('\\\\', s)
['a', 'b', 'c']
>>> hex(ord('\\'))
'0x5c'
>>> re.split('\u005c\u005c', s)
['a', 'b', 'c']
jmf
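A note for later readers, as a sketch assuming Python 3.3 or newer: from that version on, the re module itself understands \uXXXX escapes inside a pattern, so the raw-string pattern from the original question can stay as it is. The concatenation workaround suggested in the thread is shown for comparison; the variable names are only illustrative.

```python
import re

text = 'A\u3000A'

# Python 3.3+: the regex engine interprets \u3000, not the string literal.
raw_pattern_parts = re.split(r'[\u3000]', text)

# Workaround from the thread: build the pattern from raw and non-raw pieces.
concatenated_parts = re.split(r'[' + '\u3000' + r']', text)

print(raw_pattern_parts, concatenated_parts)
```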
Python 3.3.0a4, please add ru'...'
Consistency, please.
>>> sys.version
'3.3.0a4 (v3.3.0a4:7c51388a3aa7+, May 31 2012, 20:15:21) [MSC v.1600 32 bit (Intel)]'
>>> 'a'
'a'
>>> b'a'
b'a'
>>> br'a'
b'a'
>>> rb'a'
b'a'
>>> u'a'
'a'
>>> ur'a'
'a'
>>> ru'a'
SyntaxError: invalid syntax
>>>
jmf
Re: Python 3.3.0a4, please add ru'...'
On 17 juin, 13:30, Christian Heimes wrote:
> Am 16.06.2012 19:36, schrieb jmfauth:
> > Please consistency.
>
> Python 3.3 supports the ur"" syntax just as Python 2.x:
>
> $ ./python
> Python 3.3.0a4+ (default:4c704dc97496, Jun 16 2012, 00:06:09)
> [GCC 4.6.3] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>> ur""
> ''
> [73917 refs]
>
> Neither Python 2 nor Python 3 supports ru"". I'm a bit astonished that
> rb"" works in Python 3 as it doesn't work in Python 2.7. But br"" works
> everywhere.
>
> Christian
I noticed this at the 3.3.0a0 release.
The main motivation for this came from this:
http://bugs.python.org/issue13748
PS I saw the dev-list message.
PS2 Opinion: even if not really useful, consistency never hurts.
jmf
Re: Python 3.3.0a4, please add ru'...'
On 17 juin, 15:48, Christian Heimes wrote:
> Am 17.06.2012 14:11, schrieb jmfauth:
> > I noticed this at the 3.3.0a0 realease.
> >
> > The main motivation for this came from this:
> > http://bugs.python.org/issue13748
> >
> > PS I saw the dev-list message.
> >
> > PS2 Opinion, if not really useful, consistency nver hurts.
>
> We are must likely drop the ur"" syntax as it's not compatible with
> Python 2.x's raw unicode notation. http://bugs.python.org/issue15096
>
> Christian
Yes, but on the other side, "you" (core developers) have
reintroduced the mess of the unicode literal, so now
*assume* it (logically).
If the core developers have introduced rb'' or br''
(Py2) because they never know if they have to
type "rb" or "br" (me too), what should a beginner
think about "ur" and "ru"?
Finally, the ultimate argument: what is Python 3
supposed to be? A Python 2 derivative for lazy (ascii)
programmers or an appealing, clean and coherent
language?
jmf
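To see which prefixes a given interpreter accepts, one can ask the compiler directly. This is a sketch; the helper name `prefix_accepted` is hypothetical, and the results depend on the Python version. The transcript above was taken on 3.3.0a4, where ur'' was still accepted, whereas released 3.3 and later reject it.

```python
def prefix_accepted(prefix):
    """Return True if this interpreter compiles the literal <prefix>'a'."""
    try:
        compile(prefix + "'a'", '<prefix-test>', 'eval')
    except SyntaxError:
        return False
    return True


for prefix in ('', 'b', 'r', 'u', 'br', 'rb', 'ur', 'ru'):
    print(repr(prefix), prefix_accepted(prefix))
```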
Py3.3 unicode literal and input()
What is input() supposed to return?
>>> u'a' == 'a'
True
>>>
>>> r1 = input(':')
:a
>>> r2 = input(':')
:u'a'
>>> r1 == r2
False
>>> type(r1), len(r1)
(<class 'str'>, 1)
>>> type(r2), len(r2)
(<class 'str'>, 4)
>>>
---
sys.argv?
jmf
Re: Py3.3 unicode literal and input()
On 18 juin, 10:28, Benjamin Kaplan wrote:
> On Mon, Jun 18, 2012 at 1:19 AM, jmfauth wrote:
> > What is input() supposed to return?
>
> >>>> u'a' == 'a'
> > True
>
> >>>> r1 = input(':')
> > :a
> >>>> r2 = input(':')
> > :u'a'
> >>>> r1 == r2
> > False
> >>>> type(r1), len(r1)
> > (<class 'str'>, 1)
> >>>> type(r2), len(r2)
> > (<class 'str'>, 4)
>
> > ---
>
> > sys.argv?
>
> > jmf
>
> Python 3 made several backwards-incompatible changes over Python 2.
> First of all, input() in Python 3 is equivalent to raw_input() in
> Python 2. It always returns a string. If you want the equivalent of
> Python 2's input(), eval the result. Second, Python 3 is now unicode
> by default. The "str" class is a unicode string. There is a separate
> bytes class, denoted by b"", for byte strings. The u prefix is only
> there to make it easier to port a codebase from Python 2 to Python 3.
> It doesn't actually do anything.
It does. I showed it!
Related:
http://groups.google.com/group/comp.lang.python/browse_thread/thread/3aefd602507d2fbe#
http://mail.python.org/pipermail/python-dev/2012-June/120341.html
jmf
Re: Py3.3 unicode literal and input()
On 18 juin, 12:11, Steven D'Aprano wrote:
> On Mon, 18 Jun 2012 02:30:50 -0700, jmfauth wrote:
> > On 18 juin, 10:28, Benjamin Kaplan wrote:
> >> The u prefix is only there to
> >> make it easier to port a codebase from Python 2 to Python 3. It doesn't
> >> actually do anything.
>
> > It does. I shew it!
>
> Incorrect. You are assuming that Python 3 input eval's the input like
> Python 2 does. That is wrong. All you show is that the one-character
> string "a" is not equal to the four-character string "u'a'", which is
> hardly a surprise. You wouldn't expect the string "3" to equal the string
> "int('3')" would you?
>
> --
> Steven
A string is a string, a "piece of text", period.
I do not see why a unicode literal and (well, I do not
know what to call it) a "normal class" string should behave
differently in source code or as an answer to an input().
Should a user write two derived functions?
input_for_entering_text()
and
input_if_you_are_entering_a_text_as_literal()
---
A side effect of the unicode literal reintroduction.
I do not mind about this, but I expect it to
work logically and correctly. And it does not.
PS English is not my native language. I never know
how to reply to an (interro)-negative sentence.
jmf
Re: Py3.3 unicode literal and input()
Things are very clear to me. I have written enough interactive
interpreters with all available toolkits for Windows
since I got to know Python (v. 1.5.6).
I do not see why the semantics should differ
between source code and an interactive interpreter,
esp. if Python allows it!
If you have to know in advance what an end user
is supposed to type and/or check it ('str' or unicode
literal) in order to know if the answer has to be
evaluated or not, then it is better to reintroduce
input() and raw_input().
jmf
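If a program really wants to accept both plain text and a typed string literal, one possible approach is sketched below. The function name `parse_typed_text` is hypothetical; `ast.literal_eval` only evaluates literals, so it is safer than eval() on user input. This is an illustration of the "check it" option discussed above, not a recommendation from the thread.

```python
import ast


def parse_typed_text(answer):
    """Return the text meant by `answer`.

    If the user typed a Python string literal such as u'a' or "a",
    return its value; otherwise return the answer unchanged.
    """
    try:
        value = ast.literal_eval(answer)
    except (ValueError, SyntaxError):
        return answer
    return value if isinstance(value, str) else answer


print(parse_typed_text("a"), parse_typed_text("u'a'"))
```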
Re: Py3.3 unicode literal and input()
We are turning in circles.
You are somehow legitimating the reintroduction of unicode
literals and I showed, not to say proved, that it may
be a source of problems.
Typical Python disease. Introduce a problem,
then discuss how to solve it, but surely and
definitely do not remove that problem.
As far as I know, Python 3.2 is working very
well.
jmf
Re: Py3.3 unicode literal and input()
On Jun 18, 8:45 pm, Terry Reedy wrote: > On 6/18/2012 12:39 PM, jmfauth wrote: > > > We are turning in circles. > > You are, not we. Please stop. > > > You are somehow legitimating the reintroduction of unicode > > literals > > We are not 'reintroducing' unicode literals. In Python 3, string > literals *are* unicode literals. > > Other developers reintroduced a now meaningless 'u' prefix for the > purpose of helping people write 2&3 code that runs on both Python 2 and > Python 3. Read about it herehttp://python.org/dev/peps/pep-0414/ > > In Python 3.3, 'u' should *only* be used for that purpose and should be > ignored by anyone not writing or editing 2&3 code. If you are not > writing such code, ignore it. > > > and I shew, not to say proofed, it may > > > be a source of problems. > > You are the one making it be a problem. > > > Typical Python desease. Introduce a problem, > > then discuss how to solve it, but surely and > > definitivly do not remove that problem. > > The simultaneous reintroduction of 'ur', but with a different meaning > than in 2.7, *was* a problem and it should be removed in the next release. > > > As far as I know, Python 3.2 is working very > > well. > > Except that many public libraries that we would like to see ported to > Python 3 have not been. The purpose of reintroducing 'u' is to encourage > more porting of Python 2 code. Period. > > -- > Terry Jan Reedy It's a matter of perspective. I expected to have finally a clean Python, the goal is missed. I have nothing to object. It is "your" (core devs) project, not mine. At least, you understood my point of view. I'm a more than two decades TeX user. At the release of XeTeX (a pure unicode TeX-engine), the devs had, like Python2/3, to make anything incompatible. A success. It did not happen a week without seeing a updated package or a refreshed documentation. Luckily for me, Xe(La)TeX is more important than Python. As a scientist, Python is perfect. 
From an educational point of view, I'm becoming more and more
skeptical about this language, a moving target.
Note that I'm not complaining, only "disappointed".
jmf
Re: Python 2.7.2
On 12 juin, 19:57, Benjamin Peterson wrote: > On behalf of the Python development team, I'm rosy to announce the immediate > availability of Python 2.7.2. > Small error: The link points to Python 2.7.1. The 2.7.2 page exists: http://www.python.org/download/releases/2.7.2/ Update Python 2.7.2 and 3.1.4 on my win box. Total time < 5mn. Good job. Thanks. jmf -- http://mail.python.org/mailman/listinfo/python-list
Re: integer to binary 0-padded
>>> '{:+#0{}b}'.format(255, 1 + 2 + 16)
+0b0000000011111111
>>> '{:+#0{}b}'.format(-255, 1 + 2 + 16)
-0b0000000011111111
>>>
>>> eval('{:+#0{}b}'.format(255, 1 + 2 + 16))
255
>>> eval('{:+#0{}b}'.format(-255, 1 + 2 + 16))
-255
>>>
jmf
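The same format specification wrapped in a function, as a sketch. The name `padded_binary` and the `bits` parameter are assumptions made for this illustration. The width is 1 (sign) + 2 ('0b') + the number of binary digits, and int(text, 2) is used for the round trip instead of eval().

```python
def padded_binary(n, bits=16):
    """Return n as a signed, '0b'-prefixed binary string padded to `bits` digits."""
    # Width = 1 for the sign, 2 for the '0b' prefix, plus the digits.
    return '{:+#0{}b}'.format(n, 1 + 2 + bits)


for n in (255, -255):
    text = padded_binary(n)
    print(text, int(text, 2))
```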
Re: Unicode codepoints
That seems to me correct.
>>> '\\u{:04x}'.format(ord(u'é'))
\u00e9
>>> '\\U{:08x}'.format(ord(u'é'))
\U000000e9
>>>
because
>>> u'\U00e9'
  File "<stdin>", line 1
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes
in position 0-5: end of string in escape sequence
>>> u'\U000000e9'
é
>>> u'\u00e9'
é
>>>
from this:
>>> u'éléphant\N{EURO SIGN}'
éléphant€
>>> u = u'éléphant\N{EURO SIGN}'
>>> ''.join(['\\u{:04x}'.format(ord(c)) for c in u])
\u00e9\u006c\u00e9\u0070\u0068\u0061\u006e\u0074\u20ac
>>>
Skipping surrogate pairs makes little sense,
because the purpose is to display code points!
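A sketch that covers the whole range, assuming Python 3.3 or newer, where each element of a str is one code point and no surrogate handling is needed. The function name `escape_code_points` is hypothetical.

```python
def escape_code_points(text):
    """Return text as \\uXXXX escapes (or \\UXXXXXXXX above U+FFFF)."""
    return ''.join(
        '\\u{:04x}'.format(ord(c)) if ord(c) <= 0xFFFF
        else '\\U{:08x}'.format(ord(c))
        for c in text)


print(escape_code_points('éléphant\N{EURO SIGN}'))
print(escape_code_points('\U0001d11e'))
```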
Re: Unicode codepoints
On 22 juin, 16:07, Saul Spatz wrote:
> Thanks very much. This is the elegant kind of solution I was looking for. I
> had hoped there was a way to do it without even addressing the matter of
> surrogates, but apparently not. The reason I don't like this is that it
> depends on knowing that python internally stores strings in UTF-16. I
> expected that there would be some built-in iterator that would return the
> code points. (Actually, this all started when I realized that s[k] wouldn't
> necessarily give me the kth character of the string s.)
A character is not a code point. Besides this, very few people
know (correct English?) that a character may be made of more than one
code point.
jmf
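A short illustration of that last remark, using the standard unicodedata module: the same visible character "é" can be one code point or two, depending on the normalization form. The variable names are only illustrative.

```python
import unicodedata

composed = '\u00e9'                                  # LATIN SMALL LETTER E WITH ACUTE
decomposed = unicodedata.normalize('NFD', composed)  # 'e' + COMBINING ACUTE ACCENT

print(len(composed), [hex(ord(c)) for c in composed])
print(len(decomposed), [hex(ord(c)) for c in decomposed])
print(unicodedata.normalize('NFC', decomposed) == composed)
```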
Re: Looking PDF module
On 24 juin, 09:23, Hegedüs Ervin wrote: [...] > Any help would comes well, > > thanks: > I do not really understand the relation Python <-> "this will be a book". For editing purpose, do you need to extract your raw material for somewhere? to create graphics? to parse files? or Are you expecting to compose your book with Python knowing you have the existing material (texts, pictures, ...) and a PDF tool? In the former case, generate a TeX text file (.tex). In the latter case, use TeX. :-) -- http://mail.python.org/mailman/listinfo/python-list
Re: a little parsing challenge ☺
On 19 juil, 21:09, Terry Reedy wrote:
> On 7/19/2011 2:12 PM, Xah Lee wrote:
>
> >> Also, you may have answered this earlier but I'll ask again anyways: You
> >> ask for the first mismatched pair, Are you referring to the inner most
> >> mismatched, or the outermost? For example, suppose you have this file:
>
> >> foo[(])bar
>
> >> Would the "(" be the first mismatched character or would the "]"?
>
> > yes i haven't been precise. Thanks for brining it up.
>
> > thinking about it now, i think it's a bit hard to define precisely.
>
> Then it is hard to code precisely.
>
Not really. The trick is to count the different openers/closers
separately.
That is what I am doing to check balanced brackets in
chemical formulas. The rules are however not the same
as in math.
Interestingly, I ran into this "problem". enumerate() is very
nice to parse a string from left to right.
>>> for i, c in enumerate('abcd'):
...     print i, c
...
0 a
1 b
2 c
3 d
>>>
But, if I want to parse a string from right to left,
what's the trick?
The best I found so far:
>>> s = 'abcd'
>>> for i, c in enumerate(reversed(s)):
...     print len(s) - 1 - i, c
...
3 d
2 c
1 b
0 a
>>>
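A sketch of the "count each kind separately" idea, written for Python 3. The function name `first_unbalanced` and the choice of bracket pairs are assumptions made for this illustration, not the code actually used for chemical formulas. Because each kind is tracked on its own, an interleaved string such as foo[(])bar is accepted, which is exactly where these rules differ from the usual nesting rules.

```python
def first_unbalanced(text, pairs=(('(', ')'), ('[', ']'))):
    """Return the index of the first unbalanced bracket, or None.

    Each bracket kind is tracked separately, so kinds may interleave.
    """
    closers = {close: open_ for open_, close in pairs}
    open_positions = {open_: [] for open_, _ in pairs}
    for i, c in enumerate(text):
        if c in open_positions:
            open_positions[c].append(i)
        elif c in closers:
            stack = open_positions[closers[c]]
            if not stack:
                return i          # closer without a matching opener
            stack.pop()
    # Any opener still recorded here was never closed.
    leftovers = [pos for stack in open_positions.values() for pos in stack]
    return min(leftovers) if leftovers else None


for sample in ('Ca(OH)2', 'foo[(])bar', 'a)b', '((a)'):
    print(sample, first_unbalanced(sample))
```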
Re: a little parsing challenge ☺
On 20 juil, 09:29, Ian Kelly wrote:
> Otherwise, here's another non-DRY solution:
>
> >>> from itertools import izip
> >>> for i, c in izip(reversed(xrange(len(s))), reversed(s)):
> ...
>
> Unfortunately, this is one space where there just doesn't seem to be a
> single obvious way to do it.
Well, I see. Thanks.
There is still the old, brave solution, which I'm in fact using.
>>> s = 'abcd'
>>> for i in xrange(len(s)-1, -1, -1):
...     print i, s[i]
...
3 d
2 c
1 b
0 a
>>>
---
DRY? An acronym for what?
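The same loop wrapped in a small generator, as a Python 3 sketch. The name `enumerate_reversed` is hypothetical; it yields the original index together with each item, from right to left, without building a reversed copy.

```python
def enumerate_reversed(seq):
    """Yield (index, item) pairs of seq from the last item to the first."""
    for i in range(len(seq) - 1, -1, -1):
        yield i, seq[i]


for i, c in enumerate_reversed('abcd'):
    print(i, c)
```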
Re: How to print non-printable chars??
On 18 août, 22:44, coldpizza wrote:
>
>
> ...
>
> In a web/html environment or in broken ascii-only consoles like the
> one on windows ...
C:\Users\Jean-Michel>echo 'Cet œuf de Lætitia coûte un €uro'
'Cet œuf de Lætitia coûte un €uro'
C:\Users\Jean-Michel>c:\Python27\python
Python 2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit
(Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> print 'Cet œuf de Lætitia coûte un €uro'
Cet œuf de Lætitia coûte un €uro
>>> import sys
>>> u = unicode('Cet œuf de Lætitia coûte un €uro', sys.stdin.encoding)
>>> print u.encode(sys.stdout.encoding)
Cet œuf de Lætitia coûte un €uro
>>>
C:\Users\Jean-Michel>c:\Python32\python
Python 3.2.1 (default, Jul 10 2011, 21:51:15) [MSC v.1500 32 bit
(Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> print('Cet œuf de Lætitia coûte un €uro')
Cet œuf de Lætitia coûte un €uro
>>>
PS Cet œuf de Lætitia coûte un €uro ->
'This egg of Lætitia's costs one €uro'
PS2 "ñ" does not require special attention.
PS3 To the original question: This is not a *coding* issue, it is a
character *representation* question.
jmf
Re: Help with regular expression in python
On 19 août, 17:20, Matt Funk wrote:
> Hi,
> thanks for the suggestion. I guess i had found another way around the
> problem as well. But i really wanted to match the line exactly and i
> wanted to know why it doesn't work. That is less for the purpose of
> getting the thing to work but more because it greatly annoys me off that
> i can't figure out why it doesn't work. I.e. why the expression is not
> matches {32} times. I just don't get it.
>
re is not always the right tool to use.
Without more details:
>>> s = '2.201000e+01 2.15e+01 2.15e+01\
... : (instance: 0) : some description'
>>> s
2.201000e+01 2.15e+01 2.15e+01 : (instance: 0) :
some description
>>> s[:s.find(':')]
2.201000e+01 2.15e+01 2.15e+01
>>> s[:s.find(':')].split()
['2.201000e+01', '2.15e+01', '2.15e+01']
>>>
>>>
jmf
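The same idea as a function, as a sketch: everything before the first ':' is split on whitespace and converted to float. The name `leading_numbers` is hypothetical, and the sample line is the one from the transcript above.

```python
def leading_numbers(line):
    """Return the floats found before the first ':' of the line."""
    head, _, _ = line.partition(':')
    return [float(token) for token in head.split()]


line = '2.201000e+01 2.15e+01 2.15e+01 : (instance: 0) : some description'
print(leading_numbers(line))
```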
Re: Help with regular expression in python
On 19 août, 19:33, Matt Funk wrote:
>
> The results obtained are:
> results:
> [(' 2.199000e+01', ' : (instance: 0)\t:\tsome description')]
> so this matches the last number plus the string at the end of the line, but no
> retaining the previous numbers.
>
> Anyway, i think at this point i will go another route. Not sure where the
> issues lies at this point.
>
Seen on this list:
And always keep this in mind:
'Some people, when confronted with a problem, think "I know, I'll use
regular expressions." Now they have two problems.'
--Jamie Zawinski, comp.lang.emacs
I proposed a solution which seems to correspond to your problem,
if it were better formulated...
jmf
On re / regex replacement
There is currently a discussion on the dev-list about the replacement
of "re" by "regex".
I'm neither a regular expressions specialist nor a regex user.
However, there is one point in regex that disturbs me a little.
The regex module proposes a flag to select the "coding" (wrong word,
just to be short):
The global flags are: ASCII, LOCALE, NEW, REVERSE, UNICODE.
While I can understand the ASCII flag, ASCII being the "lingua franca" of
almost all codings, I am more skeptical about the LOCALE/UNICODE
flags.
There is in my mind some kind of conflict here. What is 100% unicode
compliant should be locale independent ("Unicode.org") and a locale
dependency means a loss of unicode compliance.
I fear some potential problems here: users or modules working
in one mode, while some others are working in the other mode.
Nothing technical here. It seems to me nobody has pointed out this
fact.
jmf
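To make the "two modes" concern concrete, a sketch with the standard re module in Python 3 (the regex module is not used here): the same pattern gives different results with and without the ASCII flag. The sample sentence is only illustrative.

```python
import re

text = 'Cet œuf coûte un €uro'

# Default for str patterns in Python 3: Unicode-aware \w.
unicode_words = re.findall(r'\w+', text)

# With re.ASCII, \w only matches [a-zA-Z0-9_].
ascii_words = re.findall(r'\w+', text, re.ASCII)

print(unicode_words)
print(ascii_words)
```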
Re: On re / regex replacement
On 28 août, 20:40, MRAB wrote:
> ...
> The regex module tries to be drop-in compatible. It supports the LOCALE
> flag only because the re module has it. Even Perl has something similar.
> ...
Ok. That's quite logical.
jmf
Representation of floats (-> Mark Dickinson?)
This is just an attempt to put the
http://groups.google.com/group/comp.lang.python/browse_thread/thread/a008af1ac2968833#
discussion at a correct level.
With Python 2.7 a new float number representation (David Gay's
algorithm)
has been introduced. While this is well honored in Python 2.7, it
seems to me there are some mismatches in the Py3 series.
>>> sys.version
'2.5.4 (r254:67916, Dec 23 2008, 15:10:54) [MSC v.1310 32 bit
(Intel)]'
>>> 0.1
0.10000000000000001
>>> print 0.1
0.1
>>> 1.1 * 1.1
1.2100000000000002
>>> print 1.1 * 1.1
1.21
>>> print repr(1.1 * 1.1)
1.2100000000000002
>>>
>>> sys.version
2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)]
>>>
>>> 0.1
0.1
>>> print 0.1
0.1
>>> 1.1 * 1.1
1.21
>>> print 1.1 * 1.1
1.21
>>> print repr(1.1 * 1.1)
1.2100000000000002
>>>
>>> sys.version
'3.1.4 (default, Jun 12 2011, 15:05:44) [MSC v.1500 32 bit (Intel)]'
>>> 0.1
0.1
>>> print(0.1)
0.1
>>> 1.1 * 1.1
1.2100000000000002
>>> print(1.1 * 1.1)
1.21
>>> print(repr(1.1 * 1.1))
1.2100000000000002
>>> '{:g}'.format(1.1 * 1.1)
'1.21'
>>> sys.version
'3.2.2 (default, Sep 4 2011, 09:51:08) [MSC v.1500 32 bit (Intel)]'
>>> 0.1
0.1
>>> print(0.1)
0.1
>>> 1.1 * 1.1
1.2100000000000002
>>> print (1.1 * 1.1)
1.2100000000000002
>>> print(repr((1.1 * 1.1)))
1.2100000000000002
>>>
>>> '{:g}'.format(1.1 * 1.1)
'1.21'
>>>
jmf
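For reference, a sketch of the views compared above, assuming Python 3.2 or newer, where str() and repr() of a float agree and only an explicit format shortens the output. The dictionary name `views` is only illustrative.

```python
x = 1.1 * 1.1

views = {
    'repr': repr(x),            # shortest string that round-trips to x
    'str': str(x),              # same as repr() since Python 3.2
    'g': '{:g}'.format(x),      # 6 significant digits by default
}

for name, text in views.items():
    print(name, text)
```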
Re: Representation of floats (-> Mark Dickinson?)
On 7 sep, 05:58, casevh wrote: > ... > > Also note that 1.1 * 1.1 is not the same as 1.21. > > >>> (1.1 * 1.1).as_integer_ratio() > > (5449355549118301, 4503599627370496)>>> (1.21).as_integer_ratio() > > (1362338887279575, 1125899906842624) > > This doesn't explain why 2.7.2 displayed a different result on your > computer. What do you get for as_integer_ratio() for (1.1 * 1.1) and > (1.21)? > Sure. I just picked up these numbers/expressions by chance. They came to my mind following the previous discussion. Sticking with the latest versions: >>> sys.version '2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)]' >>> (1.1 * 1.1).as_integer_ratio() (5449355549118301L, 4503599627370496L) >>> (1.21).as_integer_ratio() (1362338887279575L, 1125899906842624L) >>> >>> sys.version '3.2.2 (default, Sep 4 2011, 09:51:08) [MSC v.1500 32 bit (Intel)]' >>> (1.1 * 1.1).as_integer_ratio() (5449355549118301, 4503599627370496) >>> (1.21).as_integer_ratio() (1362338887279575, 1125899906842624) >>> Has "long" not disappeared 2.7? I have not the skill to dive into the machinery. I have only some theroretical understanding and I'm a little bit confused and have to face "there something strange somewhere". Test on Windows 7, 32 bits. jmf -- http://mail.python.org/mailman/listinfo/python-list
Re: Representation of floats (-> Mark Dickinson?)
On 7 sep, 08:56, Mark Dickinson wrote:
> On Sep 7, 4:58 am, casevh wrote:
> > IIRC, Python
> > 3.2 changed (for floats) __str__ to call __repr__.
>
> Yes, exactly: str and repr of a float are identical in Python 3.2 +
>
> I'm also puzzled by the
>
> 2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)]
> [...]
> >>> 1.1 * 1.1
> 1.21
>
> in jmf's message. Cut-and-paste typo?
>
> --
> Mark
No. But it's *my* mistake. I'm using a modified
sys.displayhook which uses a print statement (mainly
for language reasons). I forgot to reset it to the
initial/default state for these tests when working
with too many open interactive interpreters.
Sorry for the noise.
>>> sys.version
'2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)]'
>>>
>>> 1.1 * 1.1
1.21
>>> 'éléphant'
éléphant
>>>
>>> sys.displayhook = sys.__displayhook__
>>> 1.1 * 1.1
1.2100000000000002
>>> 'éléphant'
'\xe9l\xe9phant'
>>> sys.version
'2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)]'
>>>
jmf
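For readers unfamiliar with the mechanism, a sketch of such a hook for Python 3. The function name `printing_displayhook` is hypothetical and this is not the actual hook used in the post; it only shows why interactive results can appear without quotes or in str() form.

```python
import builtins
import sys


def printing_displayhook(value):
    """Show interactive results with print() instead of repr()."""
    if value is None:
        return
    builtins._ = value      # keep the usual "_" shortcut working
    print(value)


# To try it in an interactive session:
#     sys.displayhook = printing_displayhook
# and to restore the default, as done in the post above:
#     sys.displayhook = sys.__displayhook__
```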
Python 3.2 is excellent, but
Well, Python (as 3.2) has never reached this level of excellence, but __pycache__, no, not for me. (I feel better now, after I wrote it.) -- http://mail.python.org/mailman/listinfo/python-list
Re: Egos, heartlessness, and limitations
On 14 avr, 08:59, > Fortunately, if you're using a recent Linux or a Mac with MacPorts, > installing wxPython should never be more than one command line (or half a > dozen clicks) away. Windows users aren't quite so lucky, but still, it's > not like installing it is a major hassle. > Probably, the joke of the day :-) -- http://mail.python.org/mailman/listinfo/python-list
Re: Python IDE/text-editor
On 16 avr, 05:20, Alec Taylor wrote:
> Good Afternoon,
> ...
Windows user here. I'm using SciTE, http://www.scintilla.org/SciTE.html .
Portable (runs on/from a USB key), output pane, ...
If you are interested in a portable Interactive Interpreter,
http://spinecho.ifrance.com/psi.html (runs on/from a USB key).
Re: PYTHONPATH
On 16 avr, 06:16, harrismh777 wrote:
> By default the sys.path always shows the directory python was opened in,
> usually the users home directory. With .profile you can set the path
> any way you want... most useful for setting up special test directories
> ahead of the "real" code, or for setting up separate directories for
> versions--- one for Python26, Python27, and of course Python32.
>
> (there are other ways of accomplishing the same thing, and of course,
> this one only really works with *nix systems--- windows is another mess
> entirely)
>
I belong to those who are very happy with the Python installations
on the Windows platform (thanks MvL, this should be said) and I hope
it will continue like this.
I do not see any mess here. Every Python version lives in its own
isolated directory, including \site-packages. That means I can keep,
eg, a Python 2.5 application (*) which is using PIL, wxPython and
numpy in a running state, while developing new applications with
other Python versions or porting that application (*) to another
Python version. And that on all Windows versions (Win2K, XP, Vista,
Win7) modulo the underlying os-libs compatibility, but that's the
same problem on every os, especially for the GUI libs.
I have been using Python since ver 1.5.6 and I have never set any
PYTHONPATH environment variable.
A final word about sys.path. This is, in my mind, the most clever
idea of Python. I have the feeling, no offense here, that you do not
understand it very well. The sys.path is some kind of *dynamic*
environment variable and has basically or primarily nothing to do
with a user directory.
jmf
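To illustrate the "dynamic" nature of sys.path mentioned above, here is a small sketch. It assumes Python 3; the module name and the temporary directory are made up for the demonstration and are not part of any real installation:

```python
import importlib
import os
import sys
import tempfile

# Hypothetical module name, chosen only for this demonstration.
MODULE_NAME = "syspath_demo_module"

with tempfile.TemporaryDirectory() as extra_dir:
    # Create a tiny module in a directory Python knows nothing about.
    with open(os.path.join(extra_dir, MODULE_NAME + ".py"), "w") as handle:
        handle.write("ANSWER = 42\n")

    sys.path.insert(0, extra_dir)        # takes effect immediately
    try:
        importlib.invalidate_caches()    # let the import system rescan
        module = importlib.import_module(MODULE_NAME)
    finally:
        sys.path.remove(extra_dir)       # restore the previous search path

print(module.ANSWER)
```

sys.path is an ordinary list that the running program can change at any time, with no PYTHONPATH environment variable involved.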
Re: detecting newline character
On 23 avr, 22:25, Daniel Geržo wrote:
>
> Well I am doing this on:
> Python 2.7.1 (r271:86832, Mar 7 2011, 14:28:09)
> [GCC 4.2.1 (Apple Inc. build 5664)] on darwin
>
> So what do you guys advise me to do?
>
> --
Use the io module.
jmf
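One possible reading of "use the io module", sketched here under the assumption that the goal is to find out which line ending a file uses. The helper name detect_newlines is hypothetical; io.TextIOWrapper and its newlines attribute are standard, and io.open() on Python 2.7 returns the same kind of object:

```python
import io


def detect_newlines(raw_bytes, encoding="ascii"):
    """Return (text, newlines): the decoded text and the line endings seen."""
    # newline=None enables universal-newlines translation, and the
    # wrapper remembers which endings it translated.
    wrapper = io.TextIOWrapper(io.BytesIO(raw_bytes),
                               encoding=encoding, newline=None)
    text = wrapper.read()
    return text, wrapper.newlines


text, seen = detect_newlines(b"a\r\nb\r\n")
print(repr(text), repr(seen))
```

With a real file, the same attribute is available on the object returned by io.open(path, 'r', encoding=...) once the content has been read.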
Re: learnpython.org - an online interactive Python tutorial
On 24 avr, 05:10, harrismh777 wrote:
>
> I've been giving this some more thought. From the keyboard, all I am
> able to enter are character strings (not numbers). Presumably these are
> UTF-8 strings in python3. If I enter ...
In Python 3, input() returns a unicode string (str), a
sequence/table/array of unicode code point(s). No more, no less.
Similar to Python 2, where raw_input() returns a
sequence/table/array of byte(s). No more, no less.
jmf
Re: Deditor
On 27 avr, 19:22, Alec Taylor wrote:
> Thanks, any plans for a Windows version?
>
- Download the deb
- Unpack it with a utility like 7zip
- Throw away the unnecessary stuff, (keep the "deditor part")
- Depending on your libs, adapt the "import"
- Launch deditor.py
- Then ...
[5 minutes]
In fact, this kind of app can be simply packed in a zip file.
jmf
Re: Deditor
On 28 avr, 22:16, Kruptein wrote:
> On 28 apr, 07:46, jmfauth wrote:
>
> > On 27 avr, 19:22, Alec Taylor wrote:
>
> > > Thanks, any plans for a Windows version?
>
> > - Download the deb
> > - Unpack it with a utility like 7zip
> > - Throw away the unnecessary stuff, (keep the "deditor part")
> > - Depending on your libs, adapt the "import"
> > - Launch deditor.py
> > - Then ...
>
> > [5 minutes]
>
> > In fact, this kind of app can be simply packed in a zip file.
>
> > jmf
>
> It isn't that easy as you might have hoped ;) I'm using wxpython for
> rendering the GUI; somehow some things that work in the linux version
> break in the windows version, so I need to do some small
> modifications, and as I'm a hardcore linux fan I only use windows for
> gaming, so it usually takes a little longer for a windows release. I'm
> releasing a tarball now btw :D
Sure, it is doable. I have done it (I only tweaked the
import in such a way that it does not import modules
not installed on my machine, like not importing paramiko).
Your application is just a normal application which uses
a Python environment, independently of the platform.
wxPython plays no special role here. Example: the
wxPython demo can be installed in any dir, even on an external
drive.
PS I have no special interest in deditor, except I like
to see what is done with wxPython.
jmf
Re: Deditor
On 29 avr, 23:01, Kruptein wrote:
> On 29 apr, 20:25, jmfauth wrote:
>
> > On 28 avr, 22:16, Kruptein wrote:
>
> > > On 28 apr, 07:46, jmfauth wrote:
>
> > > > On 27 avr, 19:22, Alec Taylor wrote:
>
> > > > > Thanks, any plans for a Windows version?
>
> > > > - Download the deb
> > > > - Unpack it with a utility like 7zip
> > > > - Throw away the unnecessary stuff, (keep the "deditor part")
> > > > - Depending on your libs, adapt the "import"
> > > > - Launch deditor.py
> > > > - Then ...
>
> > > > [5 minutes]
>
> > > > In fact, this kind of app can be simply packed in a zip file.
>
> > > > jmf
>
> > > It isn't that easy as you might have hoped ;) I'm using wxpython for
> > > rendering the GUI; somehow some things that work in the linux version
> > > break in the windows version, so I need to do some small
> > > modifications, and as I'm a hardcore linux fan I only use windows for
> > > gaming, so it usually takes a little longer for a windows release. I'm
> > > releasing a tarball now btw :D
>
> > Sure, it is doable. I have done it (I only tweak the
> > import in such a way, that it does not import modules
> > not installed in my machine, like not importing paramiko).
>
> > Your application is just a normal application which uses
> > a Python environment, independently from the platform.
>
> > wxPython does not play something special. Exemple, the
> > wxPython demo can be installed in any dir, even on external
> > drive.
>
> > PS I have no special interest in deditor, except I like
> > to see what is done with wxPython.
>
> > jmf
>
> The problem had to do with the configuration panel which displayed
> wrong in windows but right in linux. I fixed it and it should now
> actually work on both :p
> (and the paramiko import error was because I had forgotten to do a try/
> except block somewhere in my plugin management..)
>
> the windows source zip file is online, Alec can make an installer if
> he wants :)
Quick tips, hints, "pedagogical" advice:
- Distributing a zip or a tarball does not matter. By
not distributing a deb and distributing the Py scripts instead,
you just make your app available to everybody
(eg. the wxPython demo).
- The main problem in your app is not the os. wxPython
runs quite smoothly on all platforms.
- Critical: I can not enter text and the text is not
displayed correctly in the editing part of your app.
Once again, this is not an os issue.
The wx.stc.StyledTextCtrl is not a simple widget to master.
- http://groups.google.com/group/wxpython-users/topics
- Best wishes for your project.
jmf
Re: unicode by default
On 12 mai, 18:17, Ian Kelly wrote:
> ...
> to worry about encodings are when you're encoding unicode characters
> to byte strings, or decoding bytes to unicode characters
A small but important correction/clarification:
In Unicode, "unicode" does not encode a *character*. It
encodes a *code point*, a number, the integer associated
with the character.
jmf
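The distinction between a character, its code point and its encoded bytes can be made concrete with a few built-in calls. A minimal sketch, assuming Python 3; the sample character is arbitrary:

```python
import unicodedata

character = "\N{LATIN SMALL LETTER E WITH ACUTE}"

# The code point is just an integer attached to the character.
code_point = ord(character)
name = unicodedata.name(character)

# Encoding forms turn that one integer into different byte sequences.
as_latin1 = character.encode("latin-1")
as_utf8 = character.encode("utf-8")
as_utf16 = character.encode("utf-16-be")

print(hex(code_point), name)
print(as_latin1, as_utf8, as_utf16)
```

One code point, three different byte sequences: the integer is the stable part, the bytes depend on the encoding chosen.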
Re: unicode by default
On 14 mai, 09:41, harrismh777 wrote:
> ...
> I'm getting much closer here,
> ...
You should really understand that Unicode is a domain per
se. It is independent of any os, programming language
or application. It is up to these tools to be "unicode"
compliant.
Working in a full unicode mode (at least for texts) is
today practically a solved problem. But you have to ensure
the whole toolchain is unicode compliant (editors,
fonts (OpenType technology), rendering devices, ...).
Tip. This list is certainly not the best place to gather
information. I suggest you start by getting information
about XeTeX. XeTeX is the "new" TeX engine working only
in a unicode mode. From this starting point, you will come
across plenty of web sites speaking about the "unicode world",
tools, fonts, ...
A variant is to visit sites speaking about *typography*.
jmf
Re: python in school notebooks/laptops
On 30 mai, 13:09, hackingKK wrote:
[...]
>
> Even better, try convincing them to use Ubuntu instead of a virus
> called Where I Never Do Operations With Safety, or WINDOWS for short.
> That way Python will come by default and VB will be out of question
> Happy hacking.
> Krishnakant.
Do you mean one of these os's, where Python (2) is not
working properly because the *defaultencoding* is set
to utf-8?
jmf
Re: How do I automate the removal of all non-ascii characters from my code?
On 12 sep, 10:17, Gary Herron wrote:
> On 09/12/2011 12:49 AM, Alec Taylor wrote:
>
> > Good evening,
>
> > I have converted ODT to HTML using LibreOffice Writer, because I want
> > to convert from HTML to Creole using python-creole. Unfortunately I
> > get this error: "File "Convert to Creole.py", line 17
> > SyntaxError: Non-ASCII character '\xe2' in file Convert to Creole.py
> > on line 18, but no encoding declared; see
> > http://www.python.org/peps/pep-0263.html for details".
>
> > Unfortunately I can't post my document yet (it's a research paper I'm
> > working on), but I'm sure you'll get the same result if you write up a
> > document in LibreOffice Writer and add some End Notes.
>
> > How do I automate the removal of all non-ascii characters from my code?
>
> > Thanks for all suggestions,
>
> > Alec Taylor
>
The coding of the characters is a domain per se. It is independent
of any OS or application.
When working with (plain) text files, you should always be aware
of the coding of the text you are working on. If you are using
coding directives, you must ensure your coding directive
matches the real coding of the text files. A coding directive is
only informative, it does not set the coding.
I'm pretty sure your problem comes from this. There is a mismatch
somewhere that you are not aware of.
Removing non-ascii chars is certainly not a valuable solution. It
must work. If you are working properly, it can not fail to work.
From a linguistic point of view, the web has informed me that Creole
(*all the Creoles*) can be composed with the iso-8859-1 coding.
That means iso-8859-1, cp1252 and all Unicode coding variants
are possible coding directives.
jmf
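The kind of mismatch described above can be reproduced without any file. A minimal sketch, assuming Python 3; the sample line and the two codecs are only illustrative, not taken from the original poster's document:

```python
# Bytes as an editor would save them when it uses cp1252 ...
saved_bytes = 'name = "éléphant"\n'.encode("cp1252")

# ... read back by a tool that was told the file is utf-8.
try:
    saved_bytes.decode("utf-8")
    mismatch_detected = False
except UnicodeDecodeError:
    mismatch_detected = True

# Declaring the coding that was really used gives the text back intact.
recovered = saved_bytes.decode("cp1252")

print(mismatch_detected)
print(recovered == 'name = "éléphant"\n')
```

The declaration changes nothing in the bytes on disk; it only tells the reader how to interpret them, so it has to agree with what the editor actually wrote.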
Re: How do I automate the removal of all non-ascii characters from my code?
On 12 sep, 10:49, Steven D'Aprano wrote:
>
> Even with a source code encoding, you will probably have problems with
> source files including \xe2 and other "bad" chars. Unless they happen to
> fall inside a quoted string literal, I would expect to get a SyntaxError.
>
This is absurd and complete nonsense. The purpose
of a coding directive is to inform the engine which
is processing a text file about the "language" it
has to speak. It can be a html, py or tex file.
If you have a problem, it's probably a mismatch between
your coding directive and the real coding of the
file. Typical case: ascii/utf-8 without signature.
jmf
Re: How do I automate the removal of all non-ascii characters from my code?
On 12 sep, 23:39, "Rhodri James" wrote:
> Now read what Steven wrote again. The issue is that the program contains
> characters that are syntactically illegal. The "engine" can be perfectly
> correctly translating a character as a smart quote or a non breaking space
> or an e-umlaut or whatever, but that doesn't make the character legal!
>
Yes, you are right. I did not understand it that way.
However, a small correction/clarification. Illegal characters do
not exist. One can "only" have an ill-formed encoded code
point or an illegal encoded code point representing a
character/glyph.
Basically, in the present case, the issue is most probably
a mismatch between the coding directive and the real
coding, with "no coding directive" == 'ascii'.
jmf
Re: How do I automate the removal of all non-ascii characters from my code?
On 13 sep, 10:15, Steven D'Aprano wrote:
The intrinsic coding of the characters is one thing.
The usage of a byte stream supposed to represent a text
is another thing.
jmf
Re: encoding problem with BeautifulSoup - problem when writing parsed text to file
On 6 oct, 06:39, Greg wrote:
> Brilliant! It worked. Thanks!
>
> Here is the final code for those who are struggling with similar
> problems:
>
> ## open and decode file
> # In this case, the encoding comes from the charset argument in a meta
> tag
> # e.g.
> fileObj = open(filePath,"r").read()
> fileContent = fileObj.decode("iso-8859-2")
> fileSoup = BeautifulSoup(fileContent)
>
> ## Do some BeautifulSoup magic and preserve unicode, presume result is
> saved in 'text' ##
>
> ## write extracted text to file
> f = open(outFilePath, 'w')
> f.write(text.encode('utf-8'))
> f.close()
>
or (Python2/Python3)
>>> import io
>>> with io.open('abc.txt', 'r', encoding='iso-8859-2') as f:
...     r = f.read()
...
>>> repr(r)
u'a\nb\nc\n'
>>> with io.open('def.txt', 'w', encoding='utf-8-sig') as f:
...     t = f.write(r)
...
>>> f.closed
True
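The same round trip written as a small script rather than an interactive session. This is only a sketch, assuming Python 3; the file names and the sample text are made up, and a temporary directory is used so nothing is left behind:

```python
import io
import os
import tempfile

# Sample text whose characters all exist in iso-8859-2 (hypothetical data).
sample = "za\u017c\u00f3\u0142\u0107"

with tempfile.TemporaryDirectory() as workdir:
    source_path = os.path.join(workdir, "abc.txt")
    target_path = os.path.join(workdir, "def.txt")

    # Create an iso-8859-2 input file to work on.
    with io.open(source_path, "wb") as f:
        f.write(sample.encode("iso-8859-2"))

    # Decode while reading: the result is text, not bytes.
    with io.open(source_path, "r", encoding="iso-8859-2") as f:
        text = f.read()

    # Encode while writing; utf-8-sig adds the utf-8 signature (BOM).
    with io.open(target_path, "w", encoding="utf-8-sig") as f:
        f.write(text)

    with io.open(target_path, "rb") as f:
        raw = f.read()

print(ascii(text))
print(raw[:3])
```

No explicit decode() or encode() call is needed on the content: the file objects do the conversion at the boundary.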
jmf
Re: Python 2 or 3
On 3 déc, 04:54, Antti J Ylikoski wrote:
> Helsinki, Finland, the EU
<<<
>>> sys.version
'2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)]'
>>> 'éléphant'
'\xe9l\xe9phant'
>>>
>>> sys.version
'3.2.2 (default, Sep 4 2011, 09:51:08) [MSC v.1500 32 bit (Intel)]'
>>> 'éléphant'
'éléphant'
>>>
jmf
Re: How to support a non-standard encoding?
On 6 jan, 11:03, Ivan wrote:
> Dear All
>
> I'm developing a python application for which I need to support a
> non-standard character encoding (specifically ISO 6937/2-1983, Addendum
> 1-1989). Here are some of the properties of the encoding and its use in
> the application:
>
> - I need to read and write data to/from files. The file format
> includes two sections in different character encodings (so I
> shan't be able to use codecs.open()).
>
> - iso-6937 sections include non-printing control characters
>
> - iso-6937 is a variable width encoding, e.g. "A" = [41],
> "Ä" = [0xC8, 0x41]; all non-spacing diacritical marks are in the
> range 0xC0-0xCF.
>
> By any chance is there anyone out there working on iso-6937?
>
> Otherwise, I think I need to write a new codec to support reading and
> writing this data. Does anyone know of any tutorials or blog posts on
> implementing a codec for a non-standard characeter encoding? Would
> anyone be interested in reading one?
>
Take a look at the files, Python modules, in ...\Lib\encodings.
This is the place where all codecs are centralized. Python
magically uses these as long as they are present in that dir.
I remember, a long time ago, for fun, I created such a codec
quite easily. I picked one of the files as a template and
I modified its "table". It was a byte <-> byte table.
For multibyte coding schemes, it may be a little bit more
complicated; you may take a look, eg, at the mbcs.py codec.
The distribution of such a codec may be a problem.
Another simple approach, os independent.
You probably do not write your code in iso-6937, but you only need
to encode/decode some byte sequences "on the fly". In that case,
work with bytes, create a couple of coding / decoding functions
with a created dict [*] as helper. It's not so complicated.
Use Py2 or Py3 (the recommended way ;-) ) as pivot encoding.
[*] I also once created such a dict from
http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WindowsBestFit/bestfit1252.txt
I never checked if it does correspond to the "official" cp1252
codec.
jmf
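The "functions plus a dict" approach could be sketched as follows. This is not a real ISO 6937 codec: the table contains only the single lead byte quoted in the question (0xC8, diaeresis), the function name is hypothetical, and everything outside plain ASCII is rejected instead of guessed:

```python
import unicodedata

# Deliberately tiny, illustrative table: lead byte -> Unicode combining mark.
# Only the 0xC8 entry comes from the question above; a real decoder would
# need the complete table from the standard.
NON_SPACING = {0xC8: "\u0308"}   # COMBINING DIAERESIS


def decode_iso6937_sketch(data):
    """Decode bytes where a 0xC0-0xCF lead byte modifies the next letter."""
    pieces = []
    index = 0
    while index < len(data):
        byte = data[index]
        if 0xC0 <= byte <= 0xCF:
            if byte not in NON_SPACING or index + 1 >= len(data):
                raise ValueError("lead byte 0x%02X not handled" % byte)
            base = chr(data[index + 1])
            # Unicode puts the combining mark after the base letter;
            # NFC then composes the pair into one code point if possible.
            pieces.append(unicodedata.normalize("NFC", base + NON_SPACING[byte]))
            index += 2
        elif byte < 0x80:
            pieces.append(chr(byte))
            index += 1
        else:
            raise ValueError("byte 0x%02X not covered by this sketch" % byte)
    return "".join(pieces)


decoded = decode_iso6937_sketch(b"\xc8Arger")
print(ascii(decoded))
```

The matching encoder would do the reverse: decompose with NFD, then emit the lead byte before the base letter.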
Re: UnicodeEncodeError in compile
1) If I copy/paste these CJK chars from Google Groups into two of my
interactive interpreters (no "dos/cmd console"), I have no problem.
>>> import unicodedata as ud
>>> ud.name('工')
'CJK UNIFIED IDEOGRAPH-5DE5'
>>> ud.name('具')
'CJK UNIFIED IDEOGRAPH-5177'
>>> hex(ord(('工')))
'0x5de5'
>>> hex(ord('具'))
'0x5177'
>>>
2) It seems the mbcs codec has some difficulties with
these chars.
>>> '\u5de5'.encode('mbcs')
Traceback (most recent call last):
File "", line 1, in
UnicodeEncodeError: 'mbcs' codec can't encode characters in position
0--1: invalid character
>>> '\u5de5'.encode('utf-8')
b'\xe5\xb7\xa5'
>>> '\u5de5'.encode('utf-32-be')
b'\x00\x00]\xe5'
3) On the usage of mbcs in file IO interaction --> core devs.
My conclusion.
The bottleneck is on the mbcs side.
jmf
Re: UnicodeEncodeError in compile
On 10 jan, 11:53, 8 Dihedral wrote:
> Terry Reedy wrote on Tuesday, 10 January 2012, 4:08:40 PM UTC+8:
>
>
> > I get the same error running 3.2.2 under IDLE but not when pasting into
> > Command Prompt. However, Command Prompt may be cheating by replacing the
> > Chinese chars with '??' upon pasting, so that Python never gets them --
> > whereas they appear just fine in IDLE.
>
> > --
Tested with *my* Windows GUI interactive interpreters.
It seems to me there is a problem with the mbcs codec.
>>> hex(ord('工'))
'0x5de5'
>>> '\u5de5'
'工'
>>> '\u5de5'.encode('mbcs')
Traceback (most recent call last):
File "", line 1, in
UnicodeEncodeError: 'mbcs' codec can't encode characters in position
0--1: invalid character
>>> '\u5de5'.encode('utf-8')
b'\xe5\xb7\xa5'
>>> '\u5de5'.encode('utf-32-be')
b'\x00\x00]\xe5'
>>> sys.version
'3.2.2 (default, Sep 4 2011, 09:51:08) [MSC v.1500 32 bit (Intel)]'
>>> '\u5de5'.encode('mbcs', 'replace')
b'?'
--
>>> u'\u5de5'.encode('mbcs', 'replace')
'?'
>>> repr(u'\u5de5'.encode('utf-8'))
"'\\xe5\\xb7\\xa5'"
>>> repr(u'\u5de5'.encode('utf-32-be'))
"'\\x00\\x00]\\xe5'"
>>> sys.version
'2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)]'
jmf
Re: UnicodeEncodeError in compile
On 10 jan, 13:28, jmfauth wrote:
Addendum, Python console ("dos box")
D:\>c:\python32\python.exe
Python 3.2.2 (default, Sep 4 2011, 09:51:08) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> '\u5de5'.encode('utf-8')
b'\xe5\xb7\xa5'
>>> '\u5de5'.encode('mbcs')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'mbcs' codec can't encode characters in position 0--1: invalid character
>>> ^Z
D:\>c:\python27\python.exe
Python 2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> u'\u5de5'.encode('utf-8')
'\xe5\xb7\xa5'
>>> u'\u5de5'.encode('mbcs')
'?'
>>> ^Z
D:\>
jmf
Re: UnicodeEncodeError in compile
On 11 jan, 01:56, Terry Reedy wrote:
> On 1/10/2012 8:43 AM, jmfauth wrote:
>
>
>
> > D:\>c:\python32\python.exe
> > Python 3.2.2 (default, Sep 4 2011, 09:51:08) [MSC v.1500 32 bit
> > (Intel)] on win
> > 32
> > Type "help", "copyright", "credits" or "license" for more information.
> >>>> '\u5de5'.encode('utf-8')
> > b'\xe5\xb7\xa5'
> >>>> '\u5de5'.encode('mbcs')
> > Traceback (most recent call last):
> > File "", line 1, in
> > UnicodeEncodeError: 'mbcs' codec can't encode characters in position
> > 0--1: inval
> > id character
> > D:\>c:\python27\python.exe
> > Python 2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit
> > (Intel)] on win
> > 32
> > Type "help", "copyright", "credits" or "license" for more information.
> >>>> u'\u5de5'.encode('utf-8')
> > '\xe5\xb7\xa5'
> >>>> u'\u5de5'.encode('mbcs')
> > '?'
>
> mbcs encodes according to the current codepage. Only the chinese
> codepage(s) can encode the chinese char. So the unicode error is correct
> and 2.7 has a bug in that it is doing "errors='replace'" when it
> supposedly is doing "errors='strict'". The Py3 fix was done
> in http://bugs.python.org/issue850997
> 2.7 was intentionally left alone because of back-compatibility
> considerations. (None of this addresses the OP's question.)
>
> --
Ok. I was not aware of this.
PS Prev. post gets lost.
Re: UnicodeEncodeError in compile
On 11 jan, 01:56, Terry Reedy wrote:
> On 1/10/2012 8:43 AM, jmfauth wrote:
>
> ...
>
> mbcs encodes according to the current codepage. Only the chinese
> codepage(s) can encode the chinese char. So the unicode error is correct
> and 2.7 has a bug in that it is doing "errors='replace'" when it
> supposedly is doing "errors='strict'". The Py3 fix was done
> in http://bugs.python.org/issue850997
> 2.7 was intentionally left alone because of back-compatibility
> considerations. (None of this addresses the OP's question.)
>
> --
win7, cp1252
Ok. I was not aware of this.
>>> '\N{CYRILLIC SMALL LETTER A}'.encode('mbcs')
Traceback (most recent call last):
File "", line 1, in
UnicodeEncodeError: 'mbcs' codec can't encode characters in position
0--1: invalid character
>>> '\N{GREEK SMALL LETTER ALPHA}'.encode('mbcs')
Traceback (most recent call last):
File "", line 1, in
UnicodeEncodeError: 'mbcs' codec can't encode characters in position
0--1: invalid character
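The mbcs codec only exists on Windows. The same behaviour can be shown on any platform with the explicit cp1252 codec, under the assumption that cp1252 is a fair stand-in for mbcs on a machine whose code page is 1252. A sketch for Python 3, with a hypothetical helper try_encode:

```python
def try_encode(character, codec="cp1252"):
    """Return the encoded bytes, or None if the codec has no such character."""
    try:
        return character.encode(codec)      # errors='strict' is the default
    except UnicodeEncodeError:
        return None


samples = [
    "\N{CYRILLIC SMALL LETTER A}",
    "\N{GREEK SMALL LETTER ALPHA}",
    "\u5de5",
]
strict_results = [try_encode(ch) for ch in samples]

# With errors='replace' the failure is hidden behind a question mark.
replaced = "\u5de5".encode("cp1252", "replace")

# A character that does belong to the code page encodes without trouble.
euro = "\N{EURO SIGN}".encode("cp1252")

print(strict_results, replaced, euro)
```

None of the three sample characters is in the cp1252 repertoire, so a strict encode has to fail; only the error handler decides whether that failure is reported or replaced.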
jmf
Re: NaN, Null, and Sorting
On 13 jan, 20:04, Ethan Furman wrote:
> With NaN, it is possible to get a list that will not properly sort:
>
> --> NaN = float('nan')
> --> spam = [1, 2, NaN, 3, NaN, 4, 5, 7, NaN]
> --> sorted(spam)
> [1, 2, nan, 3, nan, 4, 5, 7, nan]
>
> I'm constructing a Null object with the semantics that if the returned
> object is Null, it's actual value is unknown.
>
Short answer.
- NaN != NA()
- I find the current implementation (Py3.2) quite satisfying. (M.
Dickinson's work)
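For readers who want to see why the quoted list does not sort, a short sketch (Python 3). Separating the NaNs before sorting is only one possible workaround, not something proposed in the thread:

```python
import math

nan = float("nan")
spam = [3, nan, 1, nan, 2]

# Every ordering comparison with NaN is False, and NaN is not equal to
# itself, so sorted() gets no usable ordering information for it.
comparisons = (nan < 1, nan > 1, nan == nan)

# Workaround: sort the known values and count the unknown ones apart.
known = sorted(x for x in spam if not math.isnan(x))
unknown_count = len(spam) - len(known)

print(comparisons, known, unknown_count)
```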
jmf
Re: sys.argv as a list of bytes
>
> In short: if you need to write "system" scripts on Unix, and you need them
> to work reliably, you need to stick with Python 2.x.
I think understanding the coding of the characters helps a bit.
I can not figure out how the example below could not be done on
other systems.
D:\tmp>chcp
Page de codes active : 1252
D:\tmp>c:\python32\python.exe sysarg.py a b é € \u0430 \u03b1 z
arg: 1 unicode name: LATIN SMALL LETTER A
arg: 2 unicode name: LATIN SMALL LETTER B
arg: 3 unicode name: LATIN SMALL LETTER E WITH ACUTE
arg: 4 unicode name: EURO SIGN
arg: 5 unicode name: CYRILLIC SMALL LETTER A
arg: 6 unicode name: GREEK SMALL LETTER ALPHA
arg: 7 unicode name: LATIN SMALL LETTER Z
jmf
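The script sysarg.py is not included in the post. A guess at what it might contain, assuming Python 3 and one character per argument; the function name describe is hypothetical:

```python
import unicodedata


def describe(arguments):
    """Build one report line per single-character argument."""
    lines = []
    for position, argument in enumerate(arguments, 1):
        # unicodedata.name() expects exactly one character.
        lines.append("arg: %d unicode name: %s"
                     % (position, unicodedata.name(argument)))
    return lines


# In a script, the arguments would come from sys.argv[1:].
report = describe(["a", "\u00e9", "\u20ac", "\u0430", "\u03b1"])
for line in report:
    print(line)
```

In Python 3 the command-line arguments arrive as str objects, so the script can ask for the Unicode name of each character without any decoding step of its own.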
Re: Curious to see alternate approach on a search/replace via regex
On 7 fév, 04:04, Steven D'Aprano wrote:
> On Wed, 06 Feb 2013 13:55:58 -0800, Demian Brecht wrote:
> > Well, an alternative /could/ be:
>
> ...
> py> s = 'http://alongnameofasite1234567.com/q?sports=run&a=1&b=1'
> py> assert u2f(s) == mangle(s)
> py>
> py> from timeit import Timer
> py> setup = 'from __main__ import s, u2f, mangle'
> py> t1 = Timer('mangle(s)', setup)
> py> t2 = Timer('u2f(s)', setup)
> py>
> py> min(t1.repeat(repeat=7))
> 7.2962000370025635
> py> min(t2.repeat(repeat=7))
> 10.981598854064941
> py>
> py> (10.98-7.29)/10.98
> 0.33606557377049184
>
> (Timings done using Python 2.6 on my laptop -- your speeds may vary.)
>
[OT] Sorry, but I find all these "timeit" I see here and there
more and more ridiculous.
Maybe it's the language itself, which became ridiculous.
code:
from timeit import Timer, repeat
r = repeat("('WHERE IN THE WORLD IS CARMEN?'*10).lower()")
print('1:', r)
r = repeat("('WHERE IN THE WORLD IS HÉLÈNE?'*10).lower()")
print('2:', r)
t = Timer("re.sub('CARMEN', 'CARMEN', 'WHERE IN THE WORLD IS CARMEN?'*10)", "import re")
r = t.repeat()
print('3:', r)
t = Timer("re.sub('HÉLÈNE', 'HÉLÈNE', 'WHERE IN THE WORLD IS HÉLÈNE?'*10)", "import re")
r = t.repeat()
print('4:', r)
result:
>c:\python32\pythonw -u "vitesse3.py"
1: [2.578785478740226, 2.5738459157233833, 2.5739002658825543]
2: [2.57605654937141, 2.5784755252962572, 2.5775366066044896]
3: [11.856728254324088, 11.856321809655501, 11.857456073846905]
4: [12.111787643688231, 12.102743462128771, 12.098514783440208]
>Exit code: 0
>c:\Python33\pythonw -u "vitesse3.py"
1: [0.6063335264470632, 0.6104798922133946, 0.6078580877959869]
2: [4.080205081267272, 4.079303183698418, 4.0786836706522145]
3: [18.093742209318215, 18.07999618095, 18.07107661757692]
4: [18.852576768615222, 18.841418050790622, 18.840745369110437]
>Exit code: 0
The future is bright for ... ascii users.
jmf
Re: string.replace doesn't removes ":"
On 13 fév, 06:26, Rick Johnson wrote:
> On Tuesday, February 12, 2013 10:44:09 PM UTC-6, Rick Johnson wrote:
> >
> > REFERENCES:
> >
> > [1]: Should string.replace handle list, tuple and dict
> > arguments in addition to strings?
>
> > py> string.replace(('a', 'b', 'c'), 'abcdefgabc')
> > 'defg'
> > [...]
>
> And here is a fine example of how a "global function architecture" can
> seriously warp your mind! Let me try that again!
>
> Hypothetical Examples:
>
> py> 'abcdefgabc'.replace(('a', 'b', 'c'), "")
> 'defg'
> py> 'abcdefgabc'.replace(['a', 'b', 'c'], "")
> 'defg'
> py> 'abcdefgabc'.replace({'a':'A', 'b':'2', 'c':'C'})
> 'A2CdefgA2C'
>
> Or, an alternative to passing dict where both old and new arguments accept
> the sequence:
>
> py> d = {'a':'A', 'b':'2', 'c':'C'}
> py> 'abcdefgabc'.replace(d.keys(), d.values())
> 'A2CdefgA2C'
>
> Nice thing about dict is you can control both sub-string and
> replacement-string on a case-by-case basis. But there is going to be a need
> to apply a single replacement string to a sequence of substrings; like the
> null string example provided by the OP.
>
> (hopefully there's no mistakes this time)
>>> d = {ord('a'): 'A', ord('b'): '2', ord('c'): 'C'}
>>> 'abcdefgabc'.translate(d)
'A2CdefgA2C'
>>>
>>>
>>> def jmTranslate(s, table):
...     table = {ord(k):table[k] for k in table}
...     return s.translate(table)
...
>>> d = {'a': 'A', 'b': '2', 'c': 'C'}
>>> jmTranslate('abcdefgabc', d)
'A2CdefgA2C'
>>> d = {'a': None, 'b': None, 'c': None}
>>> jmTranslate('abcdefgabc', d)
'defg'
>>> d = {'a': '€', 'b': '', 'c': ''}
>>> jmTranslate('abcdefgabc', d)
'€defg€'
>>>
jmf
Re: string.replace doesn't removes ":"
On 13 fév, 21:24, 8 Dihedral wrote:
> Rick Johnson wrote on Thursday, 14 February 2013, 12:34:11 AM UTC+8:
>
> > On Wednesday, February 13, 2013 1:10:14 AM UTC-6, jmfauth wrote:
> > > >>> d = {ord('a'): 'A', ord('b'): '2', ord('c'): 'C'}
> > > >>> 'abcdefgabc'.translate(d)
> > > 'A2CdefgA2C'
> > > >>> def jmTranslate(s, table):
> > > ...     table = {ord(k):table[k] for k in table}
> > > ...     return s.translate(table)
> > > ...
> > > >>> d = {'a': 'A', 'b': '2', 'c': 'C'}
> > > >>> jmTranslate('abcdefgabc', d)
> > > 'A2CdefgA2C'
> > > >>> d = {'a': None, 'b': None, 'c': None}
> > > >>> jmTranslate('abcdefgabc', d)
> > > 'defg'
> > > >>> d = {'a': '€', 'b': '', 'c': ''}
> > > >>> jmTranslate('abcdefgabc', d)
> > > '€defg€'
>
>
> In python the variables of value types, and the variables of lists and
> dictionaries are passed to functions somewhat different.
>
> This should be noticed by any serious programmer in python.
-
The purpose of my quick and dirty function was to
show it's possible to create a text replacement
function which uses exclusively text / strings
via a dict. (Even if, in my example, I'm using
- and can use - None as an empty string!)
You are right.
It is also arguable that being forced to
use a number in order to replace a character
may not be such a good idea.
This should be noticed by any serious language designer.
More seriously.
.translate() is a very nice and underestimated method.
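As a side note to the point about numbers as keys: in Python 3 the built-in str.maketrans() accepts a dict with one-character string keys and produces the ordinal-keyed table that translate() wants. A short sketch reusing the values from the session above:

```python
# Keys are one-character strings; maketrans converts them to ordinals.
replace_table = str.maketrans({"a": "A", "b": "2", "c": "C"})
replaced = "abcdefgabc".translate(replace_table)

# The three-argument form lists characters to delete in its last argument.
delete_table = str.maketrans("", "", "abc")
deleted = "abcdefgabc".translate(delete_table)

print(replaced, deleted)
print(sorted(replace_table))
```

The table handed to translate() is still keyed by integers; maketrans() only spares the caller the ord() calls.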
jmf
Re: Python Newbie
On 23 fév, 16:43, Steve Simmons wrote:
> On 22/02/2013 22:37, [email protected] wrote:
> > So far I am getting the impression ...
>
> My main message to you would be : don't approach Python with a negative
> attitude, give it a chance and I'm sure you'll come to enjoy it.
>
Until you realize this:
Py32:
>>> timeit.timeit("'abc需'")
0.032749386495456466
>>> sys.getsizeof('abc需')
42
Py33:
>>> timeit.timeit("'abc需'")
0.04104208536801017
>>> sys.getsizeof('abc需')
50
Very easy to explain: wrong, incorrect, naive unicode handling.
jmf
Re: Python Newbie
On 23 fév, 20:08, Ethan Furman wrote:
> On 02/23/2013 10:44 AM, jmfauth wrote:
> > [snip various stupidities]
>
> > jmf
>
> Peter, jmfauth is one of our resident trolls. Feel free to ignore him.
>
> --
> ~Ethan~
Sorry, what can I say?
More memory and a slowdown!
If you see progress, I'm seeing a regression.
Did you test Devanagari canonical decomposition?
Probably not. I did it.
I have probably written more tests than any core developer,
and tests doing precisely what this flexible representation
does (unlike the tests I saw).
That's the good point of this whole story. It is not
every day that one has two implementations of the same
product, if one wishes to explain, to teach, to illustrate
unicode or the coding of the characters in general.
Unicode is not different from the other coding schemes
and it behaves exactly in the same way. The sole and
basic difference lies in the set of the *characters*,
which is broader.
Unicode, the Consortium, uses the term
"Abstract Character Repertoire".
jmf
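For anyone who wants to look at the memory side of the flexible representation on their own machine, a small sketch. It assumes CPython 3.3 or later; the absolute numbers depend on the version and platform, so only the relative order is meaningful:

```python
import sys

texts = {
    "ascii": "a" * 1000,
    "bmp": "\u20ac" * 1000,              # EURO SIGN, 2 bytes per code point
    "astral": "\U0001F600" * 1000,       # outside the BMP, 4 bytes each
    "mixed": "a" * 999 + "\u20ac",       # one wider code point in ASCII text
}

# sys.getsizeof reports the size of the string object in bytes.
sizes = {label: sys.getsizeof(value) for label, value in texts.items()}

for label in ("ascii", "mixed", "bmp", "astral"):
    print(label, sizes[label])
```

The "mixed" line is the case discussed in this thread: a single non-ASCII character changes the storage width of the whole string.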
Re: Correct handling of case in unicode and regexps
On 23 fév, 15:26, Devin Jeanpierre wrote:
> Hi folks,
>
> I'm pretty unsure of myself when it comes to unicode. As I understand
> it, you're generally supposed to compare things in a case insensitive
> manner by case folding, right? So instead of a.lower() == b.lower()
> (the ASCII way), you do a.casefold() == b.casefold()
>
> However, I'm struggling to figure out how regular expressions should
> treat case. Python's re module doesn't "work properly" to my
> understanding, because:
>
> >>> a = 'ss'
> >>> b = 'ß'
> >>> a.casefold() == b.casefold()
> True
> >>> re.match(re.escape(a), b, re.UNICODE | re.IGNORECASE)
> >>> # oh dear!
>
> In addition, it seems improbable that this ever _could_ work. Because
> if it did work like that, then what would the value be of
> re.match('s', 'ß', re.UNICODE | re.IGNORECASE).end() ? 0.5?
>
> I'd really like to hear the thoughts of people more experienced with
> unicode. What is the ideal correct behavior here? Or do I
> misunderstand things?
-
I'm just wondering if there is a real issue here. After all,
this is only a question of conventions. Unicode has some
conventions, re modules may (have to) use some conventions too.
It seems to me the safest way is to preprocess the text
which has to be examined.
Proposed case study:
How should ss/ß/SS/ẞ be interpreted?
'Richard-Strauss-Straße'
'Richard-Strauss-Strasse'
'RICHARD-STRAUSS-STRASSE'
'RICHARD-STRAUSS-STRAẞE'
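One way to read "preprocess the text" for this case study is to case-fold both the pattern and the text before matching. A sketch assuming Python 3.3 or later, where str.casefold() is available; it illustrates the preprocessing idea and is not a fix for the re module:

```python
import re

variants = [
    "Richard-Strauss-Straße",
    "Richard-Strauss-Strasse",
    "RICHARD-STRAUSS-STRASSE",
    "RICHARD-STRAUSS-STRAẞE",
]

# Full case folding maps ß and ẞ to "ss", so all four spellings meet.
folded = {text.casefold() for text in variants}

# Matching is then done on folded text, with a folded pattern.
pattern = re.compile(re.escape("STRASSE".casefold()))
all_match = all(pattern.search(text.casefold()) for text in variants)

print(sorted(folded), all_match)
```

Match positions then refer to the folded text, not to the original, which is the price of this approach.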
There is more or less the same situation with sorting.
Unicode can not do everything and it may be mandatory to
preprocess the "input".
Eg. this function, which I once wrote for fun. It sorts French
words (without unicodedata and locale).
>>> import libfrancais
>>> z = ['oeuf', 'œuf', 'od', 'of']
>>> zo = libfrancais.sortedfr(z)
>>> zo
['od', 'oeuf', 'œuf', 'of']
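libfrancais is a private module and its code is not shown, so the following is not its implementation. It is only a guess at the general idea: build a sort key by preprocessing each word (lowercase it, expand ligatures, drop diacritics) without unicodedata or locale. The names sort_key_fr and sortedfr_sketch are made up, the accent table assumes precomposed (NFC) input, and real French collation has further rules that this ignores.

```python
# Hypothetical stand-in, NOT the libfrancais.sortedfr from the post.
_EXPANSIONS = {'œ': 'oe', 'æ': 'ae'}
_ACCENTS = str.maketrans('àâäçéèêëîïôöùûüÿ', 'aaaceeeeiioouuuy')


def sort_key_fr(word):
    """Return a simplified comparison key for a French word."""
    lowered = word.lower()
    for ligature, expansion in _EXPANSIONS.items():
        lowered = lowered.replace(ligature, expansion)
    return lowered.translate(_ACCENTS)


def sortedfr_sketch(words):
    """Sort words by the simplified key; ties keep their input order."""
    return sorted(words, key=sort_key_fr)


print(sortedfr_sketch(['oeuf', 'œuf', 'od', 'of']))
```

Because sorted() is stable, 'oeuf' and 'œuf' (which get the same key) stay in their original relative order.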
jmf
Re: Nuitka now supports Python 3.2
Fascinating software.
Some are building, some are destroying.
Py33
>>> timeit.repeat("{1:'abc需'}")
[0.2573893570572636, 0.24261832285651508, 0.24259548003601594]
Py323
timeit.repeat("{1:'abc需'}")
[0.11000708521282831, 0.0994753634273593, 0.09901023634051853]
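For anyone who wants to repeat this measurement on their own machine, a minimal sketch follows. It runs the same statement under whichever interpreter executes it and reports the best of three runs; the repeat and number values are arbitrary choices, not those used for the figures above.

```python
import sys
import timeit

# Same statement as in the post: build a one-item dict with a non-ASCII string.
stmt = "{1: 'abc需'}"

# Three runs of 100000 executions each; the minimum is the least noisy figure.
timings = timeit.repeat(stmt, repeat=3, number=100000)
print(sys.version_info[:3], min(timings))
```

Run it once per Python version and compare the printed minimums; absolute values depend heavily on the machine.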
jmf
Re: Nuitka now supports Python 3.2
On 27 Feb, 09:21, jmfauth wrote:
>
>
> Fascinating software.
> Some are building, some are destroying.
>
> Py33
> >>> timeit.repeat("{1:'abc需'}")
>
> [0.2573893570572636, 0.24261832285651508, 0.24259548003601594]
>
> Py323
> timeit.repeat("{1:'abc需'}")
> [0.11000708521282831, 0.0994753634273593, 0.09901023634051853]
>
> jmf
Oops. My bad. (This google).
You should read abc需
jmf
Re: Python Speed
On 27 Feb, 23:24, Terry Reedy wrote:
> On 2/27/2013 3:21 AM, jmfauth hijacked yet another thread:
> > Some are building, some are destroying.
>
> We are still waiting for you to help build a better 3.3+, instead of
> trying to 'destroy' it with mostly irrelevant cherry-picked benchmarks.
>
> > Py33
> >>>> timeit.repeat("{1:'abc需'}")
> > [0.2573893570572636, 0.24261832285651508, 0.24259548003601594]
>
> On my win system, I get a lower time for this:
> [0.16579443757208878, 0.1475787649924598, 0.14970205670637426]
>
> > Py323
> > timeit.repeat("{1:'abc需'}")
> > [0.11000708521282831, 0.0994753634273593, 0.09901023634051853]
>
> While I get the same time for 3.2.3.
> [0.11759353304428544, 0.0948244802968, 0.09532802044164157]
>
> It seems that something about Jim's machine does not like 3.3.
> *nix will probably see even less of a difference. Times are in
> microseconds, so few programs will ever notice the difference.
>
> In the meanwhile ... Effort was put into reducing startup time for 3.3
> by making sure that every module imported during startup actually needed
> to be imported, and into speeding up imports.
>
> The startup process is getting a deeper inspection for
> 3.4 http://python.org/dev/peps/pep-0432/
> 'Simplifying the CPython startup sequence'
> with some expectation for further speedup.
>
> Also, a real-world benchmark project has been
> established. http://speed.python.org/
> Some work has already been done to port benchmarks to 3.x, but I suspect
> there is more to do and more volunteers needed.
>
> --
> Terry Jan Reedy
-
Terry,
As long as you attempt to work with a "composite" scheme
that does not work with a unique set of characters, not only will
it not work (properly/efficiently), it cannot work.
This is not even a Unicode problem. It is true for every coding
scheme. That's why we have all these coding schemes today, where
"coding scheme" == "set of characters" != "set of encoded characters".
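For readers unfamiliar with the representation being argued about: since CPython 3.3 (PEP 393), a str uses 1, 2 or 4 bytes per character, chosen from the widest code point the string contains. The sketch below only makes that observable and takes no side in the argument; it assumes CPython 3.3 or later, and the helper name bytes_per_char is made up for illustration.

```python
import sys


def bytes_per_char(ch, n=100):
    """Per-character storage cost of a string made only of `ch`.

    Measured as the size difference between strings of length n+1 and n,
    so the fixed object header cancels out.
    """
    return sys.getsizeof(ch * (n + 1)) - sys.getsizeof(ch * n)


# ASCII, Latin-1, BMP and non-BMP samples.
for ch in ('a', 'é', '€', '\U0001F600'):
    print(repr(ch), bytes_per_char(ch))
```

On CPython 3.3+ the four samples should cost 1, 1, 2 and 4 bytes per character respectively; other implementations may store strings differently.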
jmf
Re: Controlling number of zeros of exponent in scientific notation
On 6 Mar, 15:03, Roy Smith wrote:
> In article ,
>
> [email protected] wrote:
> > Instead of:
>
> > 1.8e-04
>
> > I need:
>
> > 1.8e-004
>
> > So two zeros before the 4, instead of the default 1.
>
> Just out of curiosity, what's the use case here?
--
>>> from vecmat6 import *
>>> from svdecomp6 import *
>>> from vmio6 import *
>>> mm = NewMat(3, 2)
>>> mm[0][0] = 1.0; mm[0][1] = 2.0e-178
>>> mm[1][0] = 3.0; mm[1][1] = 4.0e-1428
>>> mm[2][0] = 5.0; mm[2][1] = 6.0
>>> pr(mm, 'mm =')
mm =
(  1.0e+000  2.0e-178 )
(  3.0e+000  0.0e+000 )
(  5.0e+000  6.0e+000 )
>>> aa, vv, bbt = SVDecompFull(mm)
>>> pr(aa, 'aa =')
aa =
(  3.04128e-001  -8.66366e-002 )
(  9.12385e-001  -2.59910e-001 )
( -2.73969e-001  -9.61739e-001 )
>>> pr(bbt, 'bbt =')
bbt =
(  7.12974e-001  -7.01190e-001 )
( -7.01190e-001  -7.12974e-001 )
>>> rr = MatMulMatMulMat(aa, vv, bbt)
>>> pr(rr, 'rr =')
rr =
(  1.0e+000  -1.38778e-015 )
(  3.0e+000  -4.44089e-016 )
(  5.0e+000   6.0e+000 )
>>>
jmf
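The modules used in that session (vecmat6, vmio6 and so on) are private, so how they print three-digit exponents is not known. As a sketch of one way to get the requested "1.8e-004" form with the standard library only: format normally, split at the "e", and re-pad the exponent. The function name format_sci and its parameters are made up for illustration, and it assumes a finite float.

```python
def format_sci(value, precision=1, exp_digits=3):
    """Format a finite float in scientific notation with a padded exponent.

    Python's own 'e' format always emits at least two exponent digits;
    this re-pads the exponent to `exp_digits` digits (sign not counted).
    inf and nan are not handled.
    """
    mantissa, exponent = '{:.{p}e}'.format(value, p=precision).split('e')
    width = exp_digits + 1  # one extra column for the sign
    return '{}e{:+0{w}d}'.format(mantissa, int(exponent), w=width)


print(format_sci(1.8e-4))
print(format_sci(1.0))
print(format_sci(2.0e-178))
```

With a fixed exponent width, columns of numbers such as the matrices above line up regardless of magnitude, which may well be the use case.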
Re: Regular expression problem
On 11 Mar, 03:06, Terry Reedy wrote:
>
> ...
> By teaching 'speed before correctness", this site promotes bad
> programming habits and thinking (and the use of low-level but faster
> languages).
> ...
This is exactly what "your" flexible string representation does!
And technical aspects aside, you even managed to somehow
lose Unicode compliance.
jmf
A reply for rusi (FSR)
As a reply to rusi's comment:
http://groups.google.com/group/comp.lang.python/browse_thread/thread/a7689b158fdca29e#
From string creation to the itertools usage. A medley. Some timings.
Important:
The real/absolute values of these experiments are not important.
I do not care and I'm not complaining at all. These values are
expected, I expected such values and they are only confirming
(*FOR ME*) my understanding of the coding of the characters
(and Unicode).
#~                         py323               py330
#~ test  1:    0.015357737412819   0.019290216142579
#~ test  2:    0.015698801667198   0.020386269052436
#~ test  3:    0.015613338684288   0.018769561472500
#~ test  4:    0.023235297708529   0.032253414679390
#~ test  5:    0.023327062109534   0.029621391108935
#~ test  6:    1.119958127076760   1.095467665651482
#~ test  7:    0.420158472788311   0.565518010043673
#~ test  8:    0.649444234615974   1.061556978013171
#~ test  9:    0.712335144072079   1.211614222458175
#~ test 10:    0.704622996001357   1.160909074081441
#~ test 11:    0.614674584923621   1.053985430333688
#~ test 12:    0.660336235792764   1.059443246081010
#~ test 13:    4.821435927771570   5.795325214218677
#~ test 14:    0.494012668213403   0.729330462512273
#~ test 15:    0.504894429585788   0.879966255906103
#~ test 16:    0.693093370081103   1.132884304782264
#~ test 17:    0.749076743789461   3.013804437852462
#~ test 18:    7.467055989281286  13.387841650089342
#~ test 19:    7.581776062566778  13.593412812594643
#~ test 20:    9.477877493343140  15.235388291413805
#~ test 21:    0.022614608026196   0.020984116094176
#~ test 22:    6.685022041178975  12.687538276191944
#~ test 23:    6.946794763994170  12.986701250949636
#~ test 24:    0.097796827314760   0.156285014715777
#~ test 25:    0.024915807146677   0.034190706904894
#~ test 26:    0.024996544066013   0.032191582014335
#~ test 27:    0.000693943667684   0.001315421027272
#~ test 28:    0.000679765476967   0.001305968900141
#~ test 29:    0.001614344548152   0.025543979763000
#~ test 30:    0.000204008410812   0.000286714523313
#~ test 31:    0.000213460537964   0.000301286552656
#~ test 32:    0.000204008410819   0.000291440586878
#~ test 33:    0.249692904327539   0.497374474766957
#~ test 34:    0.248750448483740   0.513947598194790
#~ test 35:    0.099810130396032   0.249129715085319
jmf
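Since the post says the absolute values do not matter, the figures are easier to read as py330/py323 ratios. A small sketch for a handful of rows, with the values copied from the table above; the code behind the individual tests was not posted, so the ratios say nothing about what each test does or why it differs.

```python
# (py323, py330) timings copied from the table above for four sample rows.
timings = {
    6: (1.119958127076760, 1.095467665651482),
    17: (0.749076743789461, 3.013804437852462),
    21: (0.022614608026196, 0.020984116094176),
    29: (0.001614344548152, 0.025543979763000),
}


def slowdown(py323, py330):
    """Ratio of the 3.3 timing to the 3.2 timing (> 1 means 3.3 is slower)."""
    return py330 / py323


for test, (old, new) in sorted(timings.items()):
    print('test {:2d}: x{:.2f}'.format(test, slowdown(old, new)))
```

The same loop can be extended to all 35 rows by adding the remaining pairs to the dictionary.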
Re: String performance regression from python 3.2 to 3.3
--
UTF-32 is already here. You are all most probably [*] using it
without noticing. How? By using OpenType fonts, not counting
the text processing applications that use them. Why? Because there
is no other way to do it.
[*] Depending on the font, the internal table(s), e.g. the "cmap"
table, are in UTF-16 or UTF-32.
jmf
Re: Help. HOW TO guide for PyQt installation
On 20 Mar, 01:12, "D. Xenakis" wrote:
> Hi there,
> I'm searching for an installation guide for the PyQt toolkit.
> To be honest, I'm very confused about what steps I should follow for a
> complete and clean installation. Should I install the 32-bit or the
> 64-bit Windows version? Or maybe both? Any chance one of them is
> more/less bug-crashy than the other? I know both are available on the
> website, but just asking. If I install this package on Windows 8,
> should I expect any problems? From what I read, PyQt supports only XP
> and Win7.
> I was thinking about installing the newer version of PyQt along with
> Qt5. I have zero experience with PyQt, so either way everything is
> going to be new to me, and I don't care that much about the
> learning-curve difference between the new and old PyQt/Qt versions. I
> did not find any installer, so I guess I have to do everything by
> hand. Any guide for this, please?
>
> I'd also like to ask: can a commercial licence of PyQt only be bought
> on Riverbank's website? I think I noticed another, cheaper reseller
> somewhere, or maybe I didn't know what the hell I was reading :).
> Maybe it was something about Qt and not PyQt.
>
> Please help this noob,
> Regards
Short answer without explanation. It does not work.
jmf
Re: Help. HOW TO guide for PyQt installation
On 20 Mar, 10:30, Phil Thompson wrote:
> On Wed, 20 Mar 2013 02:09:06 -0700 (PDT), jmfauth wrote:
> > On 20 Mar, 01:12, "D. Xenakis" wrote:
> >> Hi there,
> >> I'm searching for an installation guide for the PyQt toolkit.
> >> To be honest, I'm very confused about what steps I should follow for a
> >> complete and clean installation. Should I install the 32-bit or the
> >> 64-bit Windows version? Or maybe both? Any chance one of them is
> >> more/less bug-crashy than the other? I know both are available on the
> >> website, but just asking. If I install this package on Windows 8,
> >> should I expect any problems? From what I read, PyQt supports only XP
> >> and Win7.
> >> I was thinking about installing the newer version of PyQt along with
> >> Qt5. I have zero experience with PyQt, so either way everything is
> >> going to be new to me, and I don't care that much about the
> >> learning-curve difference between the new and old PyQt/Qt versions. I
> >> did not find any installer, so I guess I have to do everything by
> >> hand. Any guide for this, please?
> >>
> >> I'd also like to ask: can a commercial licence of PyQt only be bought
> >> on Riverbank's website? I think I noticed another, cheaper reseller
> >> somewhere, or maybe I didn't know what the hell I was reading :).
> >> Maybe it was something about Qt and not PyQt.
> >>
> >> Please help this noob,
> >> Regards
> >
> > Short answer without explanation. It does not work.
> >
> > jmf
>
> Well it works for me. Care to elaborate?
>
> Phil
No problem.
Yesterday, I downloaded "PyQt4-4.10-gpl-Py3.3-Qt5.0.1-x32-2.exe"
and installed it on my Windows 7 Pro box after having removed
a previous version.
No problem with the installation.
I quickly tested it with one of my interactive Python interpreters
and got an error on "from PyQt4 import QtGui, QtCore" saying that the
DLL cannot be found.
Something similar to what Detlev Offenbach reported on
the PyQt mailing list, although I'm not using Qsci.
Strangely, I had no problem (if I recall correctly) with a
very basic application (QMainWindow + QLineEdit).
I had no problem with the demo (I only launched it).
I did not spend too much time investigating further.
It's the first time I have seen such an error; usually, no problem.
jmf
Re: "monty" < "python"
Courageous people can try to do something with the Unicode
collation algorithm (see unicode.org).
Some time ago, for fun, I wrote something (not perfect) with a
reduced keys table (see unicode.org), holding only a subset of the
keys for some scripts in memory.
It works with Py32 and Py33.
In an attempt just to see the performance and how it "reacts",
I made a horrible mistake: I forgot that Py33 is now optimized for
ASCII users and is no longer Unicode compliant, and I stupidly
tested/sorted lists of French words...
jmf
Re: Help. HOW TO guide for PyQt installation
On 20 Mar, 11:38, Phil Thompson wrote:
> On Wed, 20 Mar 2013 03:29:35 -0700 (PDT), jmfauth wrote:
> > On 20 Mar, 10:30, Phil Thompson wrote:
> >> On Wed, 20 Mar 2013 02:09:06 -0700 (PDT), jmfauth wrote:
> >> > On 20 Mar, 01:12, "D. Xenakis" wrote:
> >> >> Hi there,
> >> >> I'm searching for an installation guide for the PyQt toolkit.
> >> >> To be honest, I'm very confused about what steps I should follow for a
> >> >> complete and clean installation. Should I install the 32-bit or the
> >> >> 64-bit Windows version? Or maybe both? Any chance one of them is
> >> >> more/less bug-crashy than the other? I know both are available on the
> >> >> website, but just asking. If I install this package on Windows 8,
> >> >> should I expect any problems? From what I read, PyQt supports only XP
> >> >> and Win7.
> >> >> I was thinking about installing the newer version of PyQt along with
> >> >> Qt5. I have zero experience with PyQt, so either way everything is
> >> >> going to be new to me, and I don't care that much about the
> >> >> learning-curve difference between the new and old PyQt/Qt versions. I
> >> >> did not find any installer, so I guess I have to do everything by
> >> >> hand. Any guide for this, please?
> >> >>
> >> >> I'd also like to ask: can a commercial licence of PyQt only be bought
> >> >> on Riverbank's website? I think I noticed another, cheaper reseller
> >> >> somewhere, or maybe I didn't know what the hell I was reading :).
> >> >> Maybe it was something about Qt and not PyQt.
> >> >>
> >> >> Please help this noob,
> >> >> Regards
> >> >
> >> > Short answer without explanation. It does not work.
> >> >
> >> > jmf
> >>
> >> Well it works for me. Care to elaborate?
> >>
> >> Phil
> >
> > No problem.
> >
> > Yesterday, I downloaded "PyQt4-4.10-gpl-Py3.3-Qt5.0.1-x32-2.exe"
> > and installed it on my Windows 7 Pro box after having removed
> > a previous version.
> >
> > No problem with the installation.
> >
> > I quickly tested it with one of my interactive Python interpreters
> > and got an error on "from PyQt4 import QtGui, QtCore" saying that the
> > DLL cannot be found.
> >
> > Something similar to what Detlev Offenbach reported on
> > the PyQt mailing list, although I'm not using Qsci.
> >
> > Strangely, I had no problem (if I recall correctly) with a
> > very basic application (QMainWindow + QLineEdit).
> >
> > I had no problem with the demo (I only launched it).
> >
> > I did not spend too much time investigating further.
> >
> > It's the first time I have seen such an error; usually, no problem.
>
> The only time that I've seen a problem like that is when running from a
> shell that was started before running the PyQt installer (i.e. one with an
> out-of-date PATH).
>
> Phil
--
The PATH could be the cause. I stupidly forgot to check it
before removing PyQt...
I repeated the experiment (app == eta26.py), with and without
"PyQt" in the system PATH. (Btw, why is it necessary?)

D:\jm\jmpy\eta\eta26>c:\python32\python eta26.py
PyQt: 4.8.6, Qt: 4.7.4
Python 3.2.3
No problem.

D:\jm\jmpy\eta\eta26>c:\python33\python eta26.py
Traceback (most recent call last):
  File "eta26.py", line 32, in <module>
    from PyQt4 import QtGui, QtCore
ImportError: DLL load failed: Le module spécifié est introuvable.
(Translation: The specified module cannot be found.)

D:\jm\jmpy\eta\eta26>c:\python33\python eta26.py
PyQt: 4.10, Qt: 4.8.4
Python 3.3.0
No problem.

No idea. It is mysterious to me. eta26 only imports
QtGui and QtCore. It does, however, use a sophisticated widget
such as QPlainTextEdit.
jmf
