[ python-Bugs-1285809 ] re special sequence '\w'

2005-09-09 Thread SourceForge.net
Bugs item #1285809, was opened at 2005-09-09 09:40
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1285809&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: ChristianJ (cybb20)
Assigned to: Nobody/Anonymous (nobody)
Summary: re special sequence '\w' 

Initial Comment:
>>> import re
>>> rexp = re.compile('\w', re.LOCALE)
>>> rexp.findall('_')
['_']
>>> '_'.isalnum()
False

While the Python docs say that \w matches the underscore, 
I have to ask why this is so. 
The problem is that I want to match a sequence of 
alphanumeric characters, excluding the underscore.
If \w were defined to no longer match "_", people 
could still easily match the "_" as well with \w|_ .

My locale is "de_DE" but it does affect other locales as 
well.
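
For reference, the request above can usually be met without redefining \w by using a negated character class. The following is a minimal sketch for current Python 3 (where str patterns are Unicode-aware by default and re.LOCALE is not needed); the name alnum_run is only illustrative:

```python
import re

# [^\W_] reads as "not a non-word character and not an underscore",
# i.e. whatever \w matches, minus "_".
alnum_run = re.compile(r"[^\W_]+")

print(alnum_run.findall("foo_bar42"))   # ['foo', 'bar42']
print(alnum_run.findall("_"))           # []
```

The double negation keeps the class tied to whatever \w means for the pattern's flags, so it does not have to hard-code a list of letters.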


--

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1285809&group_id=5470
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[ python-Bugs-1283895 ] os.path.abspath() / os.chdir() buggy with unicode paths

2005-09-09 Thread SourceForge.net
Bugs item #1283895, was opened at 2005-09-07 22:30
Message generated for change (Comment added) made by nyamatongwe
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1283895&group_id=5470

Category: Windows
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: Antoine Pitrou (pitrou)
Assigned to: Nobody/Anonymous (nobody)
Summary: os.path.abspath() / os.chdir() buggy with unicode paths

Initial Comment:
Hi,

Under Windows Explorer, one can create directory names
using characters not belonging to the user locale. For
example, one of our users created a directory named
"C:\Mes Documents\コピー ~ solipsis_svn". 

Unfortunately, when trying to manipulate such a
pathname, os.path.abspath() and os.chdir() don't work
hand in hand. os.path.abspath() uses the garbled
directory name as displayed by the command prompt and
then os.chdir() refuses the path:

C:\>cd "C:\Mes Documents\??? ~ solipsis_svn"

C:\Mes Documents\??? ~ solipsis_svn>python
Python 2.4.1 (#65, Mar 30 2005, 09:13:57) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>>
>>> import os
>>> os.curdir
'.'
>>> os.path.abspath(os.curdir)
'C:\Mes Documents\??? ~ solipsis_svn'
>>> os.chdir(os.path.abspath(os.curdir))
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OSError: [Errno 22] Invalid argument: 'C:\Mes Documents\??? ~ solipsis_svn'
>>>


--

Comment By: Neil Hodgson (nyamatongwe)
Date: 2005-09-09 23:08

Message:
Logged In: YES 
user_id=12579

This code passes byte string arguments, which selects byte
string processing rather than the unicode calls with unicode
processing. Windows code that may encounter file paths
outside the default locale should stick to unicode for
paths. Try converting os.curdir to unicode before calling
other functions:
os.path.abspath(unicode(os.curdir))
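
To see why the byte string route loses the directory name, it may help to mimic the conversion. This is only an analogy written for Python 3: the real conversion is done by Windows, not by Python's codecs, and cp1252 is an assumed stand-in for the reporter's code page.

```python
# Illustrative only: cp1252 is an assumption about the user's ANSI code page.
name = "\u30b3\u30d4\u30fc ~ solipsis_svn"   # the directory name from the report

# A byte string API has to squeeze the name into the code page; characters it
# cannot represent are replaced, which is where the '???' comes from.
as_bytes = name.encode("cp1252", errors="replace")
print(as_bytes)                  # b'??? ~ solipsis_svn'

# Decoding gives back a different name than the one on disk, so a later
# chdir() on the round-tripped value cannot find the directory.
round_tripped = as_bytes.decode("cp1252")
print(round_tripped == name)     # False
```

Keeping the path as unicode from the start avoids the lossy step entirely.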

--

Comment By: Antoine Pitrou (pitrou)
Date: 2005-09-07 22:36

Message:
Logged In: YES 
user_id=133955

> "C:\Mes Documents\コピー ~ solipsis_svn"

Gasp. Sourceforge escapes HTML entities instead of showing
the real characters... These are Japanese characters, btw.
It's easy to copy/paste some Japanese characters from a Web
site and paste them into Windows Explorer to create a
directory (at least it works with Mozilla Firefox).





[ python-Bugs-1281556 ] exception when unpickling array.array objects

2005-09-09 Thread SourceForge.net
Bugs item #1281556, was opened at 2005-09-04 05:19
Message generated for change (Comment added) made by rhettinger
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1281556&group_id=5470

Category: Python Library
Group: Python 2.4
>Status: Closed
>Resolution: Duplicate
Priority: 5
Submitted By: John Machin (sjmachin)
Assigned to: Nobody/Anonymous (nobody)
Summary: exception when unpickling array.array objects

Initial Comment:
Note 1: same error for pickle and cPickle
Note 2: pickle.dumps and cPickle.dumps produce 
different results [see below] -- is this expected?

Python 2.4.1 (#65, Mar 30 2005, 09:13:57) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import pickle, cPickle, array
>>> ia = array.array('i',[3,2,1])
>>> ia
array('i', [3, 2, 1])
>>> pia = pickle.dumps(ia, -1)
>>> pia
'\x80\x02carray\narray\nq\x00)\x81q\x01.'
>>> cia = cPickle.dumps(ia, -1)
>>> pia == cia
False
>>> cia
'\x80\x02carray\narray\nq\x01)\x81q\x02.'
>>> pickle.loads(pia)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "C:\Python24\lib\pickle.py", line 1394, in loads
    return Unpickler(file).load()
  File "C:\Python24\lib\pickle.py", line 872, in load
    dispatch[key](self)
  File "C:\Python24\lib\pickle.py", line 1097, in load_newobj
    obj = cls.__new__(cls, *args)
TypeError: array() takes at least 1 argument (0 given)
>>> pickle.loads(cia)
[same as above]
>>> cPickle.loads(pia)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: array() takes at least 1 argument (0 given)
>>> cPickle.loads(cia)
[same as above]
>>>

--

>Comment By: Raymond Hettinger (rhettinger)
Date: 2005-09-09 14:54

Message:
Logged In: YES 
user_id=80475

Duplicate of 1281383.



--

Comment By: John Machin (sjmachin)
Date: 2005-09-04 05:46

Message:
Logged In: YES 
user_id=480138

Refer to bug report 1281383.

Please fix the bug in Python 2.4: if array objects are not
pickleable in 2.4, then pickle and cPickle should raise a
PickleError [like they used to in earlier versions] --
instead of guessing wrongly and misleading callers into
thinking that the objects can be pickled.
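
Until that is fixed, callers who want to guard against this failure mode could run a round-trip check before relying on a pickle. A minimal sketch, written for current Python 3 (where array objects do pickle); survives_pickle is a hypothetical helper name, not part of the pickle module:

```python
import array
import pickle

def survives_pickle(obj, protocol=pickle.HIGHEST_PROTOCOL):
    """Return True if obj can be dumped and loaded back to an equal object."""
    try:
        return pickle.loads(pickle.dumps(obj, protocol)) == obj
    except (pickle.PickleError, TypeError):
        # TypeError is what loads() raised in the session reported above.
        return False

ia = array.array("i", [3, 2, 1])
print(survives_pickle(ia))
```

The check costs a full dump and load, so it fits a test suite better than a hot path.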




[ python-Feature Requests-1285086 ] urllib.quote is too slow

2005-09-09 Thread SourceForge.net
Feature Requests item #1285086, was opened at 2005-09-08 11:37
Message generated for change (Comment added) made by rhettinger
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=355470&aid=1285086&group_id=5470

>Category: Python Library
>Group: None
Status: Open
Resolution: None
>Priority: 2
Submitted By: Tres Seaver (tseaver)
Assigned to: Nobody/Anonymous (nobody)
Summary: urllib.quote is too slow

Initial Comment:
'urllib.quote' delegates to '_fast_quote' for the common
case that the user has passed no 'safe' argument.  However,
'_fast_quote' isn't really very fast, especially for
the case that it doesn't need to quote anything.

Zope (and presumably other web frameworks) can end up
calling 'quote' dozens, hundreds, even thousands of times
to render a page, which makes this a potentially big win
for them.

I will attach a speed test script which demonstrates the
speed penalty, along with a patch which implements the
speedup.

--

>Comment By: Raymond Hettinger (rhettinger)
Date: 2005-09-09 22:45

Message:
Logged In: YES 
user_id=80475

Checked in a speed-up for Py2.5.
See Lib/urllib.py 1.169.

The check-in provides fast-quoting for all cases (not just
for the default safe argument).  Even the fast path is
quicker.  With translation for both safe and unsafe
characters, it saves len(s) trips through the eval loop,
computes the non-safe replacements just once, and eliminates
the if-logic.  The new table is collision free and has no
failed lookups, so each lookup requires exactly one probe. 
On my machine, timings improved by a factor of two to three
depending on the length of input and number of escaped
characters.
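
The table-driven approach described in this paragraph can be sketched roughly as follows. This is an illustration in Python 3, not the checked-in code; make_quoter and quote_sketch are hypothetical names, and the sketch only handles characters with code points below 256.

```python
import string

# Characters that are never escaped (an assumption modelled on the 2.x module).
ALWAYS_SAFE = string.ascii_letters + string.digits + "_.-"

def make_quoter(safe="/"):
    """Build a quote function backed by a complete 256-entry lookup table."""
    safe_chars = set(ALWAYS_SAFE + safe)
    table = {}
    for i in range(256):
        c = chr(i)
        # Safe characters map to themselves; the rest get their %XX escape,
        # computed once here instead of on every call.
        table[c] = c if c in safe_chars else "%%%02X" % i

    def quote_sketch(s):
        # One dict lookup per character, no per-character if-logic.
        return "".join(map(table.__getitem__, s))

    return quote_sketch

quote_default = make_quoter()
print(quote_default("/a b/c~d"))   # /a%20b/c%7Ed
```

Because the table covers every key it will be asked for, a lookup never misses, which is what removes the branch from the inner loop.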

The check-in also simplifies and speeds up quote_plus() by
using str.replace() instead of a split.
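
The str.replace() idea can be sketched as below. This is a hedged illustration using Python 3's urllib.parse.quote (the successor of urllib.quote); quote_plus_sketch is a hypothetical name and the body is not the checked-in code.

```python
from urllib.parse import quote

def quote_plus_sketch(s, safe=""):
    """Quote s, turning spaces into '+' with a single str.replace() call."""
    if " " in s:
        # Treat the space as safe so quote() leaves it alone, then swap
        # every space for '+' in one pass.
        return quote(s, safe + " ").replace(" ", "+")
    return quote(s, safe)

print(quote_plus_sketch("a b&c"))   # a+b%26c
```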

Leaving this SF report open because the OP's idea may
possibly provide further improvement -- the checkin itself
was done because it is a clear win over the existing version.

The OP's patch uses regexps to short-circuit when no changes
are needed.  Unless the regexp is cheap and short-circuits
often, the cost of testing will likely exceed the average
amount saved.

Determining whether the regexp is cheaper than the
checked-in version just requires a few timings.  But,
determining the short-circuit percentage requires collecting
statistics from real programs with real data.  For the idea
to be a winner, regexps have to be much faster than the
map/lookup/join step AND the short-circuit case must occur
frequently.
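
The timing half of that comparison can be set up along these lines. This is only a guess at the general shape of a regexp short-circuit, not the OP's patch; quote_short_circuit and _UNSAFE are hypothetical names, and numbers from Python 3's urllib.parse.quote say nothing about the 2005 code.

```python
import re
import timeit
from urllib.parse import quote

# Matches any character outside an assumed "never needs quoting" set.
_UNSAFE = re.compile(r"[^A-Za-z0-9_.\-/]")

def quote_short_circuit(s):
    """Return s untouched when the regexp finds nothing to escape."""
    if _UNSAFE.search(s) is None:
        return s
    return quote(s)

if __name__ == "__main__":
    # Compare both variants on inputs that do and do not need quoting.
    for sample in ("", "abc/def", "a b&c"):
        plain = timeit.timeit(lambda: quote(sample), number=10000)
        guarded = timeit.timeit(lambda: quote_short_circuit(sample), number=10000)
        print(repr(sample), plain, guarded)
```

The short-circuit percentage still has to come from real inputs; the samples here are placeholders.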

Am lowering the priority until a better patch is received
along with timings and statistical evidence demonstrating a
significant improvement.  Also, reclassifying as a Feature
Request because the existing code is functioning as
documented and passing tests.


--

Comment By: Tres Seaver (tseaver)
Date: 2005-09-08 21:35

Message:
Logged In: YES 
user_id=127625

Note that the speed test script shows equivalent speedups for
both 2.3 and 2.4, ranging from 90% (for the empty string) down
to 73% (for a string with a single character).  The more
"normal" cases range from 82% to 89% speedups.

--

Comment By: Tres Seaver (tseaver)
Date: 2005-09-08 21:30

Message:
Logged In: YES 
user_id=127625

I'm attaching a patch against 2.4's version.

--

Comment By: Jeff Epler (jepler)
Date: 2005-09-08 20:01

Message:
Logged In: YES 
user_id=2772

Tested on Python 2.4.0.  The patch fails on the first chunk
because the list of imports doesn't match.

The urllib_fast_quote_speed_test.py doesn't run once urllib
has been patched.

I reverted the patch to urllib.py and re-ran.  I got
"faster" values from 0.758 to 0.964.
