[issue47203] ImportError: DLL load failed while importing binascii: %1 is not a valid Win32 application.

2022-04-04 Thread Matthew


Matthew  added the comment:

Hello,

Thanks for all the help people have given me! I've found the solution to my 
problem. The environment variable was set below every other, so a different 
Python interpreter was being used, one that was probably bundled with other 
software. I moved the environment variable up to the top, and the issue was 
fixed.
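
For anyone hitting the same symptom, here is a minimal sketch for checking which interpreters are visible and in what order. It assumes the variable in question is PATH; the helper name interpreters_on_path and the list of executable names are made up for illustration.

```python
import os
import shutil


def interpreters_on_path(path_value=None, names=("python.exe", "python3", "python")):
    """Return interpreter candidates in the order their directories appear on PATH."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    found = []
    for directory in path_value.split(os.pathsep):
        for name in names:
            candidate = os.path.join(directory, name)
            # Earlier directories take precedence, so keep the order of discovery.
            if os.path.isfile(candidate) and candidate not in found:
                found.append(candidate)
    return found


# shutil.which reports the executable a bare "python" command would resolve to.
print("resolved first:", shutil.which("python"))
for candidate in interpreters_on_path():
    print("candidate:", candidate)
```

If an interpreter bundled with other software is listed before the intended one, moving the intended directory earlier should change which one is picked up.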

Thanks again!

--
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue47203>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue47203] ImportError: DLL load failed while importing binascii: %1 is not a valid Win32 application.

2022-04-05 Thread Matthew


Matthew  added the comment:

> Probably there was also shadowing involved, since the built-in module doesn't 
> try to load anything else. Would be nice to know for sure (@Matthew) to make 
> sure we don't have some other issue here, but you're right, I don't see any 
> way for this to happen without other causes.

I'm pretty sure the Python interpreter that was causing the issue was bundled 
with the MSYS2 Mingw64 compiler. I tried reproducing the bug, but I've recently 
reinstalled the compiler due to some issues I was having with it, and the bug 
with importing binascii is no longer present.

--

___
Python tracker 
<https://bugs.python.org/issue47203>
___



[issue39116] StreamReader.readexactly() raises GeneratorExit on ProactorEventLoop

2020-11-28 Thread Matthew


Matthew  added the comment:

Let me preface this by declaring that I am very new to Python async so it is 
very possible that I am missing something seemingly obvious. That being said, 
I've been looking at various resources to try to understand the internals of 
asyncio and it hasn't led to any insights on this problem thus far.
-

This all sounds quite similar to an experience I am dealing with. I'm working 
with pub sub within aioredis which internally uses a StreamReader with a 
function equivalent to readexactly. This all started from debugging "Task was 
destroyed but it is pending!" to which attempted fixes led to multiple 
"RuntimeError: aclose(): asynchronous generator is already running" errors.

I did the same thing, adding try/except blocks everywhere in my code to 
understand what was happening, and this led me to identify that a regular async 
function would raise GeneratorExit during await. However, even if I suppress 
this, the caller awaiting this function would also raise a GeneratorExit. 
Suppressing this exception at the top level leads to an unexpected (to me) 
error, "coroutine ignored GeneratorExit".

I understand that GeneratorExit is raised in unfinished generators when garbage 
collected to handle cleanup. And I understand that async functions are 
essentially a generator in the sense that they yield when they await. So, if 
the entire coroutine were garbage collected this might trigger GeneratorExit in 
each nested coroutine. However, from all of my logging I am sure that prior to 
the GeneratorExit, nothing returns upwards so there should still be valid 
references to every object.
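
The behaviour described above can be reproduced without an event loop. This is only a sketch under assumptions: Suspend, well_behaved and stubborn are hypothetical names, and closing the coroutine by hand stands in for whatever closed it in the real application.

```python
class Suspend:
    """Hypothetical awaitable whose only job is to suspend the coroutine once."""

    def __await__(self):
        yield


async def well_behaved():
    await Suspend()  # GeneratorExit raised here simply propagates


async def stubborn():
    try:
        await Suspend()
    except GeneratorExit:
        # Swallowing GeneratorExit and awaiting again is what triggers the error.
        await Suspend()


ok = well_behaved()
ok.send(None)  # run up to the first suspension point
ok.close()     # GeneratorExit is thrown in at the await; close() returns quietly

bad = stubborn()
bad.send(None)
try:
    bad.close()
except RuntimeError as exc:
    message = str(exc)
    print(message)
bad.close()  # second close: GeneratorExit now propagates, so cleanup succeeds
```

So the RuntimeError does not come from asyncio itself; it is raised whenever a coroutine keeps running after GeneratorExit has been thrown into it.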

I'll include some errors below, in case they may be of relevance:

=== Exception in await of inner async function ===
Traceback (most recent call last):
  File ".../site-packages/uvicorn/protocols/http/httptools_impl.py", line 165, in data_received
    self.parser.feed_data(data)
  File "httptools/parser/parser.pyx", line 196, in httptools.parser.parser.HttpParser.feed_data
httptools.parser.errors.HttpParserUpgrade: 858

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File ".../my_code.py", line 199, in wait_for_update
    return await self.waiter.wait_for_value()
GeneratorExit

=== Exception when suppressing GeneratorExit on the top level ===
Exception ignored in: 
Traceback (most recent call last):
  File ".../site-packages/websockets/protocol.py", line 229, in __init__
    self.reader = asyncio.StreamReader(limit=read_limit // 2, loop=loop)
RuntimeError: coroutine ignored GeneratorExit

--
nosy: +matthew

___
Python tracker 
<https://bugs.python.org/issue39116>
___



[issue36893] email.headerregistry.Address blocks Unicode local part addr_spec accepted elsewhere

2019-05-12 Thread Matthew

New submission from Matthew :

The parser for passing an addr_spec to email.headerregistry.Address does not 
allow non-ASCII local parts, but the rest of the email package handles them 
fine, either straight (with explicit references to RFC6532 and SMTPUTF8), or 
encoding as expected. Apologies if I've misunderstood something.

>>> from email.message import EmailMessage
>>> msg = EmailMessage()
>>> msg['To'] = 'Matthéw <aé@example.com>'
>>> msg.as_string()
'To: =?utf-8?q?Matth=C3=A9w?= <=?utf-8?q?a=C3=A9?=@example.com>\n\n'
>>> msg['To'].addresses[0]
Address(display_name='Matthéw', username='aé', domain='example.com')
>>> msg['To'].addresses[0].addr_spec
'aé@example.com'
>>> email.headerregistry.Address(addr_spec=msg['To'].addresses[0].addr_spec)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/email/headerregistry.py", line 48, in __init__
    raise a_s.all_defects[0]
email.errors.NonASCIILocalPartDefect: local-part contains non-ASCII characters)
>>>

--
components: email
messages: 342254
nosy: barry, dracos, r.david.murray
priority: normal
severity: normal
status: open
title: email.headerregistry.Address blocks Unicode local part addr_spec 
accepted elsewhere
versions: Python 3.6

___
Python tracker 
<https://bugs.python.org/issue36893>
___



[issue6676] expat parser throws Memory Error when parsing multiple files

2009-08-10 Thread Matthew

New submission from Matthew :

I'm using the Expat python interface to parse multiple XML files in an
application and have found that it throws a "Memory Error" exception if
multiple calls are made to xmlparser.ParseFile(file) on the same
xmlparser object. This occurs even with a vanilla xmlparser object
created with xml.parsers.expat.ParserCreate().

Python Version: 2.6.2
Operating System: Ubuntu
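
A possible workaround, sketched here under the assumption that one parser object is only meant to handle one document: create a fresh parser for each file. The function name element_names and the sample documents are invented for illustration.

```python
import io
import xml.parsers.expat


def element_names(file_obj):
    """Parse one document with its own parser and return the element names seen."""
    names = []
    parser = xml.parsers.expat.ParserCreate()  # fresh parser for every document
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.ParseFile(file_obj)
    return names


# In-memory stand-ins for the XML files.
documents = [b"<a><b/></a>", b"<c/>"]
results = [element_names(io.BytesIO(doc)) for doc in documents]
print(results)
```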

--
components: XML
files: expat-error.py
messages: 91452
nosy: realpolitik
severity: normal
status: open
title: expat parser throws Memory Error when parsing multiple files
type: behavior
versions: Python 2.6
Added file: http://bugs.python.org/file14684/expat-error.py

___
Python tracker 
<http://bugs.python.org/issue6676>
___



[issue6676] expat parser throws Memory Error when parsing multiple files

2009-08-10 Thread Matthew

Matthew  added the comment:

This also occurs with Python 2.5.1 on OS X

--

___
Python tracker 
<http://bugs.python.org/issue6676>
___



[issue6113] Dupicate instances of classes in list

2009-05-26 Thread Matthew

New submission from Matthew :

What I intended was...
I create a list of DIFFERENT instances of the same class.  I wanted them
to be different instances, with different values for the properties,
stressing the word "DIFFERENT".

What I originally did was...
The __init__ assigns default values for the properties (e.g. iId = 0 and
sName = ''), then I would change the properties before adding the
instance to the list.  The list would contain the right number of
elements, but every element was the same instance of the class.

I resolved this by...
Changing __init__ so that I pass it parameters with the values to
assign to the properties, and then adding the instances of the class to
the list.  Then each element in the list is a different instance, and
this made everything work.
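
The original code is not available, so the following is only a guess at one common way to get this symptom, not a reconstruction of what happened here. The class Feature and its attribute names are hypothetical.

```python
class Feature:
    def __init__(self, i_id=0, s_name=""):
        self.i_id = i_id
        self.s_name = s_name


# Reusing one object: every slot in the list refers to the same instance,
# so all of them show the values assigned last.
shared = Feature()
same = []
for n, name in enumerate(["door", "window"]):
    shared.i_id = n
    shared.s_name = name
    same.append(shared)

# Constructing inside the loop: every slot gets its own instance.
distinct = [Feature(n, name) for n, name in enumerate(["door", "window"])]

print([f.s_name for f in same])
print([f.s_name for f in distinct])
```

Passing the values to __init__ tends to fix this as a side effect, because it forces a new object to be constructed for every element.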

--
components: Windows
messages: 88340
nosy: mbaynham
severity: normal
status: open
title: Dupicate instances of classes in list
versions: Python 2.5

___
Python tracker 
<http://bugs.python.org/issue6113>
___



[issue6113] Dupicate instances of classes in list

2009-05-26 Thread Matthew

Matthew  added the comment:

I didn't keep a copy of the code that didn't work.  Sorry.

When I changed the way I was initialising the classes, before adding
them to the list, I didn't change any of the logical flow in my code,
and it started to work.

I know it sounds very strange, like it really shouldn't happen, but it did.

--

___
Python tracker 
<http://bugs.python.org/issue6113>
___



[issue6113] Dupicate instances of classes in list

2009-05-26 Thread Matthew

Matthew  added the comment:

Trust me this was no typo.

I debugged my code by adding print statements to see what values were
going into the list, and print statements to see the values that were
coming out.

It might be that running code from the application Blender does bizarre
things, or it could be anything, I don't know.  All I know is that all the
instances I had in the list had the same property values as the last
element that was added to the list.  I know it's amazingly weird.

--
status: closed -> open

___
Python tracker 
<http://bugs.python.org/issue6113>
___



[issue6113] Dupicate instances of classes in list

2009-05-26 Thread Matthew

Matthew  added the comment:

First File


This is the main one, it's the one that is called from the Blender
application.

--
Added file: http://bugs.python.org/file14074/MyWalls.py

___
Python tracker 
<http://bugs.python.org/issue6113>
___



[issue6113] Dupicate instances of classes in list

2009-05-26 Thread Matthew

Matthew  added the comment:

Second file

The loads of code for building a wall

--
Added file: http://bugs.python.org/file14075/WallWithDoors.py

___
Python tracker 
<http://bugs.python.org/issue6113>
___



[issue6113] Dupicate instances of classes in list

2009-05-26 Thread Matthew

Matthew  added the comment:

Third file...

all the classes for holding the data

--
Added file: http://bugs.python.org/file14076/modDataObjects.py

___
Python tracker 
<http://bugs.python.org/issue6113>
___



[issue6113] Dupicate instances of classes in list

2009-05-26 Thread Matthew

Matthew  added the comment:

Fourth and final file...

Just a little error handling and stuff I want all my classes to inherit

--
Added file: http://bugs.python.org/file14077/modBasics.py

___
Python tracker 
<http://bugs.python.org/issue6113>
___



[issue6113] Dupicate instances of classes in list

2009-05-26 Thread Matthew

Matthew  added the comment:

If you want to run it I'm afraid you'll have to:
1) install Blender (www.blender.org)
2) put all those files in C:\Program Files\Blender
Foundation\Blender\.blender\scripts\MyWalls\src\
3) in the Blender application go to scripts ---> Objects ---> Build Walls

Then look at the terminal window that comes with Blender and you'll see
it's just gone wrong.

And if you change the Class ClsFeatureVariables, so that the properties
are set when it initialises you'll see it all works OK.

--

___
Python tracker 
<http://bugs.python.org/issue6113>
___



[issue6113] Dupicate instances of classes in list

2009-05-26 Thread Matthew

Matthew  added the comment:

So why is it able to create instances and add them to the
lstFeatureVariables list?  Surely it should go wrong there and not allow
the instances to be created.

--

___
Python tracker 
<http://bugs.python.org/issue6113>
___



[issue6113] Dupicate instances of classes in list

2009-05-26 Thread Matthew

Matthew  added the comment:

OK, sorry for the rambling, just ignore it

Cheers...

--

___
Python tracker 
<http://bugs.python.org/issue6113>
___



[issue6113] Dupicate instances of classes in list

2009-05-26 Thread Matthew

Changes by Matthew :


Removed file: http://bugs.python.org/file14074/MyWalls.py

___
Python tracker 
<http://bugs.python.org/issue6113>
___



[issue6113] Dupicate instances of classes in list

2009-05-26 Thread Matthew

Changes by Matthew :


Removed file: http://bugs.python.org/file14075/WallWithDoors.py

___
Python tracker 
<http://bugs.python.org/issue6113>
___



[issue6113] Dupicate instances of classes in list

2009-05-26 Thread Matthew

Changes by Matthew :


Removed file: http://bugs.python.org/file14076/modDataObjects.py

___
Python tracker 
<http://bugs.python.org/issue6113>
___



[issue6113] Dupicate instances of classes in list

2009-05-26 Thread Matthew

Changes by Matthew :


Removed file: http://bugs.python.org/file14077/modBasics.py

___
Python tracker 
<http://bugs.python.org/issue6113>
___



[issue43224] Add support for PEP 646

2022-01-04 Thread Matthew Rahtz


Change by Matthew Rahtz :


--
components: +Parser, Tests
nosy: +lys.nikolaou, pablogsal
title: Add support for PEP 646 (Variadic Generics) to typing.py -> Add support 
for PEP 646
versions: +Python 3.11 -Python 3.10

___
Python tracker 
<https://bugs.python.org/issue43224>
___



[issue43224] Add support for PEP 646

2022-01-04 Thread Matthew Rahtz


Change by Matthew Rahtz :


--
pull_requests: +28607
pull_request: https://github.com/python/cpython/pull/30398

___
Python tracker 
<https://bugs.python.org/issue43224>
___



[issue46410] TypeError when parsing regexp with unicode named character sequence escape

2022-01-18 Thread Matthew Barnett


Matthew Barnett  added the comment:

They're not supported in string literals either:

Python 3.10.1 (tags/v3.10.1:2cd268a, Dec  6 2021, 19:10:37) [MSC v.1929 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> "\N{KEYCAP NUMBER SIGN}"
  File "", line 1
"\N{KEYCAP NUMBER SIGN}"
^
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 0-21: unknown Unicode character name
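
For comparison, a short sketch of a lookup that does resolve named sequences: unicodedata.lookup() is documented as supporting them since Python 3.3. The exact code points returned depend on the Unicode database shipped with the interpreter, so none are hard-coded here.

```python
import unicodedata

# A named sequence resolves to a multi-character string rather than one character.
sequence = unicodedata.lookup("KEYCAP NUMBER SIGN")
code_points = ["U+%04X" % ord(ch) for ch in sequence]
print(len(sequence), code_points)
```

Because the result is longer than one character, anything that expects a single code point (for example a call to ord() on the whole result) will fail.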

--

___
Python tracker 
<https://bugs.python.org/issue46410>
___



[issue46515] Benefits Of Phool Makhana

2022-01-25 Thread Matthew Barnett


Change by Matthew Barnett :


--
stage:  -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue46515>
___



[issue43224] Add support for PEP 646

2022-01-30 Thread Matthew Rahtz


Change by Matthew Rahtz :


--
pull_requests: +29199
pull_request: https://github.com/python/cpython/pull/31018

___
Python tracker 
<https://bugs.python.org/issue43224>
___



[issue43224] Add support for PEP 646

2022-01-30 Thread Matthew Rahtz


Change by Matthew Rahtz :


--
pull_requests: +29200
pull_request: https://github.com/python/cpython/pull/31019

___
Python tracker 
<https://bugs.python.org/issue43224>
___



[issue43224] Add support for PEP 646

2022-01-30 Thread Matthew Rahtz


Change by Matthew Rahtz :


--
pull_requests: +29202
pull_request: https://github.com/python/cpython/pull/31021

___
Python tracker 
<https://bugs.python.org/issue43224>
___



[issue46589] Improve documentation for typing._GenericAlias

2022-01-30 Thread Matthew Rahtz


New submission from Matthew Rahtz :

There's currently not much documentation in `typing.py` for `_GenericAlias`. 
Some fairly weird things go on in there, so it would be great to have more info 
in the class about what's going on and why various edge cases are necessary.

--
components: Library (Lib)
messages: 412171
nosy: matthew.rahtz
priority: normal
pull_requests: 29210
severity: normal
status: open
title: Improve documentation for typing._GenericAlias
type: enhancement
versions: Python 3.11

___
Python tracker 
<https://bugs.python.org/issue46589>
___



[issue42369] Reading ZipFile not thread-safe

2022-02-01 Thread Matthew Davis


Matthew Davis  added the comment:

In addition to fixing any unexpected behavior, can we update the documentation 
[1] to state what the expected behavior is in terms of thread safety?

[1] https://docs.python.org/3/library/zipfile.html
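
Until the expected behaviour is documented, one conservative approach is to serialise access to a shared ZipFile with a lock. This is a sketch of that idea, not a statement of what zipfile guarantees; the archive contents and the helper read_member are made up.

```python
import io
import threading
import zipfile
from concurrent.futures import ThreadPoolExecutor

# Build a small archive in memory so the example is self-contained.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    for n in range(4):
        archive.writestr("member%d.txt" % n, ("payload %d" % n) * 100)

lock = threading.Lock()
shared = zipfile.ZipFile(buffer)  # one handle shared by all worker threads


def read_member(name):
    # Only one thread at a time touches the shared handle.
    with lock:
        return shared.read(name)


names = shared.namelist()
with ThreadPoolExecutor(max_workers=4) as pool:
    contents = list(pool.map(read_member, names))
shared.close()
print([len(data) for data in contents])
```

Opening a separate ZipFile per thread would be another option when the archive lives on disk.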

--
nosy: +mdavis-xyz

___
Python tracker 
<https://bugs.python.org/issue42369>
___



[issue46617] CSV Creation occasional off by one error

2022-02-02 Thread Matthew Stidham


New submission from Matthew Stidham :

The file which I found the error in is in 
https://github.com/greearb/lanforge-scripts

--
components: C API
files: debug from pandas failure.txt
messages: 412400
nosy: matthewstidham
priority: normal
severity: normal
status: open
title: CSV Creation occasional off by one error
type: compile error
versions: Python 3.10, Python 3.8, Python 3.9
Added file: https://bugs.python.org/file50601/debug from pandas failure.txt

___
Python tracker 
<https://bugs.python.org/issue46617>
___



[issue46617] CSV Creation occasional off by one error

2022-02-02 Thread Matthew Stidham


Matthew Stidham  added the comment:

the problem was a file in our library screwing up the python configuration

--
stage:  -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue46617>
___



[issue43478] Disallow Mock spec arguments from being Mocks

2022-02-02 Thread Matthew Suozzo


Change by Matthew Suozzo :


--
pull_requests: +29275
pull_request: https://github.com/python/cpython/pull/31090

___
Python tracker 
<https://bugs.python.org/issue43478>
___



[issue46627] Regex hangs indefinitely

2022-02-03 Thread Matthew Barnett


Matthew Barnett  added the comment:

That pattern has:

(?P[^]]+)+

Is that intentional? It looks wrong to me.

--

___
Python tracker 
<https://bugs.python.org/issue46627>
___



[issue46825] slow matching on regular expression

2022-02-22 Thread Matthew Barnett


Matthew Barnett  added the comment:

The expression is a repeated alternative where the first alternative is a 
repeat. Repeated repeats can result in a lot of attempts and backtracking and 
should be avoided.

Try this instead:

(0|1(01*0)*1)+
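
A quick way to sanity-check the suggested pattern. As far as I can tell it accepts exactly the binary strings whose value is divisible by 3; that reading is mine, not something stated in the report, and the helper name is hypothetical.

```python
import re

pattern = re.compile(r"(0|1(01*0)*1)+")


def matches_whole(bits):
    """True when the whole string of binary digits is accepted by the pattern."""
    return pattern.fullmatch(bits) is not None


# Compare the regex against plain arithmetic for a range of small numbers.
agreement = all(matches_whole(format(n, "b")) == (n % 3 == 0) for n in range(200))
print(agreement)
```

Each alternative starts with a different digit, so the matcher has little to backtrack over.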

--

___
Python tracker 
<https://bugs.python.org/issue46825>
___



[issue1077] itertools missing, causes interactive help to break

2007-08-31 Thread Matthew Russell

Changes by Matthew Russell:


--
components: Interpreter Core, Library (Lib)
files: py3k_bug1.txt
severity: urgent
status: open
title: itertools missing, causes interactive help to break
type: behavior
versions: Python 3.0

__
Tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue1077>
__



[issue1269] Exception in pstats print_callers()

2007-11-21 Thread Matthew Fremont

Matthew Fremont added the comment:

I hit the same issue, and I think the problem is that at line 515 in
pstats.py add_callers() concatenates the two stats instead of adding them
member-wise. As a result, each time add() is called, the number of stats
associated with each func grows by 4. Then, when print_call_line() is
called by print_callers() or print_callees(), there are "too many values
to unpack" at line 417.

This change to pstats.py modifies add_callers() to add the stats
together instead of concatenating the tuples.

515c515
< new_callers[func] = caller + new_callers[func]
---
> new_callers[func] = map(lambda x,y: x+y, caller, new_callers[func])
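
To illustrate the difference, a small sketch with made-up numbers. It assumes each per-caller entry is a 4-tuple of statistics; the values are invented, and the sketch uses zip() rather than the map() call in the patch.

```python
# Hypothetical per-caller stats: call count, primitive calls, total time, cumulative time.
caller = (2, 2, 0.5, 1.0)
existing = (3, 3, 0.25, 0.75)

# Tuple "+" concatenates, so the entry grows by four items on every merge.
concatenated = caller + existing

# Member-wise addition keeps the entry at four items.
summed = tuple(a + b for a, b in zip(caller, existing))

print(len(concatenated), summed)
```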

--
nosy: +matthew.fremont

__
Tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue1269>
__



[issue12984] XML NamedNodeMap ( attribName in NamedNodeMap fails )

2011-09-14 Thread Matthew Newcomb

New submission from Matthew Newcomb :

I was cleaning up some old code to make it pep8 compliant and came across this 
bug.  Switching from 'has_key' to 'in' does not work with a 
xml.dom.minidom.NamedNodeMap.  An easy solution appears to be to add a 
'__contains__' method to NamedNodeMap.  

This is with:
Python 2.7.1+ (r271:86832, Apr 11 2011, 18:05:24) 
[GCC 4.5.2] on linux2

I've tried it with 2.4.3 as well ( on some older machines ).

import xml.dom.minidom

def show_bug():
    document = 'foobar_attributes'
    dom = xml.dom.minidom.parseString(document)
    attribs = dom.getElementsByTagName('dom')[0].attributes

    if attribs.has_key('mbid'):
        print "This works..  found 'mbid' attribute"

    if 'mbid' in attribs:
        print "Will never get here, the above will throw an exception"


>>> show_bug()
This works..  found 'mbid' attribute
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "show_bug.py", line 11, in show_bug
    if 'mbid' in attribs:
  File "/usr/lib/python2.7/xml/dom/minidom.py", line 524, in __getitem__
    return self._attrs[attname_or_tuple]
KeyError: 0
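
The "KeyError: 0" is consistent with how the "in" operator behaves for a class that defines __getitem__ but neither __contains__ nor __iter__: Python falls back to calling __getitem__ with 0, 1, 2, and so on. The classes below are hypothetical stand-ins written for Python 3, not the minidom implementation.

```python
class AttrMapWithoutContains:
    """Dict-backed mapping that only defines __getitem__."""

    def __init__(self, attrs):
        self._attrs = dict(attrs)

    def __getitem__(self, key):
        return self._attrs[key]


class AttrMapWithContains(AttrMapWithoutContains):
    """Same mapping with the suggested __contains__ method added."""

    def __contains__(self, key):
        return key in self._attrs


broken = AttrMapWithoutContains({"mbid": "123"})
try:
    "mbid" in broken  # falls back to broken[0], broken[1], ...
except KeyError as exc:
    fallback_error = exc
    print("KeyError:", exc)

fixed = AttrMapWithContains({"mbid": "123"})
print("mbid" in fixed)
```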

--
components: XML
files: show_bug.py
messages: 144060
nosy: spolematt
priority: normal
severity: normal
status: open
title: XML NamedNodeMap ( attribName in NamedNodeMap fails )
type: crash
versions: Python 2.7
Added file: http://bugs.python.org/file23157/show_bug.py

___
Python tracker 
<http://bugs.python.org/issue12984>
___



[issue13169] Regular expressions with 0 to 65536 repetitions raises OverflowError

2011-10-13 Thread Matthew Barnett

Matthew Barnett  added the comment:

The quantifiers use 65535 to represent no upper limit, so ".{0,65535}" is 
equivalent to ".*".

For example:

>>> re.match(".*", "x" * 100000).span()
(0, 100000)
>>> re.match(".{0,65535}", "x" * 100000).span()
(0, 100000)

but:

>>> re.match(".{0,65534}", "x" * 100000).span()
(0, 65534)

--
nosy: +mrabarnett

___
Python tracker 
<http://bugs.python.org/issue13169>
___



[issue13169] Regular expressions with 0 to 65536 repetitions raises OverflowError

2011-10-14 Thread Matthew Barnett

Matthew Barnett  added the comment:

The limit is an implementation detail. The pattern is compiled into codes which 
are then interpreted, and it just happens that the codes are (usually) 16 bits, 
giving a range of 0..65535, but it uses 65535 to represent no limit and doesn't 
warn if you actually write 65535.

There's an alternative regex implementation here:

http://pypi.python.org/pypi/regex

--

___
Python tracker 
<http://bugs.python.org/issue13169>
___



[issue13475] Add '-p'/'--path0' command line option to override sys.path[0] initialisation

2011-11-27 Thread Matthew Woodcraft

Matthew Woodcraft  added the comment:

The proposed --nopath0 option is something I've wished I had in the past.

If this is added, it would be good if it could be given a single-letter form 
too, because it's an option that would be useful in #! lines (they don't 
reliably support using more than one command-line argument, and single-letter 
switches can be combined while long-form ones can't).

--
nosy: +mattheww

___
Python tracker 
<http://bugs.python.org/issue13475>
___



[issue13592] repr(regex) doesn't include actual regex

2011-12-13 Thread Matthew Barnett

Matthew Barnett  added the comment:

In reply to Ezio, the repr of a large string, list, tuple or dict is also long.

The repr of a compiled regex should probably also show the flags, but should it 
just be the numeric value?

--

___
Python tracker 
<http://bugs.python.org/issue13592>
___



[issue13592] repr(regex) doesn't include actual regex

2011-12-13 Thread Matthew Barnett

Matthew Barnett  added the comment:

Actually, one possibility that occurs to me is to provide the flags within the 
pattern. The .pattern attribute gives the original pattern, but repr could give 
the flags in-line at the start of the pattern:

>>> # Assuming Python 3.
>>> r = re.compile("a", re.I)
>>> r.flags
34
>>> r.pattern
'a'
>>> repr(r)
"<_sre.SRE_Pattern '(?i)a'>"

I'm not sure how to make it eval-able, unless you mean something more like:

>>> repr(r)
"re.Regex('(?i)a')"

where re.Regex == re.compile, which would be more meaningful than:

>>> repr(r)
"re.compile('(?i)a')"

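A rough sketch of how the inline form could be derived from a compiled pattern. It assumes a str pattern on Python 3; the helper name pattern_with_inline_flags is hypothetical, and this is not how any actual repr is implemented.

```python
import re

# Flags that have an inline letter. re.U is left out because it is the
# implicit default for str patterns in Python 3.
_INLINE_LETTERS = [
    (re.A, "a"), (re.I, "i"), (re.L, "L"),
    (re.M, "m"), (re.S, "s"), (re.X, "x"),
]


def pattern_with_inline_flags(compiled):
    """Return the pattern text with its flags written inline at the start."""
    letters = "".join(letter for flag, letter in _INLINE_LETTERS if compiled.flags & flag)
    prefix = "(?" + letters + ")" if letters else ""
    return prefix + compiled.pattern


r = re.compile("a", re.I)
inline = pattern_with_inline_flags(r)
rebuilt = re.compile(inline)  # the inline form compiles to an equivalent pattern
print(repr(inline))
```
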
--

___
Python tracker 
<http://bugs.python.org/issue13592>
___



[issue13592] repr(regex) doesn't include actual regex

2011-12-22 Thread Matthew Barnett

Matthew Barnett  added the comment:

I'm just adding this to the regex module and I've come up against a possible 
issue. The regex module supports named lists, which could be very big. Should 
the entire contents of those lists also be shown in the repr? They would have to 
be if the repr is to be eval-able.

--

___
Python tracker 
<http://bugs.python.org/issue13592>
___



[issue13652] Creating lambda functions in a loop has unexpected results when resolving variables used as arguments

2011-12-22 Thread Matthew Barnett

Matthew Barnett  added the comment:

That's not a bug.

This might help to explain what's going on:

What do (lambda) function closures capture in Python?
http://stackoverflow.com/questions/2295290/what-do-lambda-function-closures-capture-in-python
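
In short, a lambda looks its variables up when it is called, not when it is created. A minimal sketch of the effect and of the usual default-argument workaround (the variable names are arbitrary):

```python
# Each lambda refers to the loop variable itself, so all of them see its final value.
late = [lambda: i for i in range(3)]
late_results = [f() for f in late]

# Binding the current value as a default argument captures it at creation time.
bound = [lambda i=i: i for i in range(3)]
bound_results = [f() for f in bound]

print(late_results, bound_results)
```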

--
nosy: +mrabarnett

___
Python tracker 
<http://bugs.python.org/issue13652>
___



[issue12177] re.match raises MemoryError

2011-05-25 Thread Matthew Barnett

Matthew Barnett  added the comment:

This also raises MemoryError:

re.match(r'()*?1', 'a1')

but none of these do:

re.match(r'()+1', 'a1')
re.match(r'()*1', 'a1')

--
nosy: +mrabarnett

___
Python tracker 
<http://bugs.python.org/issue12177>
___



[issue12177] re.match raises MemoryError

2011-05-28 Thread Matthew Boehm

Matthew Boehm  added the comment:

Here are some windows results with Python 2.7:

>>> import re
>>> re.match("()*?1", "1")
<_sre.SRE_Match object at 0x025C0E60>
>>> re.match("()+?1", "1")
>>> re.match("()+?1", "11")
<_sre.SRE_Match object at 0x025C0E60>
>>> re.match("()*?1", "11")
<_sre.SRE_Match object at 0x025C3C60>
>>> re.match("()*?1", "a1")

Traceback (most recent call last):
  File "", line 1, in 
re.match("()*?1", "a1")
  File "C:\Python27\lib\re.py", line 137, in match
return _compile(pattern, flags).match(string)
MemoryError
>>> re.match("()+?1", "a1")

Traceback (most recent call last):
  File "", line 1, in 
re.match("()+?1", "a1")
  File "C:\Python27\lib\re.py", line 137, in match
return _compile(pattern, flags).match(string)
MemoryError

Note that when matching to a string starting with "1", the matcher will not 
throw a MemoryError.

--
nosy: +Matthew.Boehm

___
Python tracker 
<http://bugs.python.org/issue12177>
___



[issue12280] If statement

2011-06-07 Thread Matthew Brunt

New submission from Matthew Brunt :

i'm very new to python (currently going through a python for beginners book at 
work to pass the time), and i'm having trouble with an if statement exercise.  
basically, i'm creating a very simple password program that displays "Access 
Granted" if the if statement is true.  the problem i'm having is that no matter 
what i set the password to, it seems like it's ignoring the if statement.  the 
code is copied below, and i greatly appreciate any input.

# Granted or Denied
# Demonstrates an else clause

print("Welcome to System Security Inc.")
print("-- where security is our middle name\n")

password = input("Enter your password: ")

if password == "a":
    print("Access Granted")

input("\n\nPress the enter key to exit.")

--
components: IDLE
messages: 137883
nosy: Matthew.Brunt
priority: normal
severity: normal
status: open
title: If statement
type: performance
versions: Python 3.2

___
Python tracker 
<http://bugs.python.org/issue12280>
___



[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2011-07-11 Thread Matthew Barnett

Matthew Barnett  added the comment:

The new regex implementation is hosted here: 
https://code.google.com/p/mrab-regex-hg/

The span of m['a_thing'] is m.span('a_thing'), if that helps.

The named groups are listed on the pattern object, which can be accessed via 
m.re:

>>> m.re
<_regex.Pattern object at 0x0161DE30>
>>> m.re.groupindex
{'another_thing': 3, 'a_thing': 1}

so you can use that to create a reverse dict to go from the index to the name 
or None. (Perhaps the pattern object should have such a .group_name attribute.)
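
As a rough sketch of that reverse lookup, here is the same idea with the standard re module rather than the new regex module. The pattern and the helper name group_name_for are made up for illustration; only groupindex, groups and span() are real attributes.

```python
import re

# Hypothetical pattern with two named groups and one unnamed group.
pattern = re.compile(r"(?P<a_thing>\w+)-(\d+)-(?P<another_thing>\w+)")

# groupindex maps name -> index; invert it to map index -> name.
index_to_name = {index: name for name, index in pattern.groupindex.items()}


def group_name_for(index):
    """Return the name of the group at this index, or None if it is unnamed."""
    return index_to_name.get(index)


m = pattern.match("spam-42-eggs")
names = [group_name_for(i) for i in range(1, pattern.groups + 1)]
span = m.span("a_thing")
```

Unnamed groups simply have no entry in the inverted dict, which is where the "or None" comes from.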

--

___
Python tracker 
<http://bugs.python.org/issue2636>
___



[issue12671] urlopen returning empty string

2011-07-31 Thread Matthew Barnett

New submission from Matthew Barnett :

Someone over at StackOverflow had a problem with urlopen in Python 3.2.1:


http://stackoverflow.com/questions/6892573/problem-with-urlopen/6892843#6892843

This is the code:

from urllib.request import urlopen
f = urlopen('http://online.wsj.com/mdc/public/page/2_3020-tips.html?mod=topnav_2_3000')
page = f.read()
f.close()

With Python 3.1 and Python 3.2 it works OK, but with Python 3.2.1 the
read returns an empty string.

--
components: Library (Lib)
messages: 141481
nosy: mrabarnett
priority: normal
severity: normal
status: open
title: urlopen returning empty string
type: behavior
versions: Python 3.2

___
Python tracker 
<http://bugs.python.org/issue12671>
___



[issue12671] urlopen returning empty string

2011-07-31 Thread Matthew Barnett

Matthew Barnett  added the comment:

Just been told this bug has already been reported as issue #12576.

--
resolution:  -> duplicate

___
Python tracker 
<http://bugs.python.org/issue12671>
___



[issue12671] urlopen returning empty string

2011-07-31 Thread Matthew Barnett

Changes by Matthew Barnett :


--
status: open -> closed

___
Python tracker 
<http://bugs.python.org/issue12671>
___



[issue12723] tkSimpleDialog.askstring shouldn't allow empty string input

2011-08-10 Thread Matthew Hemke

New submission from Matthew Hemke :

tkSimpleDialog.askstring allows empty input. The attached patch adds validation 
to the input to ensure it is not empty.

--
components: Tkinter
files: askstring.patch
keywords: patch
messages: 141868
nosy: rabbidous
priority: normal
severity: normal
status: open
title: tkSimpleDialog.askstring shouldn't allow empty string input
type: feature request
Added file: http://bugs.python.org/file22875/askstring.patch

___
Python tracker 
<http://bugs.python.org/issue12723>
___



[issue12723] tkSimpleDialog.askstring shouldn't allow empty string input

2011-08-10 Thread Matthew Hemke

Changes by Matthew Hemke :


--
versions: +Python 2.7

___
Python tracker 
<http://bugs.python.org/issue12723>
___



[issue12723] tkSimpleDialog.askstring shouldn't allow empty string input

2011-08-11 Thread Matthew Hemke

Matthew Hemke  added the comment:

What about adding a validatecommand option like on Tkinter.Entry?

For what I am trying to do it was sort of a kludge to validate the entry 
because an empty string was invalid, but in the interface design, it would have 
been "rude" to validate after the dialog closes and then keep popping up 
another tkSimpleDialog.askstring until the input is correct. It almost makes 
askstring useless because I can't validate on close.

That wouldn't break backwards compatibility would it?

--

___
Python tracker 
<http://bugs.python.org/issue12723>
___



[issue12723] Provide an API in tkSimpleDialog for defining custom validation functions

2011-08-12 Thread Matthew Hemke

Matthew Hemke  added the comment:

I'm not sure if I misunderstood you, or you misunderstood me, but adding an 
option to the askstring dialog that would take a function handle would also 
allow you to use it for things other than strings (ints, etc.).

Tkinter Entry does this: you set the validatecommand option to a function 
handle that returns true or false to determine whether the input was valid.
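
For reference, a minimal sketch of how the Entry mechanism looks, written with the Python 3 module name (tkinter). The function names is_non_empty and make_validated_entry are invented for this example, and askstring itself has no such option; this only shows the existing Entry behaviour.

```python
def is_non_empty(proposed_text):
    """Validation callback: accept the entry only if it is not blank."""
    return proposed_text.strip() != ""


def make_validated_entry(parent):
    """Build an Entry that runs is_non_empty when it loses focus."""
    import tkinter as tk  # imported here so the callback works without a display

    # register() wraps the Python function as a Tcl command; "%P" asks Tk to
    # pass the text the entry would contain if the edit were allowed.
    command = (parent.register(is_non_empty), "%P")
    return tk.Entry(parent, validate="focusout", validatecommand=command)
```

A dialog option could take the same kind of callable and refuse to close while it returns False.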

I will try and code an example over the weekend.

--

___
Python tracker 
<http://bugs.python.org/issue12723>
___



[issue12728] Python re lib fails case insensitive matches on Unicode data

2011-08-12 Thread Matthew Barnett

Changes by Matthew Barnett :


--
nosy: +mrabarnett

___
Python tracker 
<http://bugs.python.org/issue12728>
___



[issue12729] Python lib re cannot handle Unicode properly due to narrow/wide bug

2011-08-12 Thread Matthew Barnett

Changes by Matthew Barnett :


--
nosy: +mrabarnett

___
Python tracker 
<http://bugs.python.org/issue12729>
___



[issue12730] Python's casemapping functions are untrustworthy due to narrow/wide build issues

2011-08-12 Thread Matthew Barnett

Changes by Matthew Barnett :


--
nosy: +mrabarnett

___
Python tracker 
<http://bugs.python.org/issue12730>
___



[issue12731] python lib re uses obsolete sense of \w in full violation of UTS#18 RL1.2a

2011-08-12 Thread Matthew Barnett

Changes by Matthew Barnett :


--
nosy: +mrabarnett

___
Python tracker 
<http://bugs.python.org/issue12731>
___



[issue12732] Can't portably use Unicode in Python identifiers

2011-08-12 Thread Matthew Barnett

Changes by Matthew Barnett :


--
nosy: +mrabarnett

___
Python tracker 
<http://bugs.python.org/issue12732>
___



[issue12733] Request for grapheme support in Python re lib

2011-08-12 Thread Matthew Barnett

Changes by Matthew Barnett :


--
nosy: +mrabarnett

___
Python tracker 
<http://bugs.python.org/issue12733>
___



[issue12734] Request for property support in Python re lib

2011-08-12 Thread Matthew Barnett

Changes by Matthew Barnett :


--
nosy: +mrabarnett

___
Python tracker 
<http://bugs.python.org/issue12734>
___



[issue12735] request full Unicode collation support in std python library

2011-08-12 Thread Matthew Barnett

Changes by Matthew Barnett :


--
nosy: +mrabarnett

___
Python tracker 
<http://bugs.python.org/issue12735>
___



[issue12736] Request for python casemapping functions to use full not simple casemaps per Unicode's recommendation

2011-08-12 Thread Matthew Barnett

Changes by Matthew Barnett :


--
nosy: +mrabarnett

___
Python tracker 
<http://bugs.python.org/issue12736>
___



[issue12729] Python lib re cannot handle Unicode properly due to narrow/wide bug

2011-08-12 Thread Matthew Barnett

Matthew Barnett  added the comment:

In a narrow build, a codepoint in the astral plane is encoded as a surrogate pair.

I could implement a workaround for it in the regex module, but I think that the 
proper place to fix it is in the language as a whole, perhaps by implementing 
PEP 393 ("Flexible String Representation").
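
To make the surrogate-pair encoding concrete, here is a small sketch of the UTF-16 arithmetic. The function name to_surrogate_pair is made up for this example; the arithmetic is the standard UTF-16 rule.

```python
def to_surrogate_pair(codepoint):
    """Split an astral code point (>= 0x10000) into two UTF-16 code units."""
    if not 0x10000 <= codepoint <= 0x10FFFF:
        raise ValueError("not an astral code point")
    offset = codepoint - 0x10000
    high = 0xD800 + (offset >> 10)   # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)  # bottom 10 bits
    return high, low


high, low = to_surrogate_pair(0x1D49C)  # MATHEMATICAL SCRIPT CAPITAL A
```

On a narrow build each of those two units is a separate item in the string, which is why length, indexing and regex matching all see two "characters".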

--

___
Python tracker 
<http://bugs.python.org/issue12729>
___



[issue12729] Python lib re cannot handle Unicode properly due to narrow/wide bug

2011-08-13 Thread Matthew Barnett

Matthew Barnett  added the comment:

There are occasions when you want to do string slicing, often of the form:

pos = my_str.index(x)
endpos = my_str.index(y)
substring = my_str[pos : endpos]

To me that suggests that if UTF-8 is used then it may be worth profiling to see 
whether caching the last 2 positions would be beneficial.

--

___
Python tracker 
<http://bugs.python.org/issue12729>
___



[issue12729] Python lib re cannot handle Unicode properly due to narrow/wide bug

2011-08-13 Thread Matthew Barnett

Matthew Barnett  added the comment:

You're right about starting the second search from where the first finished. 
Caching the position would be an advantage there.
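
At the Python level, starting the second search from where the first finished might look like the sketch below. The helper name slice_between is hypothetical, and any caching inside the string object would be invisible here.

```python
def slice_between(text, start_marker, end_marker):
    """Return text from start_marker up to (not including) end_marker."""
    pos = text.index(start_marker)
    # Begin the second search at pos instead of rescanning from index 0.
    endpos = text.index(end_marker, pos)
    return text[pos:endpos]


result = slice_between("key=value;rest", "value", ";")
```

With a variable-width internal encoding, the position found by the first index() call is exactly the one the second call and the slice need again.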

The memory cost of extra pointers wouldn't be so bad if UTF-8 took less space 
than the current format.

Regex isn't used as much as in Perl. BTW, the current re module was introduced 
in Python 1.5, the previous regex and regsub modules being removed in Python 
2.5.

--

___
Python tracker 
<http://bugs.python.org/issue12729>
___



[issue12749] lib re cannot match non-BMP ranges (all versions, all builds)

2011-08-14 Thread Matthew Barnett

Matthew Barnett  added the comment:

On a narrow build, "\N{MATHEMATICAL SCRIPT CAPITAL A}" is stored as 2 code 
units, and neither re nor regex recombine them when compiling a regex or 
looking for a match.

regex supports \xNN, \uNNNN, \UNNNNNNNN and \N{XYZ} itself, so they can be 
used in a raw string literal, but it doesn't recombine code units.

I could add recombination to regex at some point if time has passed and no 
further progress has been made in the language's support for Unicode.
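
Recombination itself is simple. The sketch below works on a list of UTF-16 code units and is only an illustration of the idea, not code from re or regex; the name recombine is made up.

```python
def recombine(units):
    """Merge UTF-16 surrogate pairs in a list of code units into code points."""
    codepoints = []
    i = 0
    while i < len(units):
        unit = units[i]
        is_high = 0xD800 <= unit <= 0xDBFF
        if is_high and i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF:
            low = units[i + 1]
            codepoints.append(0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00))
            i += 2
        else:
            codepoints.append(unit)  # BMP character or lone surrogate
            i += 1
    return codepoints


combined = recombine([0x41, 0xD835, 0xDC9C, 0x42])
```

The harder part in a regex engine is doing this consistently in both the pattern compiler and the matcher.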

--

___
Python tracker 
<http://bugs.python.org/issue12749>
___



[issue12729] Python lib re cannot handle Unicode properly due to narrow/wide bug

2011-08-14 Thread Matthew Barnett

Matthew Barnett  added the comment:

Have a look here: http://98.245.80.27/tcpc/OSCON2011/gbu/index.html

--

___
Python tracker 
<http://bugs.python.org/issue12729>
___



[issue12729] Python lib re cannot handle Unicode properly due to narrow/wide bug

2011-08-15 Thread Matthew Barnett

Matthew Barnett  added the comment:

For what it's worth, I've had an idea about string storage, roughly based on how 
*nix stores data on disk.

If a string is small, point to a block of codepoints.

If a string is medium-sized, point to a block of pointers to codepoint blocks.

If a string is large, point to a block of pointers to pointer blocks.

This means that a large string doesn't need a single large allocation.

The level of indirection can be increased as necessary.

For simplicity, all codepoint blocks contain the same number of codepoints, 
except the final codepoint block, which may contain fewer.

A codepoint block may use the minimum width necessary (1, 2 or 4 bytes) to 
store all of its codepoints.

This means that there are no surrogates and that different sections of the 
string can be stored in different widths to reduce memory usage.
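
A toy sketch of the idea, with only a single level of indirection (a list of blocks) and a deliberately tiny block size. The class name BlockString, the constant BLOCK_SIZE and the helper width_for are all invented for illustration.

```python
BLOCK_SIZE = 4  # code points per block; tiny so the example is easy to follow


def width_for(codepoints):
    """Smallest width in bytes (1, 2 or 4) that holds every code point."""
    top = max(codepoints)
    if top < 0x100:
        return 1
    if top < 0x10000:
        return 2
    return 4


class BlockString:
    def __init__(self, text):
        cps = [ord(ch) for ch in text]
        self.length = len(cps)
        self.blocks = []  # list of (width, packed bytes) pairs
        for start in range(0, len(cps), BLOCK_SIZE):
            chunk = cps[start:start + BLOCK_SIZE]
            width = width_for(chunk)
            data = b"".join(cp.to_bytes(width, "little") for cp in chunk)
            self.blocks.append((width, data))

    def __getitem__(self, index):
        if not 0 <= index < self.length:
            raise IndexError(index)
        # Every block but the last is full, so the block number is a division.
        width, data = self.blocks[index // BLOCK_SIZE]
        offset = (index % BLOCK_SIZE) * width
        return chr(int.from_bytes(data[offset:offset + width], "little"))


s = BlockString("abcd\U0001D49Cxyz")
widths = [width for width, _ in s.blocks]
```

Indexing stays constant time because the block number comes from a division, and only the block holding the astral character pays for 4-byte storage.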

--

___
Python tracker 
<http://bugs.python.org/issue12729>
___



[issue7105] weak dict iterators are fragile because of unpredictable GC runs

2011-08-18 Thread Matthew Schwickerath

Matthew Schwickerath  added the comment:

Any plans on actually patching this in 2.7 any time soon?  This is affecting 
our software and hanging it on random occasions.

--
nosy: +qelan

___
Python tracker 
<http://bugs.python.org/issue7105>
___



[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-08-19 Thread Matthew Barnett

Matthew Barnett  added the comment:

For the "Line_Break" property, one of the possible values is "Inseparable", 
with 2 permitted aliases, the shorter "IN" (which is reasonable) and 
"Inseperable" (ouch!).

--

___
Python tracker 
<http://bugs.python.org/issue12753>
___



[issue12789] re.Scanner don't support more then 2 groups on regex

2011-08-20 Thread Matthew Barnett

Matthew Barnett  added the comment:

Even if this bug is fixed, it still won't work as you expect, and this is why.

The Scanner function accepts a list of 2-tuples. The first item of the tuple is 
a regex and the second is a function. For example:

re.Scanner([(r"\d+", number), (r"\w+", word)])

The Scanner function then builds a regex, using the given regexes as 
alternatives, each wrapped as a capture group:

r"(\d+)|(\w+)"

When matching, it sees which group captured and uses that to decide which 
function it should call, so, for example, if group 1 matched, it calls 
"number", and if group 2 matched, it calls "word".

When you introduce capture groups into the regexes, it gets confused. If your 
regex matches, it'll see that groups 1 and 2 match, so it'll try to call the 
second function, but there isn't one...
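
A small sketch of a lexicon that stays within that design. re.Scanner is undocumented, so treat the details as an observation of current behaviour rather than a guaranteed API; the callbacks number and word are the hypothetical ones from the example above. Where grouping is needed inside a lexicon entry, a non-capturing group (?:...) does not add a capture group.

```python
import re


def number(scanner, token):
    return ("NUMBER", float(token))


def word(scanner, token):
    return ("WORD", token)


# (?:...) groups without capturing, so each lexicon entry still contributes
# exactly one capture group to the combined regex that Scanner builds.
scanner = re.Scanner([
    (r"\d+(?:\.\d+)?", number),
    (r"\w+", word),
    (r"\s+", None),  # an action of None skips the matched text
])

tokens, remainder = scanner.scan("abc 4.5 def")
```

scan() returns the list of callback results plus whatever part of the input could not be matched.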

--

___
Python tracker 
<http://bugs.python.org/issue12789>
___



[issue12736] Request for python casemapping functions to use full not simple casemaps per Unicode's recommendation

2011-08-27 Thread Matthew Barnett

Matthew Barnett  added the comment:

There are some oddities in Unicode case-folding.

Under full case-folding, both "\N{LATIN CAPITAL LETTER SHARP S}" and "\N{LATIN 
SMALL LETTER SHARP S}" fold to "ss", which means that those codepoints match 
each other.

However, under simple case-folding, they fold to themselves, which means that 
those codepoints _don't_ match each other.
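
The full case-folding result can be checked with str.casefold(), which was added in Python 3.3 and so did not exist when this comment was written. This sketch covers only the full-folding half of the comparison.

```python
capital = "\N{LATIN CAPITAL LETTER SHARP S}"
small = "\N{LATIN SMALL LETTER SHARP S}"

# str.casefold() applies full case-folding, so both fold to "ss".
folded = (capital.casefold(), small.casefold())
match_under_full_folding = capital.casefold() == small.casefold()
```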

--

___
Python tracker 
<http://bugs.python.org/issue12736>
___



[issue12736] Request for python casemapping functions to use full not simple casemaps per Unicode's recommendation

2011-08-28 Thread Matthew Barnett

Matthew Barnett  added the comment:

The regex module currently uses simple case-folding, although I'm working 
towards full case-folding, as listed in 
http://www.unicode.org/Public/UNIDATA/CaseFolding.txt.

--

___
Python tracker 
<http://bugs.python.org/issue12736>
___



[issue12855] open() and codecs.open() treat form-feed differently

2011-08-29 Thread Matthew Boehm

New submission from Matthew Boehm :

A file opened with codecs.open() splits on a form feed character (\x0c) while a 
file opened with open() does not.

>>> with open("formfeed.txt", "w") as f:
...   f.write("line \fone\nline two\n")
...
>>> with open("formfeed.txt", "r") as f:
...   s = f.read()
...
>>> s
'line \x0cone\nline two\n'
>>> print s
line
one
line two

>>> import codecs
>>> with open("formfeed.txt", "rb") as f:
...   lines = f.readlines()
...
>>> lines
['line \x0cone\n', 'line two\n']
>>> with codecs.open("formfeed.txt", "r", encoding="ascii") as f:
...   lines2 = f.readlines()
...
>>> lines2
[u'line \x0c', u'one\n', u'line two\n']
>>>

Note that lines contains two items while lines2 has 3.

Issue 7643 has a good discussion on newlines in python, but I did not see this 
discrepancy mentioned.
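
The same difference can be seen without any files. This sketch uses Python 3, where str takes the role of the 2.x unicode type and bytes the role of the 2.x str type.

```python
text = "line \x0cone\nline two\n"

# str.splitlines() treats form feed (\x0c) as a line boundary...
as_text = text.splitlines(True)

# ...while bytes.splitlines() only splits on \n, \r and \r\n.
as_bytes = text.encode("ascii").splitlines(True)
```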

--
components: Interpreter Core
messages: 143182
nosy: Matthew.Boehm
priority: normal
severity: normal
status: open
title: open() and codecs.open() treat form-feed differently
type: behavior
versions: Python 2.7

___
Python tracker 
<http://bugs.python.org/issue12855>
___



[issue12855] open() and codecs.open() treat form-feed differently

2011-08-29 Thread Matthew Boehm

Matthew Boehm  added the comment:

Thanks for explaining the reasoning.

Perhaps I should add this to the python wiki 
(http://wiki.python.org/moin/Unicode) ?

It would be nice if it fit in the docs somewhere, but I'm not sure where.

I'm curious how (or if) 2to3 would handle this as well, but I'm closing this 
issue as it's now clear to me why these two are expected to act differently.

--

___
Python tracker 
<http://bugs.python.org/issue12855>
___



[issue12855] open() and codecs.open() treat form-feed differently

2011-08-29 Thread Matthew Boehm

Changes by Matthew Boehm :


--
resolution:  -> wont fix
status: open -> closed

___
Python tracker 
<http://bugs.python.org/issue12855>
___



[issue12855] open() and codecs.open() treat form-feed differently

2011-08-29 Thread Matthew Boehm

Matthew Boehm  added the comment:

I'll suggest a patch for the documentation when I get to my home computer in an 
hour or two.

--
assignee:  -> docs@python
components: +Documentation -Interpreter Core
nosy: +docs@python
resolution: wont fix -> 
status: closed -> open

___
Python tracker 
<http://bugs.python.org/issue12855>
___



[issue12855] open() and codecs.open() treat form-feed differently

2011-08-29 Thread Matthew Boehm

Matthew Boehm  added the comment:

I'm taking a look at the docs now.

I'm considering adding a table/list of characters python treats as newlines, 
but it seems like this might fit better as a note in 
http://docs.python.org/library/stdtypes.html#str.splitlines or somewhere else 
in stdtypes. I'll start working on it now, but please let me know what you 
think about this.

This is my first attempt at a patch, so I greatly appreciate your help so far.

--

___
Python tracker 
<http://bugs.python.org/issue12855>
___



[issue12855] linebreak sequences should be better documented

2011-08-29 Thread Matthew Boehm

Matthew Boehm  added the comment:

I've attached a patch for python2.7 that adds a small note to 
library/stdtypes.html#str.splitlines explaining which sequences are treated as 
line breaks:

"""
Note: Python recognizes "\r", "\n", and "\r\n" as line boundaries for strings.

In addition to these, Unicode strings can have line boundaries of u"\x0b", 
u"\x0c", u"\x85", u"\u2028", and u"\u2029"
"""

Additional thoughts:

* Would it be better to put this note in a different place?

* It looks like \x0b and \x0c (vertical tab and form feed) were first 
considered line breaks in Python 2.7, probably related to this note from 
"What's New in 2.7": "The Unicode database provided by the unicodedata module 
is now used internally to determine which characters are numeric, whitespace, 
or represent line breaks." It might be worth putting a "changed in 2.7" note 
somewhere in the docs.

Please let me know of any thoughts you have and I'll be glad to make any 
desired changes and submit a new patch.

--
keywords: +patch
title: open() and codecs.open() treat form-feed differently -> linebreak 
sequences should be better documented
Added file: http://bugs.python.org/file23069/linebreakdoc.py27.patch

___
Python tracker 
<http://bugs.python.org/issue12855>
___



[issue12855] linebreak sequences should be better documented

2011-08-30 Thread Matthew Boehm

Matthew Boehm  added the comment:

I can fix the patch to list all the unicode line boundaries. The three places 
I've considered putting it are:

1. On the howto/unicode.html

2. Somewhere in the stdtypes.html#typesseq description (maybe with other notes 
at the bottom)

3. As a note to the stdtypes.html#str.splitlines method description (where it 
is in the previous patch.)

I can move it to any of these places if you think it's a better fit. I'll fix 
the list so that it's complete, add a note about \x0b and \x0c being added in 
2.7/3.2, and possibly reference it from StreamReader.readline.
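
One way to check that the list is complete is to ask splitlines() directly. This is a sketch for Python 3 only; the function name line_boundaries is made up, and the scan is limited to the BMP to keep it quick.

```python
def line_boundaries(limit=0x10000):
    """Return the code points below limit that str.splitlines() splits on."""
    found = []
    for cp in range(limit):
        # A boundary character turns "a?b" into two lines.
        if len(("a" + chr(cp) + "b").splitlines()) == 2:
            found.append(cp)
    return found


boundaries = line_boundaries()
```

Printing [hex(cp) for cp in boundaries] gives the list to compare against the documentation for whichever interpreter version is being patched.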

After confirming that my documentation matches the style guide, I'll make the 
docs, test the output, and upload a patch. I can do this for 2.7, 3.2 and 3.3 
separately.

Let me know if that sounds good and if you have any further thoughts. I should 
be able to upload new patches in 10 hours (after work today).

--

___
Python tracker 
<http://bugs.python.org/issue12855>
___



[issue12855] linebreak sequences should be better documented

2011-08-30 Thread Matthew Boehm

Matthew Boehm  added the comment:

I've attached a patch for 2.7 and will attach one for 3.2 in a minute.

I built the docs for both 2.7 and 3.2 and verified that there were no warnings 
and that the resulting web pages looked okay.

Things to consider:

* Placement of unicode.splitlines() method: I placed it next to str.splitlines. 
I didn't want to place it with the unicode methods further down because docs 
say "The following methods are present only on unicode objects"

* The docs for codecs.readlines() already mention "Line-endings are 
implemented using the codec’s decoder method and are included in the list 
entries if keepends is true." 

* Feel free to make any wording/style suggestions.

--
Added file: http://bugs.python.org/file23076/linebreakdoc.v2.py27.patch

___
Python tracker 
<http://bugs.python.org/issue12855>
___



[issue12855] linebreak sequences should be better documented

2011-08-30 Thread Matthew Boehm

Changes by Matthew Boehm :


Added file: http://bugs.python.org/file23077/linebreakdoc.v2.py32.patch

___
Python tracker 
<http://bugs.python.org/issue12855>
___



[issue2636] Adding a new regex module (compatible with re)

2011-09-01 Thread Matthew Barnett

Matthew Barnett  added the comment:

The regex module supports nested sets and set operations, eg. 
r"[[a-z]--[aeiou]]" (the letters from 'a' to 'z', except the vowels). This 
means that literal '[' in a set needs to be escaped.

For example, re module sees "[][()]..." as:

[      start of set
 ]     literal ']'
 [()   literals '[', '(', ')'
]      end of set
...    ...

but the regex module sees it as:

[      start of set
 ]     literal ']'
 [()]  nested set [()]
 ...   ...

Thus:

>>> s = u'void foo ( type arg1 [, type arg2 ] )'
>>> regex.sub(r'(?<=[][()]) |(?!,) (?!\[,)(?=[][(),])', '', s)
u'void foo ( type arg1 [, type arg2 ] )'
>>> regex.sub('(?<=[]\[()]) |(?!,) (?!\[,)(?=[]\[(),])', '', s)
u'void foo(type arg1 [, type arg2])'

If it can't parse it as a nested set, it tries again as a non-nested set (like 
re), but there are bound to be regexes where it could be either.
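
The re side of this can be checked with the standard library alone. The sketch below uses only re, not the regex module; the escaped form is the one that should not be readable as a nested set.

```python
import re

text = "f(x)[0]"

# In re, a ']' right after the opening '[' is a literal, and '[' inside a
# set is also a literal, so this set contains ']', '[', '(' and ')'.
unescaped = re.findall(r"[][()]", text)

# Escaping the inner '[' gives the same set without the ambiguity.
escaped = re.findall(r"[]\[()]", text)
```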

--

___
Python tracker 
<http://bugs.python.org/issue2636>
___



[issue2636] Adding a new regex module (compatible with re)

2011-09-01 Thread Matthew Barnett

Matthew Barnett  added the comment:

I think I need a show of hands.

Should the default be old behaviour (like re) or new behaviour? (It might be 
old now, new later.)

Should there be a NEW flag (as at present), or an OLD flag, or a VERSION 
parameter (0=old, 1=new, 2=?)?

--

___
Python tracker 
<http://bugs.python.org/issue2636>
___



[issue2636] Adding a new regex module (compatible with re)

2011-09-02 Thread Matthew Barnett

Matthew Barnett  added the comment:

The least disruptive change would be to have a NEW flag for the new behaviour, 
as at present, and an OLD flag for the old behaviour.

Currently the default is old behaviour, but in the future it will be new 
behaviour.

The differences would be:

Old behaviour                   : New behaviour
------------------------------- : ------------------------------
Global inline flags             : Positional inline flags
Can't split on zero-width match : Can split on zero-width match
Simple sets                     : Nested sets and set operations
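
As an illustration of the zero-width split row: the standard re module later adopted this behaviour (Python 3.7 and newer), so it can be shown without the regex module. The example string is the one used in the re documentation.

```python
import re

# \b matches the empty string at word boundaries, so every split point
# here is a zero-width match.
parts = re.split(r"\b", "Words, words, words.")
```

Under the old behaviour a pattern that can only match the empty string never splits, so the input comes back as a single item.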

The only change would be that nested sets wouldn't be supported in the old 
behaviour.

There are also additional escape sequences, eg \X is no longer treated as "X", 
but as they look like escape sequences you really shouldn't be relying on that. 
(It's similar to writing Windows paths in non-raw string literals: "\T" == 
"\\T", but "\t" == chr(9).)

--

___
Python tracker 
<http://bugs.python.org/issue2636>
___



[issue2636] Adding a new regex module (compatible with re)

2011-09-02 Thread Matthew Barnett

Matthew Barnett  added the comment:

So, VERSION0 and VERSION1, with "(?V0)" and "(?V1)" in the pattern?

--

___
Python tracker 
<http://bugs.python.org/issue2636>
___



[issue7951] Should str.format allow negative indexes when used for __getitem__ access?

2010-08-11 Thread Matthew Barnett

Matthew Barnett  added the comment:

I agree with Kamil and Germán. I would've expected negative indexes for 
sequences to work. Negative indexes for fields is a different matter.

--

___
Python tracker 
<http://bugs.python.org/issue7951>
___



[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-08-14 Thread Matthew Barnett

Matthew Barnett  added the comment:

issue2636-20100814.zip is a new version of the regex module.

I've added default Unicode word boundaries and renamed the Pattern and Match 
classes.

Over to you, Alex. :-)

--
Added file: http://bugs.python.org/file18532/issue2636-20100814.zip

___
Python tracker 
<http://bugs.python.org/issue2636>
___



[issue7255] "Default" word boundaries for Unicode data?

2010-08-14 Thread Matthew Barnett

Matthew Barnett  added the comment:

These have been added to the new 'regex' module. See issue #2636 or PyPI at:

http://pypi.python.org/pypi/regex

--
nosy: +mrabarnett

___
Python tracker 
<http://bugs.python.org/issue7255>
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue7255] "Default" word boundaries for Unicode data?

2010-08-15 Thread Matthew Barnett

Matthew Barnett  added the comment:

If you're on Windows (x86, 32-bit) then compilation isn't necessary - just use 
the appropriate _regex.pyd.

--

___
Python tracker 
<http://bugs.python.org/issue7255>
___



[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-08-15 Thread Matthew Barnett

Matthew Barnett  added the comment:

issue2636-20100816.zip is a new version of the regex module.

Unfortunately I came across a bug in the handling of sets. More unit tests added.

--
Added file: http://bugs.python.org/file18541/issue2636-20100816.zip

___
Python tracker 
<http://bugs.python.org/issue2636>
___



[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-08-23 Thread Matthew Barnett

Matthew Barnett  added the comment:

issue2636-20100824.zip is a new version of the regex module.

More speedups. Getting towards Perl speed now, depending on the regex. :-)

--
Added file: http://bugs.python.org/file18621/issue2636-20100824.zip

___
Python tracker 
<http://bugs.python.org/issue2636>
___



[issue9802] Document 'stability' of builtin min() and max()

2010-09-08 Thread Matthew Woodcraft

New submission from Matthew Woodcraft :

In CPython, the builtin max() and min() have the property that if there are 
items with equal keys, the first item is returned. From a quick look at their 
source, I think this is true for Jython and IronPython too.

I propose making this a documented guarantee.

On Python-dev, Raymond Hettinger said:
<<
That seems like a reasonable request. This behavior has been around for a very 
long time and is unlikely to change. Elsewhere, we've made efforts to document sort 
stability (i.e. sorted(), heapq.nlargest(), heapq.nsmallest, merge(), etc).
>>
(<http://mail.python.org/pipermail/python-dev/2010-September/103543.html>)

I'm attaching a patch with a concrete suggestion for a change to
functions.rst, modelled on the documentation of heapq.nlargest().
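
A short sketch of the property being proposed, with made-up data. The comparison with sorted() mirrors how heapq.nlargest() is documented.

```python
from operator import itemgetter

records = [("a", 1), ("b", 3), ("c", 3), ("d", 1)]
by_score = itemgetter(1)

largest = max(records, key=by_score)   # first of the items tied on 3
smallest = min(records, key=by_score)  # first of the items tied on 1

# The same items come first in a stable sort.
same_as_sorted = (
    largest == sorted(records, key=by_score, reverse=True)[0]
    and smallest == sorted(records, key=by_score)[0]
)
```

The guarantee matters when the items are distinct objects with equal keys, as with the tuples here.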

--
assignee: d...@python
components: Documentation
files: maxmin.patch
keywords: patch
messages: 115892
nosy: d...@python, mattheww
priority: normal
severity: normal
status: open
title: Document 'stability' of builtin min() and max()
type: feature request
Added file: http://bugs.python.org/file18802/maxmin.patch

___
Python tracker 
<http://bugs.python.org/issue9802>
___



[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-09-11 Thread Matthew Barnett

Matthew Barnett  added the comment:

issue2636-20100912.zip is a new version of the regex module.

More speedups. I've been comparing the speed against Perl wherever possible. In 
some cases Perl is lightning fast, probably because regex is built into the 
language and it doesn't have to parse method arguments (for some short regexes 
a large part of the processing time is spent in PyArg_ParseTupleAndKeywords!). 
In other cases, where it has to use Unicode codepoints outside the 8-bit range, 
or character properties such as \p{Alpha}, its performance is simply appalling! 
:-)

--
Added file: http://bugs.python.org/file18854/issue2636-20100912.zip

___
Python tracker 
<http://bugs.python.org/issue2636>
___



[issue9802] Document 'stability' of builtin min() and max()

2010-09-12 Thread Matthew Woodcraft

Matthew Woodcraft  added the comment:

> (1) Shouldn't 'reverse=True' be omitted in the second doc
> addition?

Yes, of course, sorry.

> (2) I'd also suggest adding a brief comment about what this
> means for distinct, but equal, objects; otherwise it's not
> really obvious what the point of the doc addition is.

> (3) As a matter of clarity, perhaps replace "this is" with
> "max(iterable, key=key) is", and similarly for min.

I've attached a new patch incorporating these suggestions.
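
To illustrate point (2), here is a small sketch (not taken from the patch itself) of why the guarantee matters for objects that compare equal but are distinct. It assumes only the first-item behaviour described in this issue:

```python
# 1 and 1.0 compare equal but are different objects of different types,
# so which one max()/min() returns is observable.
int_first = [1, 1.0]
float_first = [1.0, 1]

max_int_first = max(int_first)
max_float_first = max(float_first)
min_int_first = min(int_first)
min_float_first = min(float_first)

# The type of the result shows that the first of the equal items was chosen.
print(type(max_int_first).__name__, type(max_float_first).__name__)
print(type(min_int_first).__name__, type(min_float_first).__name__)
```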

--
Added file: http://bugs.python.org/file18858/functions.rst.patch

___
Python tracker 
<http://bugs.python.org/issue9802>
___



[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-09-12 Thread Matthew Barnett

Matthew Barnett  added the comment:

Another flag? Hmm.

How about this instead: if a scoped flag appears at the end of a regex (and 
would therefore normally have no effect) then it's treated as though it's at 
the start of the regex. Thus:

foo(?i)

is treated like:

(?i)foo
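
The rewrite suggested above can be sketched as a plain string transformation. This is only an illustration of the idea, not how the regex module implements it; `hoist_trailing_flags` is a hypothetical helper name, and the detection is deliberately naive (it only looks for a flag group at the very end of the pattern text):

```python
import re

# Matches an inline flag group such as "(?i)" or "(?sx)" at the very end
# of a pattern string. Naive: it does not parse the pattern properly.
_TRAILING_FLAGS = re.compile(r"\(\?[aiLmsux]+\)\Z")


def hoist_trailing_flags(pattern):
    """Move a trailing inline flag group to the start of the pattern.

    Hypothetical helper for illustration only.
    """
    found = _TRAILING_FLAGS.search(pattern)
    if found is None:
        return pattern
    return found.group(0) + pattern[:found.start()]


rewritten = hoist_trailing_flags("foo(?i)")
unchanged = hoist_trailing_flags("a.*(?s)b")  # flag is not at the end
print(rewritten, unchanged)
```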

--

___
Python tracker 
<http://bugs.python.org/issue2636>
___



[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-09-12 Thread Matthew Barnett

Matthew Barnett  added the comment:

The tests for re include these regexes:

a.b(?s)
a.*(?s)b

I understand what Georg said previously about some people preferring to put 
inline flags at the end, but I personally wouldn't do that, because some regex 
implementations support scoped inline flags, although others, like re, don't.

I think that second regex is a bit perverse, though! :-)
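
For readers unfamiliar with the distinction, here is a minimal sketch of global versus scoped inline flags. It uses the `(?s:...)` group syntax that the standard re module gained later (Python 3.6 onwards), so it illustrates the concept rather than the regex module discussed in this issue:

```python
import re

# Global inline flag at the start: DOTALL applies to every "." in the pattern.
global_hit = re.fullmatch(r"(?s)a.b.c", "a\nb\nc")

# Scoped flag: DOTALL applies only inside the (?s:...) group,
# so the second "." still refuses to match a newline.
scoped_miss = re.fullmatch(r"a(?s:.)b.c", "a\nb\nc")
scoped_hit = re.fullmatch(r"a(?s:.)b.c", "a\nb-c")

print(global_hit is not None, scoped_miss is None, scoped_hit is not None)
```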

On the other matter, I could make the Unicode script and block available 
through a couple of functions if you need them, e.g.:

# Using Python 3 here
>>> regex.script("A")
'Latin'
>>> regex.block("A")
'BasicLatin'

--

___
Python tracker 
<http://bugs.python.org/issue2636>
___



[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-09-12 Thread Matthew Barnett

Matthew Barnett  added the comment:

OK, so would it be OK if there was, say, a NEW (N) flag which made the inline 
flags (?flags) scoped and allowed splitting on zero-width matches?
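
As a sketch of what splitting on zero-width matches means in practice: the standard re module later adopted this behaviour unconditionally (Python 3.7 onwards), so the example below uses it rather than the proposed NEW flag, whose exact spelling is not settled here:

```python
import re

# A lookahead matches without consuming any characters (a zero-width match),
# so the split happens just before each capital letter.
parts = re.split(r"(?=[A-Z])", "splitOnCaps")
print(parts)
```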

--

___
Python tracker 
<http://bugs.python.org/issue2636>
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com


