Matthew Barnett added the comment:
Python takes a long way round when converting strings to int. It does the
following (I'll be talking about Python 3.3 here):
1. In function 'fix_decimal_and_space_to_ascii', the different kinds of spaces
are converted to " " and the
Matthew Barnett added the comment:
It occurred to me that the truncation of the string when building the error
message could cause a UnicodeDecodeError:
>>> int("1".ljust(199) + "\u0100")
Traceback (most recent call last):
File "", line
Matthew Barnett added the comment:
I've attached a patch.
It now reports an invalid literal as-is:
>>> int("#\N{ARABIC-INDIC DIGIT ONE}")
Traceback (most recent call last):
File "", line 1, in
int("#\N{ARABIC-INDIC DIGIT ONE}")
ValueError:
Matthew Barnett added the comment:
I've attached a small additional patch for truncating the UTF-8.
I don't know whether it's strictly necessary, but I don't know that it's
unnecessary either! (Better safe than sorry.)
--
Added file: http://bugs.python.org/fil
Matthew Barnett added the comment:
The semantics of '^' are common to many different regex implementations,
including those of Perl and C#.
The 'pos' argument merely gives the starting position of the search (C# also lets
you provide a starting position, and behaves in
Matthew Barnett added the comment:
I've attached a patch.
--
keywords: +patch
Added file: http://bugs.python.org/file28614/issue13899.patch
___
Python tracker
<http://bugs.python.org/is
Matthew Barnett added the comment:
I've attached my attempt at a patch.
--
keywords: +patch
Added file: http://bugs.python.org/file28744/issue9669.patch
Matthew Barnett added the comment:
Lines 1000 and 1084 will be a problem only if you're near the top of the
address space. This is because:
1. ctx->pattern[1] will always be <= ctx->pattern[2].
2. A value of 65535 in ctx->pattern[2] means unlimited, even though SRE_CODE i
Matthew Barnett added the comment:
You're checking "int offset", but what happens with "unsigned int offset"?
Matthew Barnett added the comment:
IMHO, I don't think that MAXREPEAT should be defined in sre_constants.py _and_
SRE_MAXREPEAT defined in sre_constants.h. (In the latter case, why is it in
decimal?)
I think that it should be defined in one place, namely sre_constants.h, perhaps
as:
#d
Matthew Barnett added the comment:
I've attached a patch.
--
Added file: http://bugs.python.org/file28955/issue16203_mrab.patch
Matthew Barnett added the comment:
3 of the tests expect None when using 'fullmatch'; they won't return None when
using 'match'.
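A minimal sketch of the difference, using a made-up pattern and input rather than any of the 3 tests in question (re.fullmatch assumes Python 3.4 or later):

```python
import re

pattern = r"ab"  # hypothetical pattern, not taken from the test suite

# match() only anchors at the start, so trailing text is allowed.
prefix_match = re.match(pattern, "abc")

# fullmatch() has to consume the whole string, so the same input fails.
whole_match = re.fullmatch(pattern, "abc")

print(prefix_match is not None)  # True
print(whole_match)               # None
```

A test that expects None from 'fullmatch' for an input like this would get a match object back from 'match'.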
Matthew Barnett added the comment:
These are the ones that I think are wrong:
Doc/c-api/long.rst:206
Return a C :c:type:`size_t` representation of of *pylong*. *pylong* must be
Doc/c-api/long.rst:218
Return a C :c:type:`unsigned PY_LONG_LONG` representation of of *pylong*.
Doc
Matthew Barnett added the comment:
It does look like a duplicate to me.
--
<http://bugs.python.org/issue17184>
Matthew Barnett added the comment:
The behaviour is correct.
Here's a summary of what's happening:-
First iteration of the repeated group:
Try the first branch. Can match "a".
Second iteration of the repeated group:
Try the first branch. Can't match "
Matthew Barnett added the comment:
The bytestring literal isn't valid. It starts with b" and later on has an
unescaped " followed by more characters.
Also, the usual way to decode is by using the .decode method.
I get this:
>>> content = b"+1911\' rel=\'st
New submission from Matthew Earl:
datetime.datetime.strptime() without a year fails on Feb 29 with:
>>> datetime.datetime.strptime("Feb 29", "%b %d")
Traceback (most recent call last):
File "", line 1, in
File "/auto/ensoft-sjc/thirdpar
Matthew Earl added the comment:
Out of interest, what's the reason for accepting the time.strptime() version as
a bug, but not datetime.datetime.strptime()? Is it that time.strptime() is
meant to be a simple parsing from string to tuple (with minimal checks),
wh
Matthew Barnett added the comment:
The traceback says "bad character range" because ord('+') == 43 and ord('*') ==
42. It's not surprising that it complains if the range isn't valid.
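To illustrate, here is a small sketch. The pattern "[+-*]" is a hypothetical stand-in for whatever the reporter's pattern contained, and the compiles() helper is just for the example:

```python
import re

# A character range has to run from a lower code point to a higher one.
print(ord("+"), ord("*"))  # 43 42


def compiles(pattern):
    """Return True if re accepts the pattern, False if it raises re.error."""
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True


reversed_range = compiles(r"[+-*]")   # '+' (43) down to '*' (42): rejected
escaped_hyphen = compiles(r"[+\-*]")  # escaped hyphen: three literal characters
hyphen_last = compiles(r"[+*-]")      # hyphen at the end is also literal

print(reversed_range, escaped_hyphen, hyphen_last)
```

Escaping the hyphen, or moving it to the end of the set, makes it a literal instead of a range operator.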
Matthew Barnett added the comment:
Works for me: Python 2.7.5, 64-bit, Windows 8.1
--
nosy: +mrabarnett
Matthew Barnett added the comment:
I don't know that it's not needed.
--
<http://bugs.python.org/issue16203>
Matthew Barnett added the comment:
This issue is best posted to python-list and only posted here if it's agreed
that it's a bug.
Anyway:
1. You have "self.flows" and "flows", but haven't said what they are.
2. It's recommended that you don't modi
New submission from Matthew Bergin:
[level@ fuzz]# cat pyfile.py
import bz2
obj = bz2.BZ2File('/tmp/fileName')
obj.__init__("fileName")
obj.__reduce__
[level@ fuzz]# gdb --args python pyfile.py
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-60.el6_4.1)
Copyright (C) 2010 Free
New submission from Matthew Bergin:
[level@ fuzz]# cat PyCFunction.py
#
# PyCFunction_NewEx crach poc (sigabrt)
#
import imageop
imageop.rgb82rgb(u"%J8CBej >uFBi-",True,8.36)
imageop.grey2grey(None,5,u"CRi")
[level@ fuzz]# gdb --args python PyCFunction.py
GNU gdb (GDB) R
Changes by Matthew Bergin :
--
type: -> crash
<http://bugs.python.org/issue19878>
Matthew Bergin added the comment:
I was fuzzing the interpreter; otherwise it would init itself.
--
<http://bugs.python.org/issue19878>
Matthew Bergin added the comment:
I am going to test it against 2.7 a little later on this afternoon.
I typically host all of the code I write at https://github.com/levle but atm
the github repo I use to host the project is private. Once I work out some of
the kinks I will set it to Public
Matthew Bergin added the comment:
Sweet, I will check it out
--
<http://bugs.python.org/issue19879>
New submission from Matthew Gilson:
Reading the source for collections.Counter.most_common, the docstring mentions
that `n` can be `None` or omitted, but the online documentation does not
mention that `n` can be `None`.
--
assignee: docs@python
components: Documentation
messages
Matthew Gilson added the comment:
This is a very simple patch which addresses the issue. I am still curious
whether the reported function signature should be changed from:
.. method:: most_common([n])
to:
.. method:: most_common(n=None)
Any thoughts?
Also, while I was in there
Matthew Barnett added the comment:
Lookarounds can contain capture groups:
>>> import re
>>> re.search(r'a(?=(.))', 'ab').groups()
('b',)
>>> re.search(r'(?<=(.))b', 'ab').groups()
('a',)
so lookaro
Matthew Barnett added the comment:
Lookarounds can capture, but they don't consume. That lookbehind is matching
the same part of the string every time.
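A short sketch of "capture but don't consume", with made-up patterns rather than the one from the report:

```python
import re

# The lookahead captures 'ab', but the overall match is still empty.
m = re.match(r"(?=(ab))", "ab")
captured = m.group(1)
consumed = m.end()

# Two lookbehinds at the same position both look at the same character.
m2 = re.search(r"(?<=(a))(?<=(a))b", "ab")
both_groups = m2.groups()
matched_span = m2.span()

print(captured, consumed)         # ab 0
print(both_groups, matched_span)  # ('a', 'a') (1, 2)
```

Because the position never advances inside the lookbehind, repeating it just re-examines the same text.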
Matthew Barnett added the comment:
Match objects have a .groups method:
>>> import re
>>> m = re.match(r'(\w+):(\w+)', 'qwerty:asdfgh')
>>> m.groups()
('qwerty', 'asdfgh')
>>> k, v = m.groups()
>>> k
'
Matthew Barnett added the comment:
In a regex, '+' is a metacharacter meaning "repeated one or more times".
"libstdc+" will match "libstd" followed by "c" repeated one or more times.
"libstdc++" will match "libstd"
Matthew Barnett added the comment:
For comparison:
Python 3.1.3:
[(b'',)]
Python 3.2.5:
[(None,)]
Python 3.3.5:
[(b'',)]
Python 3.4.1:
sqlite3.OperationalError: trigger cannot use variables
--
nosy: +mrabarnett
Matthew Barnett added the comment:
> re:Cannot process flags argument with a compiled pattern
> regex: can't process flags argument with a compiled pattern
Error messages usually start with a lowercase letter, and I think that all the
other ones in the re module do.
By the wa
Matthew Barnett added the comment:
The support for locales in the re module is limited to those with 1 byte per
character, and only for a few properties (those provided by the underlying C
library), so maybe it could do the following:
If the LOCALE flag is set, then read the current locale
Matthew Barnett added the comment:
When you look up the pattern in the cache, include the current locale as part of
the key if the pattern is locale-sensitive (you can let it be None if the
pattern is not locale-sensitive).
Matthew Barnett added the comment:
@Serhiy: You're overlooking that the LOCALE flag could be inline, e.g.
r'(?L)\w+'.
Basically, if you've seen the pattern before, you know whether it has an inline
LOCALE flag; if you haven't seen the pattern before, you'll need
Matthew Barnett added the comment:
In the regex module, I borrowed the \g<...> escape from .sub's replacement
string to provide an alternative way to refer to a group in a pattern, and that
let me remove the limit.
Matthew Barnett added the comment:
For reference, the regex module normally considers the line ending to be '\n',
but it has a WORD flag ('(?w)') that turns on the Unicode definition of a
'word' character as
Matthew Barnett added the comment:
After some thought, I've come to the conclusion that the GCD of two integers
should be negative only if both of those integers are negative. The basic
algorithm is that you find all of the prime factors of the integers and then
return the product o
Matthew Barnett added the comment:
As it appears that there isn't general agreement on how to calculate the GCD
when negative numbers are involved, I needed to look for another way of
thinking about it.
Splitting off the sign as another factor was what I came up with.
Pragmatism beats p
Matthew Barnett added the comment:
+1 for leaving it to the user to make it negative if so desired.
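For illustration, this is how it looks with math.gcd as found in current Python 3 (3.5 and later), which returns a non-negative result whatever the signs of the arguments. This says nothing about what the function under discussion did at the time:

```python
import math

# The result is non-negative even when both inputs are negative.
result = math.gcd(-4, -6)

# A caller who wants the negative form negates it explicitly.
negated = -math.gcd(-4, -6)

print(result, negated)  # 2 -2
```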
Matthew Barnett added the comment:
There's an interesting bit of history here:
http://www.gossamer-threads.com/lists/python/dev/236584
Matthew Barnett added the comment:
I prefer to include the line and column numbers if it's a multi-line pattern,
not just if the line number is > 1.
BTW, it's shorter if you do this:
self.colno = pos - pattern.rfind(newline, 0, pos)
If there's no newline, .rf
Matthew Barnett added the comment:
It takes a long time due to excessive backtracking.
The regex implementation on PyPI finishes quickly because it contains some
extra logic to reduce the chances of that happening, but it could be tricky
trying to incorporate that into the existing re module
Matthew Barnett added the comment:
I don't know of any regex implementation that lets you do that.
--
type: behavior -> enhancement
Matthew Barnett added the comment:
Yes.
If it's not a valid repeat, then it's treated as a literal.
Perl does the same.
By the way, "\1" isn't a group reference; it's the same as "\x01". You should
be either doubling the backslashes ("
Matthew Iversen added the comment:
Hi, I'm wondering why this branch was never merged in?
AFAIK, it's roundabout here -
http://hg.python.org/cpython/log/28e4cd8fd864/Lib/packaging/command/upload.py
It'd be great to have distutils submit forms that are compliant with the MIME
Matthew Iversen added the comment:
Sorry, I referenced http://bugs.python.org/issue12169 before.
distutils multipart/form-data encoding still breaks the spec for MIME, which
demands CRLF line endings.
Especially since it is now sending HTTP 1.1 requests which should conform.
The patch
Matthew Barnett added the comment:
issue2636-20090726.zip is a new implementation of the re engine. It
replaces re.py, sre.py, sre_constants.py, sre_parse.py and
sre_compile.py with a new re.py and replaces sre_constants.h, sre.h and
_sre.c with _re.h and _re.c.
The internal engine no longer
Matthew Barnett added the comment:
issue2636-20090727.zip contains regex.py, _regex.h, _regex.c and also
_regex.pyd (for Python 2.6 on Windows). For Windows machines just put
regex.py and _regex.pyd into Python's Lib\site-packages folder. I've
changed the name so that it won
Matthew Barnett added the comment:
issue2636-20090729.zip contains regex.py, _regex.h, _regex.c which will
work with Python 2.5 as well as Python 2.6, and also 2 builds of
_regex.pyd (for Python 2.5 and Python 2.6 on Windows).
This version supports accessing the capture groups by subscripting
Changes by Matthew Barnett :
Removed file: http://bugs.python.org/file14592/issue2636-20090729.zip
<http://bugs.python.org/issue2636>
Matthew Barnett added the comment:
Unfortunately I found a bug in regex.py, caused when I made it
compatible with Python 2.5. :-(
issue2636-20090729.zip is now corrected.
--
Added file: http://bugs.python.org/file14594/issue2636-20090729.zip
New submission from Matthew Russell :
Not sure if this should be (tentative) feature request or behavior...
It might help newcomers and those preparing to port to Python 3.
--
assignee: georg.brandl
components: 2to3 (2.x to 3.0 conversion tool), Documentation, Interpreter Core
Changes by Matthew Russell :
--
title: Depricate iterable.next in Python > 2.6.x when called with -3 option ->
Deprecate iterable.next in Python > 2.6.x when called with -3 option ?
Matthew Barnett added the comment:
I'd like to suggest that the output could/should be encoded in UTF-8.
--
nosy: +mrabarnett
Matthew Barnett added the comment:
I was thinking that if you're converting a Python 2.x script to Python
3.x using 2to3 then also encoding the new script in UTF-8 might be a
good idea.
Matthew Barnett added the comment:
issue2636-20090804.zip is a new version of the regex module.
The memory leak has been fixed.
--
Added file: http://bugs.python.org/file14642/issue2636-20090804.zip
Matthew Barnett added the comment:
In a regular expression (...) will group and capture, whereas (?:...)
will only group and not capture.
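A quick sketch with a made-up pattern (not the one from the report):

```python
import re

# (...) groups and captures: the last repetition is stored in group 1.
capturing = re.match(r"(ab)+c", "ababc")

# (?:...) groups for the repeat but stores nothing.
non_capturing = re.match(r"(?:ab)+c", "ababc")

print(capturing.groups())      # ('ab',)
print(non_capturing.groups())  # ()
```

Both patterns match the same text; only the contents of .groups() differ.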
--
nosy: +mrabarnett
Matthew Barnett added the comment:
issue2636-20090810.zip should fix the empty-string bug.
--
Added file: http://bugs.python.org/file14682/issue2636-20090810.zip
Matthew Barnett added the comment:
issue2636-20090810#2.zip has some further improvements and bugfixes.
--
Added file: http://bugs.python.org/file14683/issue2636-20090810#2.zip
Matthew Barnett added the comment:
issue2636-20090810#3.zip adds more Unicode character properties such as
"\p{Lowercase_Letter}", and also Unicode script ranges.
In addition, the 'findall' method now accepts an 'overlapped' argument
for finding o
Matthew Barnett added the comment:
issue2636-20090815.zip fixes the bugs found in msg91598 and msg91607.
The regex engine currently lacks some of the optimisations that the re
engine has, but I've concluded that even with them the extra work that
the engine needs to do to make it ea
Matthew Barnett added the comment:
"(?![a-z0-9])" is a negative lookahead, so "(?![a-z0-9])0" is saying
that the next character shouldn't be any of [a-z0-9], yet it should
match "0". Hence, no matches.
--
nosy: +mrabarnett
Matthew Barnett added the comment:
Instead of a new flag, a '*' could be put after the quantifier, eg:
(\d+)(?:\.(\d+)){3}*
MatchObject.group(1) would be a string and MatchObject.group(2) would be
a list of strings.
The group references could be \g<1>, \g<2:0>, \g<
Matthew Barnett added the comment:
I'm still tinkering with my regex engine (issue #2636).
Some timings:
re.compile(r'(\s+.*)*x').search('a ' * 25)
20.23secs
regex.compile(r'(\s+.*)*x').search('a ' * 25)
0.10secs
Matthew Barnett added the comment:
Surely this is to be expected when working with bytestrings. You should
be working in Unicode and using UTF-8 only for input and output.
--
nosy: +mrabarnett
Matthew Barnett added the comment:
The problem with the shorthand form is that the generators use the
values that are bound to 'a' and 'p' when they are iterated, not when
they are created. You can test this by inserting:
a = "X"
just before the assert: y
Matthew Barnett added the comment:
issue2636-20100116.zip is a new version of the regex module.
I've given up on the breadth-wise matching - it was too difficult finding a
pattern structure that would work well for both depth-first and breadth-wise.
It probably still needs some tweak
Matthew Barnett added the comment:
"[9-A]" is equivalent to "[9:;<=>?...@a]", or should be.
It'll be fixed in issue #2636.
Matthew Barnett added the comment:
issue2636-features-3.diff is based on the 2.x trunk.
Added comments.
Restricted line lengths to no more than 80 characters.
Added common POSIX character classes like [[:alpha:]].
Added further checks to reduce unnecessary backtracking.
I've decided to r
Matthew Barnett added the comment:
issue2636-features-4.diff includes:
Bugfixes
msg74203: duplicate capture group numbers
msg74904: duplicate capture group names
Added file: http://bugs.python.org/file13185/issue2636-features-4.diff
Matthew Barnett added the comment:
The definition of a word in the new re module (actually targeted at
Python 2.7) is currently a sequence of L&, N&, M& and Pc.
I suppose ideally we want the definitions of a word and an identifier to
be basically the same, except that an iden
Matthew Barnett added the comment:
The usual trick is to append "_":
xhtmlNode('div',class_='sidebar')
Could you modify the function to remove the trailing "_"?
--
nosy: +mrabarnett
Matthew Barnett added the comment:
The normal use of a keyword argument is to refer to a formal argument,
which is an identifier. Being able to wrap it up into a dict is a later
addition, and it's necessary to turn the identifier into a string
because it's not possible to use a bar
Matthew Barnett added the comment:
issue2636-features-5.diff includes:
Bugfixes
Added \G anchor (from Perl).
\G is the anchor at the start of a search, so re.search(r'\G(\w)') is
the same as re.match(r'(\w)').
re.findall normally performs a series of searches, eac
Matthew Barnett added the comment:
As part of issue #2636 group references now work in lookbehinds.
However, your example:
(?<=(...)\1)abc
will fail but:
(?<=\1(...))abc
will succeed.
Why? Well, in lookbehinds it searches backwards. In the first regex it
sees the group ref
Matthew Barnett added the comment:
issue2636-features-6.diff includes:
Bugfixes
Added group access via subscripting.
>>> m = re.search("(\D*)(?\d+)(\D*)", "abc123def")
>>> len(m)
4
>>> m[0]
'abc123def'
>>> m[1]
'abc'
Matthew Barnett added the comment:
At the moment binding occurs either right-to-left with "=", eg.
x = y
where "x" is the new name, or left-to-right, eg.
import x as y
where "y" is the new name.
If the order is to be right-to-left then using "a
Matthew Barnett added the comment:
Just for the record, I wasn't happy with "~=" either, and I have no
problem with just forgetting the whole idea.
Matthew Barnett added the comment:
An additional feature that could be borrowed, though in slightly
modified form, from Perl is case-changing controls in replacement
strings. Roughly the idea is to add these forms to the replacement string:
\g<1> provides capture group 1
Matthew Barnett added the comment:
Ah, too Perlish! :-)
Another feature request that I've decided not to consider any further is
recursive regular expressions. There are other tools available for that
kind of thing, and I don't want the re module to go the way of Perl 6's
rul
Matthew Barnett added the comment:
There are 2 reasons:
1. I've been told that my current patches contain too many differences
from the current implementation, so basically I have to go back to the
start and introduce any changes a little at a time, without knowing
whether any parti
Matthew Barnett added the comment:
Patch issue2636-patch-1.diff contains a stripped down version of my
regex engine and the other changes that are necessary to make it work.
--
Added file: http://bugs.python.org/file13449/issue2636-patch-1.diff
Matthew Barnett added the comment:
FYI, I did tidy up the class and add a 'scaniter' method when I was
working on issue #2636; it might yet see the light of day if it gets the
go ahead!
--
nosy: +mrabarnett
Matthew Barnett added the comment:
I implemented \p, \P and [:...:] for the simple categories (eg "Lu" and
"upper", but not "IsGreek") in the work I did for issue #2636.
--
nosy: +mrabarnett
Matthew Barnett added the comment:
One of the limitations is that it identifies what matched by using
capture groups, so if the expressions provided contain captures then it
gets confused! :-)
I handled that by 1) rejecting named captures and 2) changing unnamed
captures into non-captures
New submission from Matthew Barnett :
Patch idle-args.diff adds a dialog for entering command-line arguments
for a script from within IDLE itself.
--
components: IDLE
files: idle-args.diff
keywords: patch
messages: 85341
nosy: mrabarnett
severity: normal
status: open
title: Command-line
Matthew Barnett added the comment:
What do you mean "towards the end of the file"? What are the offsets of
the two lines? (I'm thinking it might be something to do with the \r\n
lying across a boundary, such as the 4GB boundary.)
--
no
New submission from Matthew Ahrens :
The "errno" module does not contain some error names/numbers that are
used on Solaris. Please add them.
from /usr/include/sys/errno.h:
#define ECANCELED 47 /* Operation canceled */
#define ENOTSUP 48 /* Operation not
Matthew Barnett added the comment:
Try issue2636-patch-2.diff.
--
Added file: http://bugs.python.org/file13707/issue2636-patch-2.diff
Matthew Iversen added the comment:
Skip, you were arguing in another csv issue on a NamedTupleReader that
the Reader and Writer should work in concert together.
Certainly, making this default functionality for DictWriter would
definitely make it work more in concert with DictReader.
A sample
Matthew Smart added the comment:
Woo!
On May 1, 2009 5:48 PM, "Gregory P. Smith" wrote:
Gregory P. Smith added the comment:
I merged ipaddr into py3k.
I can't lookup the revision number (r72186?) at the moment since
svn.python.org is having problems.
anyways, thanks pmoody
Matthew Barnett added the comment:
How about a 'full' form and a 'key' form generated by the function:
def codec_key(name):
    return name.lower().replace("-", "").replace("_", "")
The key form would be the key to an available code
Matthew Barnett added the comment:
Well, there are multiple UTF encodings, so no to "utf".
Are there multiple Latin encodings? Not in Python 2.6.2 under those names.
I'd probably insist on names that are strictish(?), ie correct, give o
New submission from Matthew Wilson :
I do this kind of thing a lot:
>>> from datetime import timedelta
>>> td = timedelta(days=2, seconds=14)
>>> total_duration_in_seconds = td.days * 24 * 60 * 60 + td.seconds
I would like to have a property on the time
Matthew Barnett added the comment:
I agree that it's a bug.
A workaround is r'([xy])(?:\s{0,65534}\1)+'. A repeat of 65535 is
treated as unlimited (but no warning is given).
--
nosy: +mrabarnett