[issue3825] Major reworking of Python 2.5.2 re module

2010-08-21 Thread Georg Brandl
Georg Brandl added the comment: Work has gone on in #2636. -- nosy: +georg.brandl resolution: -> duplicate status: open -> closed superseder: -> Regexp 2.7 (modifications to current re 2.2.2)

[issue3825] Major reworking of Python 2.5.2 re module

2008-09-24 Thread Jeffrey C. Jacobs
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment: Matthew, I am really happy that you are making such progress on your engine, but can I PLEASE ask you to slow down for a moment? We have a lot of issues already listed in issue 2636 that is a catch-all for any Python 2.7 Regexp improvemen

[issue3825] Major reworking of Python 2.5.2 re module

2008-09-23 Thread Matthew Barnett
Matthew Barnett <[EMAIL PROTECTED]> added the comment: Patch regex_2.6rc2+6.diff is a bugfix. Added file: http://bugs.python.org/file11587/regex_2.6rc2+6.diff

[issue3825] Major reworking of Python 2.5.2 re module

2008-09-23 Thread Matthew Barnett
Matthew Barnett <[EMAIL PROTECTED]> added the comment: Patch regex_2.6rc2+5.diff adds scoped and 'negative' flags for (?i), (?m) and (?s). The other flags remain unchanged in behaviour. See #433024, #433027 and #433028. Added file: http://bugs.python.org/file11585/regex_2.6rc2+5.diff
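For readers on a current interpreter: the standard-library re module has accepted scoped flag groups of the form (?i:...) and (?-i:...) since Python 3.6. The sketch below uses that later syntax only to illustrate what "scoped" and "negative" flags mean. Whether the syntax in regex_2.6rc2+5.diff is identical is an assumption, and the sample strings are made up for illustration.

```python
import re

# Scoped flag: case-insensitivity applies only inside the (?i:...) group.
scoped = re.compile(r"(?i:hello) World")
scoped_hit = scoped.match("HELLO World") is not None    # "World" matches exactly
scoped_miss = scoped.match("HELLO world") is not None   # "world" is outside the scope

# 'Negative' flag: IGNORECASE is on for the whole pattern,
# but switched off again inside the (?-i:...) group.
negative = re.compile(r"hello (?-i:World)", re.IGNORECASE)
negative_hit = negative.match("HELLO World") is not None
negative_miss = negative.match("hello WORLD") is not None

print(scoped_hit, scoped_miss, negative_hit, negative_miss)
```

Only the text inside each group is affected, so the rest of the pattern keeps whatever flags it was compiled with.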

[issue3825] Major reworking of Python 2.5.2 re module

2008-09-22 Thread Matthew Barnett
Matthew Barnett <[EMAIL PROTECTED]> added the comment: Correction of regex_2.6rc2+4.diff. (Aargh!) Added file: http://bugs.python.org/file11559/regex_2.6rc2+4.diff

[issue3825] Major reworking of Python 2.5.2 re module

2008-09-22 Thread Matthew Barnett
Changes by Matthew Barnett <[EMAIL PROTECTED]>: Removed file: http://bugs.python.org/file11558/regex_2.6rc2+4.diff

[issue3825] Major reworking of Python 2.5.2 re module

2008-09-22 Thread Matthew Barnett
Matthew Barnett <[EMAIL PROTECTED]> added the comment: regex_2.6rc2+4.diff fixes the ordering of the capture groups for reverse searching. Added file: http://bugs.python.org/file11558/regex_2.6rc2+4.diff

[issue3825] Major reworking of Python 2.5.2 re module

2008-09-21 Thread Matthew Barnett
Matthew Barnett <[EMAIL PROTECTED]> added the comment: regex_2.6rc2+3.diff adds reverse searching with the re.REVERSE/re.R and "(?r)" flag. This gives results such as:
>>> re.findall("(\w+)", "one two three")
['one', 'two', 'three']
>>> re.findall("(?r)(\w+)", "one two three")
['three', 'two',
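The re.REVERSE flag comes from the patch and is not part of the standard-library re module. As a rough approximation only, the word-splitting example can be imitated with stock re by reversing the left-to-right matches. The helper name findall_reversed is hypothetical, and a genuine right-to-left engine may pick different matches when candidates overlap.

```python
import re


def findall_reversed(pattern, text):
    """Return the ordinary left-to-right matches in reverse order.

    This is NOT true reverse searching: matching still runs forwards,
    and only the order of the results is flipped.
    """
    return list(reversed(re.findall(pattern, text)))


print(findall_reversed(r"(\w+)", "one two three"))
```

For non-overlapping tokens such as whitespace-separated words the two approaches agree, which is why this simple case works as an illustration.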

[issue3825] Major reworking of Python 2.5.2 re module

2008-09-21 Thread Matthew Barnett
Matthew Barnett <[EMAIL PROTECTED]> added the comment: Needed to correct regex_2.6rc2+2.diff. Added file: http://bugs.python.org/file11553/regex_2.6rc2+2.diff

[issue3825] Major reworking of Python 2.5.2 re module

2008-09-21 Thread Matthew Barnett
Changes by Matthew Barnett <[EMAIL PROTECTED]>: Removed file: http://bugs.python.org/file11552/regex_2.6rc2+2.diff

[issue3825] Major reworking of Python 2.5.2 re module

2008-09-21 Thread Matthew Barnett
Matthew Barnett <[EMAIL PROTECTED]> added the comment: regex_2.6rc2+2.diff is a bugfix for capture groups in look-behinds. Added file: http://bugs.python.org/file11552/regex_2.6rc2+2.diff

[issue3825] Major reworking of Python 2.5.2 re module

2008-09-21 Thread Matthew Barnett
Matthew Barnett <[EMAIL PROTECTED]> added the comment: Fixed the matching of word boundaries when searching and matching in substrings. Added file: http://bugs.python.org/file11543/regex_2.6rc2+1.diff

[issue3825] Major reworking of Python 2.5.2 re module

2008-09-20 Thread Matthew Barnett
Matthew Barnett <[EMAIL PROTECTED]> added the comment: Bugfix. Added file: http://bugs.python.org/file11532/regex_2.6rc2.diff

[issue3825] Major reworking of Python 2.5.2 re module

2008-09-20 Thread Matthew Barnett
Changes by Matthew Barnett <[EMAIL PROTECTED]>: Removed file: http://bugs.python.org/file11530/regex_2.6rc2.diff

[issue3825] Major reworking of Python 2.5.2 re module

2008-09-20 Thread Matthew Barnett
Matthew Barnett <[EMAIL PROTECTED]> added the comment: This patch is now based on Python 2.6rc2. I've reduced the number of macros and used functions instead, provided that it didn't cost much in terms of speed. In many cases it should be faster than the current release, and at worst no slower.

[issue3825] Major reworking of Python 2.5.2 re module

2008-09-15 Thread Jeffrey C. Jacobs
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment: I have uploaded my test cases for Atomic Grouping / Possessive Qualifier, which is the common code we seem to have developed, as this may be of use to you. I also have documentation, but for now, would you mind running these tests against

[issue3825] Major reworking of Python 2.5.2 re module

2008-09-15 Thread Matthew Barnett
Matthew Barnett <[EMAIL PROTECTED]> added the comment: _sre.c is over 6000, but it does contain macros. I didn't have this problem when based on Python 2.5.2 in Express 2005.

[issue3825] Major reworking of Python 2.5.2 re module

2008-09-15 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc <[EMAIL PROTECTED]> added the comment: Do you have some big source files, of more than 1 lines?

[issue3825] Major reworking of Python 2.5.2 re module

2008-09-15 Thread Matthew Barnett
Matthew Barnett <[EMAIL PROTECTED]> added the comment: Used Visual C++ Express 2005 and the PC\VS8.0 directory. Same problem.

[issue3825] Major reworking of Python 2.5.2 re module

2008-09-15 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc <[EMAIL PROTECTED]> added the comment: If you use Visual C++ Express 2005, you can build python from the PC\VS8.0 directory.

[issue3825] Major reworking of Python 2.5.2 re module

2008-09-15 Thread Matthew Barnett
Matthew Barnett <[EMAIL PROTECTED]> added the comment: I know what you mean about the dependencies! My current problem is that now I'm working with the current trunk, which means using Visual C++ Express 2008 instead of 2005. When debugging it's behaving like the debug info is out of date (showi

[issue3825] Major reworking of Python 2.5.2 re module

2008-09-15 Thread Jeffrey C. Jacobs
Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment: Well, I implemented this months ago, but have been busy with other things so I haven't updated in a while. I noticed that the current version is missing my patches for Atomic Grouping / Possessive Qualifiers and a number of other patches I

[issue3825] Major reworking of Python 2.5.2 re module

2008-09-14 Thread Giampaolo Rodola'
Changes by Giampaolo Rodola' <[EMAIL PROTECTED]>: -- nosy: +giampaolo.rodola

[issue3825] Major reworking of Python 2.5.2 re module

2008-09-13 Thread Fredrik Lundh
Fredrik Lundh <[EMAIL PROTECTED]> added the comment: A bit more information on the changes to the core engine that are responsible for the 2x speedup (on what?) would be nice to have, I think (especially since you seem to have removed the KMP prefix scanner). (Isn't there a RE benchmark suite so

[issue3825] Major reworking of Python 2.5.2 re module

2008-09-13 Thread Benjamin Peterson
Changes by Benjamin Peterson <[EMAIL PROTECTED]>: -- nosy: -benjamin.peterson

[issue3825] Major reworking of Python 2.5.2 re module

2008-09-13 Thread Matthew Barnett
Matthew Barnett <[EMAIL PROTECTED]> added the comment: Corrected the diff file, again. :-( The atomic groups and possessive quantifiers are as described at http://www.regular-expressions.info. Added file: http://bugs.python.org/file11484/regex_2.5.2.diff

[issue3825] Major reworking of Python 2.5.2 re module

2008-09-13 Thread Matthew Barnett
Changes by Matthew Barnett <[EMAIL PROTECTED]>: Removed file: http://bugs.python.org/file11451/regex_2.5.2.diff

[issue3825] Major reworking of Python 2.5.2 re module

2008-09-13 Thread Antoine Pitrou
Antoine Pitrou <[EMAIL PROTECTED]> added the comment: By the way, the patch must be pretty incomplete, since there are almost no changes to _sre.c. Am I missing something?

[issue3825] Major reworking of Python 2.5.2 re module

2008-09-12 Thread Terry J. Reedy
Terry J. Reedy <[EMAIL PROTECTED]> added the comment: Atomic groups and possessive quantifiers appear to be relatively new: http://en.wikipedia.org/wiki/Regular_expressions for instance, has no mention of either that I found. http://www.regular-expressions.info/atomic.html http://www.regular-exp

[issue3825] Major reworking of Python 2.5.2 re module

2008-09-10 Thread Matthew Barnett
Matthew Barnett <[EMAIL PROTECTED]> added the comment: This is different work from a different author than #2636. I've submitted what I've done so far in case my computer gets hit by a bus. :-) I still have more work to do on it, so I'm not concerned that it might not get any attention for a whil

[issue3825] Major reworking of Python 2.5.2 re module

2008-09-10 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc <[EMAIL PROTECTED]> added the comment: The correct link is #2636. Is it the same work? -- nosy: +amaury.forgeotdarc

[issue3825] Major reworking of Python 2.5.2 re module

2008-09-09 Thread Matthew Barnett
Matthew Barnett <[EMAIL PROTECTED]> added the comment: Corrected the diff file. I worked from Python 2.5.2 because that's what I'm currently using. I'll work from the trunk in future. Added file: http://bugs.python.org/file11451/regex_2.5.2.diff

[issue3825] Major reworking of Python 2.5.2 re module

2008-09-09 Thread Matthew Barnett
Changes by Matthew Barnett <[EMAIL PROTECTED]>: Removed file: http://bugs.python.org/file11447/regex_2.5.2.diff

[issue3825] Major reworking of Python 2.5.2 re module

2008-09-09 Thread Gregory P. Smith
Gregory P. Smith <[EMAIL PROTECTED]> added the comment: weird typo: s/f lea/formats/

[issue3825] Major reworking of Python 2.5.2 re module

2008-09-09 Thread Gregory P. Smith
Gregory P. Smith <[EMAIL PROTECTED]> added the comment: This sounds really neat, but as Antoine said, it'll be several weeks before any of us can give this serious attention. Definitely update to trunk and base your work off of that. Quick comments: Your _sre.c diff appears to remove and rep

[issue3825] Major reworking of Python 2.5.2 re module

2008-09-09 Thread Antoine Pitrou
Antoine Pitrou <[EMAIL PROTECTED]> added the comment: Hi, This looks impressive. You should really work from the current SVN trunk, not from the 2.5 sources, as there were some additions to the re module (mainly, a bytecode verifier contributed by Google). Also, if it can be split into several

[issue3825] Major reworking of Python 2.5.2 re module

2008-09-09 Thread Benjamin Peterson
Benjamin Peterson <[EMAIL PROTECTED]> added the comment: Very interesting! Have you seen #3626? Another thing to note is that this will have to wait for 2.7 before it could potentially be integrated into the trunk. -- nosy: +benjamin.peterson versions: +Python 2.7 -Python 2.5

[issue3825] Major reworking of Python 2.5.2 re module

2008-09-09 Thread Matthew Barnett
New submission from Matthew Barnett <[EMAIL PROTECTED]>: This is a major reworking of the re module in Python 2.5.2. Added atomic groups. Added possessive quantifiers. Lookbehinds can now be variable length. Typically x2 faster. More changes to follow. -- components: Regular Expression
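As a rough illustration of what atomic groups and possessive quantifiers change, the sketch below compares them with an ordinary greedy quantifier. It relies on the (?>...) and ++ syntax that the standard-library re module accepts from Python 3.11 onwards, not on the patch itself. That the patch uses the same syntax is an assumption, and the function name backtracking_demo is hypothetical. On older interpreters only the greedy case is run.

```python
import re
import sys


def backtracking_demo(text="aaa"):
    """Report whether each pattern matches `text` from its start."""
    # Greedy a+ first consumes every 'a', then gives one back so the
    # trailing 'a' in the pattern can still match.
    results = {"greedy": re.match(r"a+a", text) is not None}

    # Possessive quantifiers and atomic groups never give characters back,
    # so nothing is left for the trailing 'a' and the match fails.
    # The stdlib re module only accepts this syntax from Python 3.11.
    if sys.version_info >= (3, 11):
        results["possessive"] = re.match(r"a++a", text) is not None
        results["atomic"] = re.match(r"(?>a+)a", text) is not None
    return results


print(backtracking_demo())
```

The failing matches are the point: giving up backtracking is what lets such patterns reject input quickly instead of retrying every shorter repetition.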