[issue1160] Medium size regexp crashes python

2009-11-13 Thread Michael K Johnson

Michael K Johnson added the comment:

I also ran into this issue, and dealt with it as suggested here by
changing to sets.  Because I have underlying code that has to deal both
with small hand-crafted regular expressions and arbitrarily-large
machine-generated sets of paths, I subclassed set and implemented the
match and search methods in my subclass so that the underlying code
would work both against the hand-generated regular expressions and the
machine-generated sets of paths.  Hope this helps someone else who runs
into this restriction.
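
A minimal sketch of the approach described above, not the poster's actual
code. The class name ExactMatchSet and the helper accepts() are
hypothetical, and the sketch assumes callers only test the result of
match() or search() for truth and never call methods such as group() on it.

```python
import re


class ExactMatchSet(set):
    """A set of exact strings exposing a small part of the regex API.

    Hypothetical name. match() and search() return a plain bool rather
    than a real match object, which is enough for truth-testing callers.
    """

    def match(self, candidate):
        # Set membership stands in for matching a huge alternation of
        # literal strings, without compiling any pattern at all.
        return candidate in self

    # When every entry is a complete string, search() can share the
    # same membership test.
    search = match


def accepts(matcher, candidate):
    # Works with a compiled pattern or an ExactMatchSet alike.
    return bool(matcher.match(candidate))


# Machine-generated exact paths next to a small hand-written pattern.
generated = ExactMatchSet('/srv/data/%d' % x for x in range(100000))
hand_written = re.compile(r'/etc/.*\.conf$')
```

Because both objects answer match(), the calling code does not need to
know which kind it was handed.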

--
nosy: +johnsonm

___
Python tracker 
<http://bugs.python.org/issue1160>
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1160] Medium size regexp crashes python

2009-11-15 Thread Michael K Johnson

Michael K Johnson added the comment:

The test case at the top of this issue reproduces just fine; if you are
looking for a different test case you'll have to specify what you don't
like about it so that it's clear what you are looking for.

I don't think there's any mystery about this issue; it seems perfectly
well understood.  I commented merely to encourage others who run into
this issue to consider one way of using sets if they are running into
the same case I was, in which I was trying to use a regular expression
to match a candidate string against a large set of exact matches.

I was doing this because the interface I was working with was originally
designed to accept small, hand-specified regular expressions.  It was
later wrapped in code that generates regular expressions automatically,
while still being used directly with hand-crafted ones.  That's why the
interface was not originally crafted to use sets, and why it was not
appropriate to simply change the interface to use sets.  However, my
interface also allows passing a callable which resolves the object at
the time of use, and so I merely passed a reference to a method which
returned an object derived from set but which implemented the match and
search methods.
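
The callable hook mentioned above might look roughly like this. It is a
sketch under assumptions: PathSet, resolve(), is_allowed() and
load_paths() are hypothetical names, and the real interface is not shown
in this thread.

```python
import re


class PathSet(set):
    """Hypothetical set subclass with a regex-like match()."""

    def match(self, candidate):
        return candidate in self


def resolve(matcher):
    """Return an object with match(), calling `matcher` first if needed."""
    # Compiled patterns are not callable, so they pass through untouched;
    # a function or bound method is called at the time of use.
    return matcher() if callable(matcher) else matcher


def is_allowed(matcher, candidate):
    return bool(resolve(matcher).match(candidate))


def load_paths():
    # Stand-in for the machine-generated list, built only when requested.
    return PathSet('/srv/data/%d' % x for x in range(7000))
```

Passing load_paths itself (not its result) defers building the set until
the interface actually needs it.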

If you REALLY want a simpler reproducer, this does it for me in the
restricted case (i.e., not using UCS4 encoding):
 import re
 r = re.compile('|'.join(('%d'%x for x in range(7000))))

But I really don't think that additional test cases are a barrier here.

Again, my goal was merely to suggest an easy way to use sets as a
replacement for regexps, for machine-generated regexps intended to match
against exact strings; subclass set and add necessary methods such as
search and/or match.

--




[issue6254] tarfile unnecessarily requires seekable files

2009-06-10 Thread Michael K Johnson

New submission from Michael K Johnson:

In Python 2.6 (not 2.4; I haven't checked 2.5), the __init__() method of
the TarFile class calls the tell() method on the tar file, which doesn't
work if you are reading from standard input or writing to standard
output, two very reasonable things to do with a tar file.

While there are cases where it is logical to seek within a tar file,
supporting those cases should not preclude the normal design case for
tar archives of streaming reads/writes, including tar files being
streamed between processes via pipes.  If the tell() method is not
implemented for the file object, then the seek() method of TarFile (and
any other methods that can be implemented only for seekable files) can
raise a reasonable exception.  Note that this also means that the next()
method should not need to seek() for non-seekable files; it should
assume that it is at the correct block and read from there.

--
components: Library (Lib)
messages: 89206
nosy: johnsonm
severity: normal
status: open
title: tarfile unnecessarily requires seekable files
type: behavior
versions: Python 2.6

___
Python tracker 
<http://bugs.python.org/issue6254>
___



[issue6254] tarfile unnecessarily requires seekable files

2009-06-10 Thread Michael K Johnson

Michael K Johnson added the comment:

We are doing output, and mode='w|' works.  We were using
tarfile.TarFile, not realizing that the default constructor was an
unsupported and deprecated interface (!?!)
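
For anyone landing here with the same problem, a minimal sketch of stream
mode through tarfile.open().  It assumes Python 3 rather than the 2.6
discussed in this report.  WriteOnlySink and ReadOnlySource are
hypothetical stand-ins for a pipe such as standard output or standard
input; they deliberately have no seek() or tell().

```python
import io
import tarfile


class WriteOnlySink:
    """Hypothetical pipe stand-in: offers write() but no seek() or tell()."""

    def __init__(self):
        self._buffer = io.BytesIO()

    def write(self, data):
        return self._buffer.write(data)

    def getvalue(self):
        return self._buffer.getvalue()


class ReadOnlySource:
    """Hypothetical pipe stand-in: offers read() but no seek() or tell()."""

    def __init__(self, data):
        self._buffer = io.BytesIO(data)

    def read(self, size=-1):
        return self._buffer.read(size)


def write_stream(sink, members):
    # 'w|' selects uncompressed stream mode, which only writes sequentially.
    with tarfile.open(fileobj=sink, mode='w|') as tar:
        for name, payload in members.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))


def read_stream(source):
    # 'r|' hands out members strictly in order, so each one is read
    # before moving on to the next.
    contents = {}
    with tarfile.open(fileobj=source, mode='r|') as tar:
        for member in tar:
            contents[member.name] = tar.extractfile(member).read()
    return contents
```

The same calls should work with sys.stdout.buffer or sys.stdin.buffer in
place of the stand-in objects.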

--
status: open -> closed




[issue6254] tarfile unnecessarily requires seekable files

2009-06-11 Thread Michael K Johnson

Michael K Johnson added the comment:

OK, not intended for "everyday use"; I understand this as meaning that
it is considered primarily an internal interface, and thus one that has
an explicitly unstable API.  It was hard for me to guess that this would
be the case, since this intent is not documented in the docstrings or
comments of either TarFile.__init__() or TarFile.open().

If I'm understanding you correctly, this could be considered a
documentation bug; perhaps the docstring for TarFile.__init__() could
suggest using the open() method, except possibly within TarFile subclasses?

Sorry to be so confused here.  I hope I'm finally converging on
understanding...

Anyway, thanks for the help!

--
