[issue1160] Medium size regexp crashes python
Michael K Johnson added the comment:

I also ran into this issue, and dealt with it as suggested here by changing to sets. Because I have underlying code that has to deal both with small hand-crafted regular expressions and arbitrarily-large machine-generated sets of paths, I subclassed set and implemented the match and search methods in my subclass so that the underlying code would work both against the hand-generated regular expressions and the machine-generated sets of paths.

Hope this helps someone else who runs into this restriction.

--
nosy: +johnsonm

___
Python tracker <http://bugs.python.org/issue1160>
___
___
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
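The code behind this comment was not posted, so the following is only a minimal sketch of the idea. The class name ExactMatchSet, the helper is_selected, and the sample paths are all hypothetical. It assumes the calling code only tests the truthiness of match() or search() and that the machine-generated patterns are meant as whole-string exact matches, so both methods reduce to a membership test.

```python
import re


class ExactMatchSet(set):
    """A set of exact strings with regex-like match() and search() methods.

    Assumption: callers only check whether the result is truthy, so no
    real match object is returned.
    """

    def match(self, candidate):
        # Mimic re's convention of returning None when nothing matches.
        return True if candidate in self else None

    def search(self, candidate):
        # For whole-string exact matching, search() is the same test.
        return self.match(candidate)


def is_selected(matcher, candidate):
    # Hypothetical calling code: works with a compiled regex or the set.
    return bool(matcher.match(candidate))


paths = ExactMatchSet(["/usr/bin/python", "/etc/passwd"])
hand_written = re.compile(r"/tmp/.*\.log$")
matchers = [paths, hand_written]

print([is_selected(m, "/etc/passwd") for m in matchers])
print([is_selected(m, "/tmp/a.log") for m in matchers])
```

Membership in a set is a hash lookup, so the cost does not grow with the number of paths the way a large alternation does, and the set never goes through the regex compiler at all.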
[issue1160] Medium size regexp crashes python
Michael K Johnson added the comment:

The test case at the top of this issue reproduces just fine; if you are looking for a different test case, you'll have to specify what you don't like about it so that it's clear what you are looking for. I don't think there's any mystery about this issue; it seems perfectly well understood.

I commented merely to encourage others who run into this issue to consider using sets if they are in the same situation I was: using a regular expression to match a candidate string against a large set of exact matches. I was doing this because the interface I was working with was originally meant to accept small, hand-specified regular expressions; it was later also wrapped in code that generates regular expressions automatically, while still being used with hand-crafted ones. That's why the interface was not originally designed to use sets, and why it was not appropriate to simply change it to use sets. However, my interface also allows passing a callable which resolves the object at the time of use, so I merely passed a reference to a method which returned an object derived from set that implements the match and search methods.

If you REALLY want a simpler reproducer, this does it for me in the restricted case (i.e., not using UCS4 encoding):

    import re
    r = re.compile('|'.join(('%d'%x for x in range(7000))))

But I really don't think that additional test cases are a barrier here. Again, my goal was merely to suggest an easy way to use sets as a replacement for regexps, for machine-generated regexps intended to match against exact strings: subclass set and add the necessary methods, such as search and/or match.

--
[issue6254] tarfile unnecessarily requires seekable files
New submission from Michael K Johnson:

In Python 2.6 (not 2.4; I haven't checked 2.5), the __init__() method of the TarFile class calls the tell() method on the tar file, which doesn't work if you are reading from standard input or writing to standard output, two very reasonable things to do with a tar file.

While there are cases where it is logical to seek within a tar file, supporting those cases should not preclude the normal design case for tar archives: streaming reads and writes, including tar files being streamed between processes via pipes.

If the tell() method is not implemented for the file object, then the seek() method of TarFile (and any other methods that can be implemented only for seekable files) can raise a reasonable exception. Note that this also means that the next() method should not need to seek() for non-seekable files; it should assume that it is at the correct block and read from there.

--
components: Library (Lib)
messages: 89206
nosy: johnsonm
severity: normal
status: open
title: tarfile unnecessarily requires seekable files
type: behavior
versions: Python 2.6

___
Python tracker <http://bugs.python.org/issue6254>
___
[issue6254] tarfile unnecessarily requires seekable files
Michael K Johnson added the comment:

We are doing output, and mode='w|' works. We were using tarfile.TarFile, not realizing that the default constructor was an unsupported and deprecated interface (!?!)

--
status: open -> closed
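For readers landing on this thread, here is a minimal sketch of the mode='w|' approach mentioned above, written for a current Python 3. The WriteOnlyStream class, the write_tar_stream helper, and the member name are hypothetical stand-ins; in real use the target would be something like a pipe or sys.stdout.buffer, which this sketch imitates with an object that offers write() and nothing else.

```python
import io
import tarfile


class WriteOnlyStream:
    """Stand-in for a pipe: provides write() only, no seek() or tell()."""

    def __init__(self):
        self._chunks = []

    def write(self, data):
        self._chunks.append(bytes(data))
        return len(data)

    def getvalue(self):
        return b"".join(self._chunks)


def write_tar_stream(out, members):
    """Write a {name: bytes} mapping to `out` as an uncompressed tar stream."""
    # The "|" in the mode asks tarfile.open() for sequential stream access.
    with tarfile.open(fileobj=out, mode="w|") as tar:
        for name, payload in members.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))


sink = WriteOnlyStream()
write_tar_stream(sink, {"hello.txt": b"hello, world\n"})
archive_bytes = sink.getvalue()
```

Because the stream object never sees a seek() or tell() call, the same helper should work unchanged when `out` is a pipe between processes.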
[issue6254] tarfile unnecessarily requires seekable files
Michael K Johnson added the comment:

OK, not intended for "everyday use"; I understand this as meaning that it is considered primarily an internal interface, and thus one that has an explicitly unstable API. It is hard for me to guess that this would be the case, since this intent is not documented in the docstrings or comments of either TarFile.__init__() or TarFile.open().

If I'm understanding you correctly, this could be considered a documentation bug; perhaps the docstring for TarFile.__init__() could suggest using the open() method, except possibly within TarFile subclasses?

Sorry to be so confused here. I hope I'm finally converging on understanding... Anyway, thanks for the help!

--
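As an illustration of going through open() rather than the constructor, the sketch below reads an archive sequentially with mode='r|' on a current Python 3. ReadOnlyStream, build_archive, read_tar_stream, and the member name are hypothetical; the read-only wrapper stands in for standard input or a pipe, which cannot seek.

```python
import io
import tarfile


class ReadOnlyStream:
    """Stand-in for a pipe or stdin: provides read() only."""

    def __init__(self, data):
        self._buf = io.BytesIO(data)

    def read(self, size=-1):
        return self._buf.read(size)


def build_archive():
    """Create a small in-memory tar archive to feed the reader."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        payload = b"streamed\n"
        info = tarfile.TarInfo("note.txt")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def read_tar_stream(source):
    """Read every regular file from a non-seekable tar stream, in order."""
    contents = {}
    # "r|" reads the archive strictly front to back, one member at a time.
    with tarfile.open(fileobj=source, mode="r|") as tar:
        for member in tar:
            if member.isfile():
                contents[member.name] = tar.extractfile(member).read()
    return contents


contents = read_tar_stream(ReadOnlyStream(build_archive()))
print(contents)
```

In stream mode each member has to be consumed while iterating, since the reader cannot go back to an earlier position in the archive.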