Re: [Python-Dev] Should we move to replace re with regex?

2011-08-27 Thread Martin v. Löwis
Am 27.08.2011 08:33, schrieb Terry Reedy:
> On 8/26/2011 9:56 PM, Antoine Pitrou wrote:
> 
>> Another "interesting" question is whether it's easy to port to the PEP
>> 393 string representation, if it gets accepted.
> 
> Will the re module need porting also?

That's a quality-of-implementation issue (in both cases). In principle,
the modules should continue to work unmodified, and indeed SRE does.
However, the module will then match on Py_UNICODE, which may be
expensive to produce, and may not meet your expectations of surrogate
pair handling.

So realistically, the module should be ported, which has the challenge
that matching needs to operate on three different representations. The
modules already support two representations (unsigned char and
Py_UNICODE), but probably switching on type, not on state.
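
For readers unfamiliar with the three representations mentioned above, here is a small sketch. It assumes a CPython build that implements PEP 393 (3.3 or later; the PEP was still only a proposal when this message was written). The helper name `bytes_per_char` is made up for illustration, and the numbers come from `sys.getsizeof`, so they are CPython implementation details rather than language guarantees.

```python
import sys

def bytes_per_char(ch, n=1000):
    """Estimate storage per character by growing a string of one repeated character."""
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) / n

latin1 = bytes_per_char("a")           # all code points below 256
ucs2 = bytes_per_char("\u0100")        # all code points below 65536
ucs4 = bytes_per_char("\U00010000")    # at least one astral code point

print(latin1, ucs2, ucs4)

# Under PEP 393 an astral character is one code point, not a surrogate pair.
astral_len = len("\U00010000")
print(astral_len)
```

A matcher ported to this scheme would have to handle all three widths, which is the porting challenge described above.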

Regards,
Martin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 393 Summer of Code Project

2011-08-27 Thread Steven D'Aprano

Terry Reedy wrote:
> On 8/26/2011 8:23 PM, Antoine Pitrou wrote:
>
>>> I would only agree as long as it wasn't too much worse
>>> than O(1). O(log n) might be all right, but O(n) would be
>>> unacceptable, I think.
>>
>> It also depends a lot on *actual* measured performance
>
> Amen. Some regard O(n*n) sorts to be, by definition, 'worse' than
> O(n*logn). I even read that in an otherwise good book by a university
> professor. Fortunately for Python users, Tim Peters ignored that
> 'wisdom', coded the best O(n*n) sort he could, and then *measured* to
> find out what was better for what types and lengths of arrays. So now we
> have a list.sort that sometimes beats the pure O(n*logn) quicksort of C
> libraries.


A nice story, but Quicksort's worst case is O(n*n) too.

http://en.wikipedia.org/wiki/Quicksort

timsort is O(n) in the best case (all items already in order).

You are right though about Tim Peters doing extensive measurements:

http://bugs.python.org/file4451/timsort.txt

If you haven't read the whole thing, do so. I am in awe -- not just 
because he came up with the algorithm, but because of the discipline Tim 
demonstrated in such detailed testing. A far cry from a couple of timeit 
runs on short-ish lists.
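
The best-case claim above can be checked in a rough way by counting comparisons rather than timing. This is only a sketch: the `Counted` wrapper and the `comparisons` helper are illustrative names, and the exact counts depend on the CPython version in use.

```python
import random

class Counted:
    """Wrap a value and count every '<' comparison made by the sort."""
    calls = 0

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        Counted.calls += 1
        return self.value < other.value

def comparisons(values):
    """Return how many comparisons sorted() needs for the given values."""
    Counted.calls = 0
    sorted(Counted(v) for v in values)
    return Counted.calls

n = 1000
ordered = list(range(n))
shuffled = ordered[:]
random.Random(0).shuffle(shuffled)

in_order = comparisons(ordered)    # already sorted input
mixed = comparisons(shuffled)      # randomly ordered input
print(in_order, mixed)
```

On already-sorted input the count should stay close to n, while the shuffled input needs several times as many comparisons.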




--
Steven



Re: [Python-Dev] PEP 393 Summer of Code Project

2011-08-27 Thread Martin v. Löwis
Am 27.08.2011 09:40, schrieb Steven D'Aprano:
> Terry Reedy wrote:
>> On 8/26/2011 8:23 PM, Antoine Pitrou wrote:
>>
>>>> I would only agree as long as it wasn't too much worse
>>>> than O(1). O(log n) might be all right, but O(n) would be
>>>> unacceptable, I think.
>>>
>>> It also depends a lot on *actual* measured performance
>>
>> Amen. Some regard O(n*n) sorts to be, by definition, 'worse' than
>> O(n*logn). I even read that in an otherwise good book by a university
>> professor. Fortunately for Python users, Tim Peters ignored that
>> 'wisdom', coded the best O(n*n) sort he could, and then *measured* to
>> find out what was better for what types and lengths of arrays. So now
>> we have a list.sort that sometimes beats the pure O(n*logn) quicksort of
>> C libraries.
> 
> A nice story, but Quicksort's worst case is O(n*n) too.

In addition, timsort is O(n log n), which also makes it a real good
O(n*n) sort :-)

Regards,
Martin


Re: [Python-Dev] Should we move to replace re with regex?

2011-08-27 Thread Nick Coghlan
On Sat, Aug 27, 2011 at 4:01 PM, Dan Stromberg  wrote:
> You're talking technically, which is important, but wasn't what I was
> suggesting would be helped.
>
> Politically, and from a marketing standpoint, it's easier to withdraw a
> feature you've given with a "Play with this, see if it works for you"
> warning.

The standard library isn't for playing. "pip install regex" is for
playing. If we aren't sure we want to make the transition, then it
doesn't go in.

However, to my mind, reviewing and incorporating regex is a far more
feasible model than trying to enhance the existing re module with a
comparable feature set. At the moment, there's already an obvious way
to get enhanced regex support in Python: install regex and use it
instead of the standard library's re module. That's enough to pretty
much kill any motivation anyone might have to make major changes to re
itself.

We're at least getting one thing right this time that we got wrong
with multiprocessing, though - we're much, much further out from the
3.3 release than we were from the 2.6 release when multiprocessing was
added to the standard library :)

The next step needed is for someone to volunteer to write and champion
a PEP that:
- articulates the deficiencies in the current re module (the regex
docs already cover some of this, as do Tom Christiansen's notes on the
issue tracker)
- explains why upgrading re in place is not feasible (e.g. noting that
the availability of regex really limits the desire for anyone to
reinvent that particular wheel, so even things that are theoretically
possible may be highly unlikely in practice)
- proposes a transition plan (personally, I'd be fine with an optparse
-> argparse style transition where re remains around indefinitely to
support legacy code, but new users are pointed towards regex. But
depending on compatibility details, merging the two APIs in the
existing re namespace may also be feasible)
- proposes a maintenance strategy (I don't know how much Matthew has
written regarding internal design details, but that kind of thing
could really help. Matthew agreeing to continue maintenance as part of
the standard library would also help a great deal, but wouldn't be
enough on its own - while it's good for modules to have active
maintainers to make the final call on associated design decisions, it's
potentially problematic when other core developers don't understand
what the code is doing well enough to fix bugs in it)
- confirms that the regex test suite can be incorporated cleanly into
the standard library regression test suite (the difficulty of this was
something that was underestimated for the inclusion of
multiprocessing. Test suite integration is also the final sticking
point holding up the PEP 380 'yield from' patch, although that's close
to being resolved following the PyConAU sprints)
- documents tests conducted (e.g. micro-benchmark results, fusil results)

PEP 371 (addition of multiprocessing), PEP 389 (addition of argparse)
and Jesse's reflections on the way multiprocessing was added
(http://jessenoller.com/2009/01/28/multiprocessing-in-hindsight/) are
well worth reading for anyone considering stepping up to write a PEP.
That last also highlights why even Matthew's support, however capably
he has handled maintenance of regex as an independent project,
wouldn't be enough - we had Richard Oudkerk's support and agreement to
continue maintenance as the original author of multiprocessing, but he
became unavailable early in the integration process. If Jesse hadn't
been able to take up most of that slack, the likely result would have
been reversion of the changes and removal of multiprocessing from the
2.6 release.

Writing PEPs can be quite a frustrating experience (since a lot of
feedback will be negative as people try to poke holes in the idea to
see if it stands up to close scrutiny), but it's also really
satisfying and rewarding if they end up getting accepted and
incorporated :)

>> Have there been any __future__ features that were added provisionally?
>
> I can't either, but ISTR hearing that from __future__ import was started
> with such an intent.  Irrespective, it's hard to import something from
> "future" without at least suspecting that you're on the bleeding edge.

No, we make an explicit guarantee that future imports will never go
away once they've been added. They may become redundant, but they
won't break. There's no provision in the future mechanism for changes
that are added and then later removed (see
http://docs.python.org/dev/library/__future__).

They're strictly for cases where backwards incompatibilities (usually,
but not always, new keywords) may break existing code.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] issue 6721 "Locks in python standard library should be sanitized on fork"

2011-08-27 Thread Ask Solem

On 26 Aug 2011, at 16:53, Antoine Pitrou wrote:

> 
> Hi,
> 
>> I think that "deprecating" the use of threads w/ multiprocessing - or
>> at least crippling it is the wrong answer. Multiprocessing needs the
>> helper threads it uses internally to manage queues, etc. Removing that
>> ability would require a near-total rewrite, which is just a
>> non-starter.
> 
> I agree that this wouldn't actually benefit anyone.
> Besides, I don't think it's even possible to avoid threads in
> multiprocessing, given the various constraints. We would have to force
> the user to run their main thread in an event loop, and that would be
> twisted (tm).
> 
>> I would focus on the atfork() patch more directly, ignoring
>> multiprocessing in the discussion, and focusing on the merits of gps'
>> initial proposal and patch.
> 
> I think this could also be combined with Charles-François' patch.
> 
> Regards



Have to agree with Jesse and Antoine here.

Celery (celeryproject.org) uses multiprocessing, is widely used in production,
and is regarded as stable software that has been known to run for months at a
time, only to be restarted for software upgrades.

I have been investigating an issue for some time that I'm pretty sure is caused
by this.  It occurs so rarely that I have not had any actual bug reports
about it; it's just something I have experienced during extensive testing.
The tone of the discussion on the bug tracker makes me think that I have
been very lucky :-)

Using the fork+exec approach seems like a much more realistic solution
than rewriting multiprocessing.Pool and Manager to not use threads. In fact
this is something I have been considering as a fix for the suspected
issue for some time.
It does have implications that are annoying for sure, but we are already
used to this on the Windows platform (it could help portability even).

-- 
Ask Solem
twitter.com/asksol | +44 (0)7713357179



Re: [Python-Dev] Should we move to replace re with regex?

2011-08-27 Thread Antoine Pitrou
On Sat, 27 Aug 2011 09:18:14 +0200
"Martin v. Löwis"  wrote:
> Am 27.08.2011 08:33, schrieb Terry Reedy:
> > On 8/26/2011 9:56 PM, Antoine Pitrou wrote:
> > 
> >> Another "interesting" question is whether it's easy to port to the PEP
> >> 393 string representation, if it gets accepted.
> > 
> > Will the re module need porting also?
> 
> That's a quality-of-implementation issue (in both cases). In principle,
> the modules should continue to work unmodified, and indeed SRE does.
> However, the module will then match on Py_UNICODE, which may be
> expensive to produce, and may not meet your expectations of surrogate
> pair handling.
> 
> So realistically, the module should be ported, which has the challenge
> that matching needs to operate on three different representations. The
> modules already support two representations (unsigned char and
> Py_UNICODE), but probably switching on type, not on state.

From what I've seen, re generates two different sets of functions at
compile-time (with a stringlib-like approach), while regex has a
run-time flag to choose between the two representations (where,
interestingly, the two code paths are explicitly spelled out, almost
duplicates of each other).
Matthew, please correct me if I'm wrong.

Regards

Antoine.




Re: [Python-Dev] Should we move to replace re with regex?

2011-08-27 Thread Antoine Pitrou
On Sat, 27 Aug 2011 08:02:31 +0200
"Martin v. Löwis"  wrote:
> > I'm not sure it's worth doing an extensive review of the code, a better
> > approach might be to require extensive test coverage  (and a review of
> > tests).
> 
> I think it's worth it. It's really bad if only one developer fully
> understands the regex implementation.

Could such a review be the topic of an informational PEP?

Regards

Antoine.




[Python-Dev] Software Transactional Memory for Python

2011-08-27 Thread Armin Rigo
Hi all,

About multithreading models: I recently made an observation which
might be obvious to some, but not to me, and as far as I know not to
most of us either.  I think that it's worth being pointed out :-)

http://mail.python.org/pipermail/pypy-dev/2011-August/008153.html


A bientôt,

Armin.


Re: [Python-Dev] Should we move to replace re with regex?

2011-08-27 Thread exarkun

On 26 Aug, 09:45 pm, gu...@python.org wrote:

> I just made a pass of all the Unicode-related bugs filed by Tom
> Christiansen, and found that in several, the response was "this is
> fixed in the regex module [by Matthew Barnett]". I started replying
> that I thought that we should fix the bugs in the re module (i.e.,
> really in _sre.c) but on second thought I wonder if maybe regex is
> mature enough to replace re in Python 3.3. It would mean that we won't
> fix any of these bugs in earlier Python versions, but I could live
> with that.
>
> However, I don't know much about regex -- how compatible is it, how
> fast is it (including extreme cases where the backtracking goes
> crazy), how bug-free is it, and so on. Plus, how much work would it be
> to actually incorporate it into CPython as a complete drop-in
> replacement of the re package (such that nobody needs to change their
> imports or the flags they pass to the re module).
>
> We'd also probably have to train some core developers to be familiar
> enough with the code to maintain and evolve it -- I assume we can't
> just volunteer Matthew to do so forever... :-)
>
> What's the alternative? Is adding the requested bug fixes and new
> features to _sre.c really that hard?


What about other Python implementations (i.e., PEP 399)?  For this to be
seriously considered, shouldn't there also be a pure Python 
implementation of the functionality?


Jean-Paul


Re: [Python-Dev] Software Transactional Memory for Python

2011-08-27 Thread Nick Coghlan
On Sat, Aug 27, 2011 at 8:45 PM, Armin Rigo  wrote:
> Hi all,
>
> About multithreading models: I recently made an observation which
> might be obvious to some, but not to me, and as far as I know not to
> most of us either.  I think that it's worth being pointed out :-)
>
> http://mail.python.org/pipermail/pypy-dev/2011-August/008153.html

Having a context manager to say "don't release the GIL" for a bit
could actually be really nice (e.g. for implementing builtin-style
method semantics for data types written in Python).

However, two immediate questions come to mind:

1. How does the patch interact with C code that explicitly releases
the GIL? (e.g. IO commands inside a "with atomic:" block)
2. Whether or not Jython and IronPython could implement something like
that, since they're free threaded with fine-grained locks. If they
can't then I don't see how we could justify making it part of the
standard library.

Interesting idea, though :)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] Software Transactional Memory for Python

2011-08-27 Thread Armin Rigo
Hi Nick,

On Sat, Aug 27, 2011 at 2:40 PM, Nick Coghlan  wrote:
> 1. How does the patch interact with C code that explicitly releases
> the GIL? (e.g. IO commands inside a "with atomic:" block)

As implemented, any code in a "with atomic" is prevented from
explicitly releasing and reacquiring the GIL: the GIL remains acquired
until the end of the "with" block.  In other words
Py_BEGIN_ALLOW_THREADS has no effect in a "with" block.  This gives
semantics that, in a full multi-core STM world, would be implementable
by saying that if, in the middle of a transaction, you need to do I/O,
then from this point onwards the transaction is not allowed to abort
any more.  Such "inevitable" transactions are already supported e.g.
by RSTM, the C++ framework I used to prototype a C version
(https://bitbucket.org/arigo/arigo/raw/default/hack/stm/c ).

> 2. Whether or not Jython and IronPython could implement something like
> that, since they're free threaded with fine-grained locks. If they
> can't then I don't see how we could justify making it part of the
> standard library.

Yes, I can imagine some solutions.  I am no Jython or IronPython
expert, but let us assume that they have a way to check synchronously
for external events from time to time (i.e. if there is some
equivalent to sys.setcheckinterval()).  If they do, then all you need
is the right synchronization: the thread that wants to start a "with
atomic" has to wait until all other threads are paused in the external
check code.  (Again, like CPython's, this is not a properly multi-core
STM-ish solution, but it would give the right semantics.  (And if it
turns out that STM is successful in the future, Java will grow more
direct support for it ))


A bientôt,

Armin.


[Python-Dev] LZMA compression support in 3.3

2011-08-27 Thread Nadeem Vawda
Hello all,

I'd like to propose the addition of a new module in Python 3.3. The 'lzma'
module will provide support for compression and decompression using the LZMA
algorithm, and the .xz and .lzma file formats. The matter has already been
discussed on the tracker, where there seems
to be a consensus that this is a desirable feature. What are your thoughts?

The proposed module's API will be very similar to that of the bz2 module;
the only differences will be additional keyword arguments to some functions,
for specifying container formats and detailed compressor options.

The implementation will also be similar to bz2 - basic compressor and
decompressor classes written in C, with convenience functions and a file
interface implemented on top of those in Python.

I've already done some work on the C parts of the module; I'll push that to my
sandbox in the next day or two.

Cheers,
Nadeem


Re: [Python-Dev] LZMA compression support in 3.3

2011-08-27 Thread Ross Lagerwall
> I'd like to propose the addition of a new module in Python 3.3. The 'lzma'
> module will provide support for compression and decompression using the LZMA
> algorithm, and the .xz and .lzma file formats. The matter has already been
> discussed on the tracker, where there seems
> to be a consensus that this is a desirable feature. What are your thoughts?
> 
> The proposed module's API will be very similar to that of the bz2 module;
> the only differences will be additional keyword arguments to some functions,
> for specifying container formats and detailed compressor options.

+1 for adding and +1 for keeping a similar interface.

Cheers
Ross



Re: [Python-Dev] Software Transactional Memory for Python

2011-08-27 Thread Antoine Pitrou
On Sat, 27 Aug 2011 15:08:36 +0200
Armin Rigo  wrote:
> Hi Nick,
> 
> On Sat, Aug 27, 2011 at 2:40 PM, Nick Coghlan  wrote:
> > 1. How does the patch interact with C code that explicitly releases
> > the GIL? (e.g. IO commands inside a "with atomic:" block)
> 
> As implemented, any code in a "with atomic" is prevented from
> explicitly releasing and reacquiring the GIL: the GIL remains acquired
> until the end of the "with" block.  In other words
> Py_BEGIN_ALLOW_THREADS has no effect in a "with" block.

You then risk deadlocks. Say:
- thread A is inside a "with atomic" and calls a library function which
  tries to take lock L
- thread B has already taken lock L and is currently executing an I/O
  function with GIL released
- thread B then waits for the GIL (and hence depends on thread A going
  forward), while thread A waits for lock L (and hence depends on
  thread B going forward)

Lock L could simply be the lock used by the file object  (a
Buffered{Reader,Writer,Random}) which thread B is reading or writing
from.
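
The cycle described above can be simulated with ordinary locks. This is only a sketch under loud assumptions: the real GIL cannot be held from Python code, so a plain `threading.Lock` named `gil` stands in for it, `lock_l` stands in for lock L, and timeouts are used so the demonstration terminates instead of hanging.

```python
import threading

gil = threading.Lock()       # stand-in for the GIL (assumption: a plain lock)
lock_l = threading.Lock()    # "lock L", e.g. a buffered file object's lock

b_holds_l = threading.Event()
a_in_atomic = threading.Event()
a_tried = threading.Event()
b_tried = threading.Event()
outcome = {}

def thread_a():
    b_holds_l.wait()
    with gil:                                # A enters "with atomic"
        a_in_atomic.set()
        # A library call inside the atomic block wants lock L.
        got = lock_l.acquire(timeout=0.2)
        outcome["a_got_lock_l"] = got
        if got:
            lock_l.release()
        a_tried.set()
        b_tried.wait()                       # keep the "GIL" until B has tried

def thread_b():
    with lock_l:                             # B took L, then started its I/O
        b_holds_l.set()
        a_in_atomic.wait()                   # meanwhile A enters the atomic block
        # The I/O is done; B needs the "GIL" back before it can release L.
        got = gil.acquire(timeout=0.2)
        outcome["b_got_gil"] = got
        if got:
            gil.release()
        b_tried.set()
        a_tried.wait()                       # keep L until A has tried

threads = [threading.Thread(target=thread_a), threading.Thread(target=thread_b)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(outcome)
```

Both acquisition attempts time out, which is the point: without the timeouts neither thread could ever make progress.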

Regards

Antoine.




Re: [Python-Dev] LZMA compression support in 3.3

2011-08-27 Thread Martin v. Löwis
> The implementation will also be similar to bz2 - basic compressor and
> decompressor classes written in C, with convenience functions and a file
> interface implemented on top of those in Python.

When I reviewed lzma, I found that this approach might not be
appropriate. lzma has many more options and aspects that allow tuning
and selection, and a Python LZMA library should provide the same feature
set as the underlying C library.

So I would propose that a very thin C layer is created around the C
library that focuses on the actual algorithms, and that any higher
layers (in particular file formats) are done in Python.

Regards,
Martin


Re: [Python-Dev] LZMA compression support in 3.3

2011-08-27 Thread Nadeem Vawda
On Sat, Aug 27, 2011 at 4:50 PM, "Martin v. Löwis"  wrote:
>> The implementation will also be similar to bz2 - basic compressor and
>> decompressor classes written in C, with convenience functions and a file
>> interface implemented on top of those in Python.
>
> When I reviewed lzma, I found that this approach might not be
> appropriate. lzma has many more options and aspects that allow tuning
> and selection, and a Python LZMA library should provide the same feature
> set as the underlying C library.
>
> So I would propose that a very thin C layer is created around the C
> library that focuses on the actual algorithms, and that any higher
> layers (in particular file formats) are done in Python.

I probably shouldn't have used the word "basic" here - these classes expose all
the features of the underlying library. I was rather trying to underscore that
the rest of the module is implemented in terms of these two classes.

As for file formats, these are handled by liblzma itself; the extension module
just selects which compressor/decompressor initializer function to use depending
on the value of the "format" argument. Our code won't contain anything along the
lines of GzipFile; all of that work is done by the underlying C library. Rather,
the LZMAFile class will be like BZ2File - just a simple filter that passes the
read/written data through a LZMACompressor or LZMADecompressor as appropriate.
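
To illustrate the "format" argument described above, here is a sketch using the lzma module as it exists in current Python releases. The module was only a proposal when this message was written, so the names below (`LZMACompressor`, `LZMADecompressor`, `FORMAT_XZ`, `FORMAT_ALONE`) are those of the shipped module and may differ from the code under discussion at the time; `roundtrip` is an illustrative helper.

```python
import lzma

data = b"python-dev " * 1000

def roundtrip(container):
    """Compress and decompress entirely in memory using the given container format."""
    compressor = lzma.LZMACompressor(format=container)
    packed = compressor.compress(data) + compressor.flush()
    decompressor = lzma.LZMADecompressor(format=container)
    return packed, decompressor.decompress(packed)

xz_packed, xz_plain = roundtrip(lzma.FORMAT_XZ)            # .xz container
alone_packed, alone_plain = roundtrip(lzma.FORMAT_ALONE)   # legacy .lzma container

print(len(data), len(xz_packed), len(alone_packed))
```

No file objects are involved: the container framing is produced and parsed on in-memory buffers.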

Cheers,
Nadeem


Re: [Python-Dev] LZMA compression support in 3.3

2011-08-27 Thread Martin v. Löwis
> As for file formats, these are handled by liblzma itself; the extension module
> just selects which compressor/decompressor initializer function to use depending
> on the value of the "format" argument. Our code won't contain anything along the
> lines of GzipFile; all of that work is done by the underlying C library. Rather,
> the LZMAFile class will be like BZ2File - just a simple filter that passes the
> read/written data through a LZMACompressor or LZMADecompressor as appropriate.

This is exactly what I worry about. I think adding file I/O to bz2 was a
mistake, as this doesn't integrate with Python's IO library (it used
to, but now, after dropping stdio, they are incompatible). Indeed, for
Python 3.2, BZ2File has been removed from the C module, and lifted to
Python.

IOW, the _lzma C module must not do any I/O, neither directly nor
indirectly (through liblzma). The approach of gzip.py (doing IO
and file formats in pure Python) is exactly right.

Regards,
Martin


Re: [Python-Dev] LZMA compression support in 3.3

2011-08-27 Thread Nadeem Vawda
On Sat, Aug 27, 2011 at 5:15 PM, "Martin v. Löwis"  wrote:
>> As for file formats, these are handled by liblzma itself; the extension module
>> just selects which compressor/decompressor initializer function to use depending
>> on the value of the "format" argument. Our code won't contain anything along the
>> lines of GzipFile; all of that work is done by the underlying C library. Rather,
>> the LZMAFile class will be like BZ2File - just a simple filter that passes the
>> read/written data through a LZMACompressor or LZMADecompressor as appropriate.
>
> This is exactly what I worry about. I think adding file I/O to bz2 was a
> mistake, as this doesn't integrate with Python's IO library (it used
> to, but now, after dropping stdio, they are incompatible). Indeed, for
> Python 3.2, BZ2File has been removed from the C module, and lifted to
> Python.
>
> IOW, the _lzma C module must not do any I/O, neither directly nor
> indirectly (through liblzma). The approach of gzip.py (doing IO
> and file formats in pure Python) is exactly right.

It is not my intention for the _lzma C module to do I/O - that will be done by
the LZMAFile class, which will be written in Python. My comparison with bz2 was
in reference to the state of the module after it was rewritten for issue 5863.

Saying "anything along the lines of GzipFile" was a bad choice of wording; what
I meant is that the LZMAFile class won't handle the problem of picking apart the
.xz and .lzma container formats. That is handled by liblzma (operating entirely
on in-memory buffers). LZMAFile will do _only_ I/O, in a similar fashion to the
BZ2File class (as of changeset 2cb07a46f4b5, to avoid ambiguity ;) ).
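
The division of labour described above (compression on in-memory buffers, a Python class doing only the I/O) can be sketched as follows. `TinyXZWriter` is a hypothetical, write-only toy and not the LZMAFile class under discussion; it relies on the lzma module as shipped in current Python releases.

```python
import io
import lzma

class TinyXZWriter:
    """Hypothetical write-only filter: compress in memory, then pass bytes to a file object."""

    def __init__(self, fileobj):
        self._fp = fileobj
        self._compressor = lzma.LZMACompressor(format=lzma.FORMAT_XZ)

    def write(self, data):
        # The container framing comes from the compressor; this class only does I/O.
        self._fp.write(self._compressor.compress(data))
        return len(data)

    def close(self):
        # Flush whatever the compressor is still buffering, including the stream footer.
        self._fp.write(self._compressor.flush())

sink = io.BytesIO()
writer = TinyXZWriter(sink)
writer.write(b"hello, ")
writer.write(b"python-dev")
writer.close()

restored = lzma.decompress(sink.getvalue())
print(restored)
```

Because the writer only needs an object with a `write` method, it works with any binary file-like object, which is the integration with Python's IO library that the thread is concerned about.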

Cheers,
Nadeem


Re: [Python-Dev] LZMA compression support in 3.3

2011-08-27 Thread Nick Coghlan
On Sun, Aug 28, 2011 at 1:15 AM, "Martin v. Löwis"  wrote:
> This is exactly what I worry about. I think adding file I/O to bz2 was a
> mistake, as this doesn't integrate with Python's IO library (it used
> to, but now, after dropping stdio, they are incompatible). Indeed, for
> Python 3.2, BZ2File has been removed from the C module, and lifted to
> Python.
>
> IOW, the _lzma C module must not do any I/O, neither directly nor
> indirectly (through liblzma). The approach of gzip.py (doing IO
> and file formats in pure Python) is exactly right.

PEP 399 also comes into play - we need a pure Python version for PyPy
et al (or a plausible story for why an exception should be granted).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] LZMA compression support in 3.3

2011-08-27 Thread Martin v. Löwis
> It is not my intention for the _lzma C module to do I/O - that will be done by
> the LZMAFile class, which will be written in Python. My comparison with bz2 was
> in reference to the state of the module after it was rewritten for issue 5863.

Ok. I'll defer my judgement then until there is actual code to review.

Not sure whether you already have this: supporting the tarfile module
would be nice.

Regards,
Martin


Re: [Python-Dev] LZMA compression support in 3.3

2011-08-27 Thread Antoine Pitrou
On Sun, 28 Aug 2011 01:36:50 +1000
Nick Coghlan  wrote:
> On Sun, Aug 28, 2011 at 1:15 AM, "Martin v. Löwis"  wrote:
> > This is exactly what I worry about. I think adding file I/O to bz2 was a
> > mistake, as this doesn't integrate with Python's IO library (it used
> > to, but now, after dropping stdio, they are incompatible). Indeed, for
> > Python 3.2, BZ2File has been removed from the C module, and lifted to
> > Python.
> >
> > IOW, the _lzma C module must not do any I/O, neither directly nor
> > indirectly (through liblzma). The approach of gzip.py (doing IO
> > and file formats in pure Python) is exactly right.
> 
> PEP 399 also comes into play - we need a pure Python version for PyPy
> et al (or a plausible story for why an exception should be granted).

The plausible story being that we basically wrap an existing library?
I don't think PyPy et al have pure Python versions of the zlib or
OpenSSL, do they?

If we start taking PEP 399 conformance to such levels, we might as well
stop developing CPython.

cheers

Antoine.




Re: [Python-Dev] LZMA compression support in 3.3

2011-08-27 Thread Nick Coghlan
On Sun, Aug 28, 2011 at 1:40 AM, Antoine Pitrou  wrote:
> On Sun, 28 Aug 2011 01:36:50 +1000
> Nick Coghlan  wrote:
>> On Sun, Aug 28, 2011 at 1:15 AM, "Martin v. Löwis"  wrote:
>> > This is exactly what I worry about. I think adding file I/O to bz2 was a
>> > mistake, as this doesn't integrate with Python's IO library (it used
>> > to, but now, after dropping stdio, they are incompatible). Indeed, for
>> > Python 3.2, BZ2File has been removed from the C module, and lifted to
>> > Python.
>> >
>> > IOW, the _lzma C module must not do any I/O, neither directly nor
>> > indirectly (through liblzma). The approach of gzip.py (doing IO
>> > and file formats in pure Python) is exactly right.
>>
>> PEP 399 also comes into play - we need a pure Python version for PyPy
>> et al (or a plausible story for why an exception should be granted).
>
> The plausible story being that we basically wrap an existing library?
> I don't think PyPy et al have pure Python versions of the zlib or
> OpenSSL, do they?
>
> If we start taking PEP 399 conformance to such levels, we might as well
> stop developing CPython.

It's acceptable for the Python version to use ctypes in the case of
wrapping an existing library, but the Python version should still
exist.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] LZMA compression support in 3.3

2011-08-27 Thread Nadeem Vawda
On Sat, Aug 27, 2011 at 5:42 PM, "Martin v. Löwis"  wrote:
> Not sure whether you already have this: supporting the tarfile module
> would be nice.

Yes, got that - issue 5689. Also of interest is issue 5411 - adding .xz
support to distutils. But I think that these are separate projects that
should wait until the lzma module is finalized.

On Sat, Aug 27, 2011 at 5:40 PM, Antoine Pitrou  wrote:
> On Sun, 28 Aug 2011 01:36:50 +1000
> Nick Coghlan  wrote:
>> PEP 399 also comes into play - we need a pure Python version for PyPy
>> et al (or a plausible story for why an exception should be granted).
>
> The plausible story being that we basically wrap an existing library?
> I don't think PyPy et al have pure Python versions of the zlib or
> OpenSSL, do they?
>
> If we start taking PEP 399 conformance to such levels, we might as well
> stop developing CPython.

Indeed, PEP 399 specifically notes that exemptions can be granted for
modules that wrap external C libraries.

On Sat, Aug 27, 2011 at 5:52 PM, Nick Coghlan  wrote:
> It's acceptable for the Python version to use ctypes in the case of
> wrapping an existing library, but the Python version should still
> exist.

I'm not too sure about that - PEP 399 explicitly says that using ctypes is
frowned upon, and doesn't mention anywhere that it should be used in this
sort of situation.

Cheers,
Nadeem


Re: [Python-Dev] LZMA compression support in 3.3

2011-08-27 Thread Nick Coghlan
On Sun, Aug 28, 2011 at 1:58 AM, Nadeem Vawda  wrote:
> On Sat, Aug 27, 2011 at 5:52 PM, Nick Coghlan  wrote:
>> It's acceptable for the Python version to use ctypes in the case of
>> wrapping an existing library, but the Python version should still
>> exist.
>
> I'm not too sure about that - PEP 399 explicitly says that using ctypes is
> frowned upon, and doesn't mention anywhere that it should be used in this
> sort of situation.

Note to self: do not comment on python-dev at 2 am, as one's ability
to read PEPs correctly apparently suffers :)

Consider my comment withdrawn, you're quite right that PEP 399
actually says this is precisely the case where an exemption is a
reasonable idea. Although I believe it's likely that PyPy will wrap it
with ctypes anyway :)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] Software Transactional Memory for Python

2011-08-27 Thread Armin Rigo
Hi Antoine,

> You then risk deadlocks. Say:
> (...)

Yes, it is indeed not a solution that co-operates transparently and
deadlock-freely with regular locks.  You risk the same kind of
deadlocks as you would when using only locks.  The reason is similar
to threads that try to acquire two locks in succession.  In your
example:

> - thread A is inside a "with atomic" and calls a library function which
>   tries to take lock L

This is basically dangerous, because it corresponds to taking lock
"GIL" and lock L, in that order, whereas the thread B takes lock L and
plays around with lock "GIL" in the opposite order.  I think a
reasonable solution to avoid deadlocks is simply not to use explicit
locks inside "with atomic" blocks.

Generally speaking it can be regarded as wrong to do any action that
causes an unbounded wait in a "with atomic" block, but the solution I
chose to implement in my patch is to still allow them, because it
doesn't make much sense to say that "print" or "pdb.set_trace()" are
forbidden.
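
The lock-order inversion described above can be reproduced with two ordinary locks. What follows is only a minimal sketch, assuming plain threading.Lock objects stand in for the "GIL"-like atomic lock and for lock L (the names atomic_lock, lock_l and acquire_ordered are made up for illustration and do not come from the patch). It shows the classic remedy of always acquiring locks in one global order, which is a different technique from the "no explicit locks inside with atomic" rule suggested in this message:

```python
import threading


def acquire_ordered(*locks):
    """Acquire the given locks in one fixed global order (sorted by id)."""
    ordered = sorted(locks, key=id)
    for lock in ordered:
        lock.acquire()
    return ordered


def release_all(ordered):
    """Release locks in the reverse of the order they were acquired."""
    for lock in reversed(ordered):
        lock.release()


def worker(first, second, results, name):
    # The caller *asks* for (first, second), but the helper ignores that
    # order, so two threads can never hold one lock each and wait forever.
    held = acquire_ordered(first, second)
    try:
        results.append(name)
    finally:
        release_all(held)


def run_demo():
    atomic_lock = threading.Lock()  # hypothetical stand-in for the "GIL"
    lock_l = threading.Lock()       # hypothetical stand-in for lock L
    results = []
    thread_a = threading.Thread(
        target=worker, args=(atomic_lock, lock_l, results, "A"))
    thread_b = threading.Thread(
        target=worker, args=(lock_l, atomic_lock, results, "B"))
    thread_a.start()
    thread_b.start()
    thread_a.join(timeout=5)
    thread_b.join(timeout=5)
    return sorted(results)


if __name__ == "__main__":
    print(run_demo())
```

If worker() took the locks in the order it was given, threads A and B could each grab one lock and block on the other, which is the situation the message calls "basically dangerous".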


See you soon,

Armin.


Re: [Python-Dev] Should we move to replace re with regex?

2011-08-27 Thread Guido van Rossum
On Fri, Aug 26, 2011 at 11:01 PM, Dan Stromberg  wrote:
[Steven]
>> Have there been any __future__ features that were added provisionally?
>
> I can't either, but ISTR hearing that from __future__ import was started
> with such an intent.  Irrespective, it's hard to import something from
> "future" without at least suspecting that you're on the bleeding edge.

No, this was not the intent of __future__. The intent is that a
feature is desirable but also backwards incompatible (e.g. introduces
a new keyword) so that for 1 (sometimes more) releases we require the
users to use the __future__ import.

There was never any intent to use __future__ for experimental
features. If we want that maybe we could have from __experimental__
import .

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] Should we move to replace re with regex?

2011-08-27 Thread Dan Stromberg
On Sat, Aug 27, 2011 at 9:19 AM, Guido van Rossum  wrote:

> On Fri, Aug 26, 2011 at 11:01 PM, Dan Stromberg wrote:
> [Steven]
> >> Have there been any __future__ features that were added provisionally?
> >
> > I can't either, but ISTR hearing that from __future__ import was started
> > with such an intent.  Irrespective, it's hard to import something from
> > "future" without at least suspecting that you're on the bleeding edge.
>
> No, this was not the intent of __future__. The intent is that a
> feature is desirable but also backwards incompatible (e.g. introduces
> a new keyword) so that for 1 (sometimes more) releases we require the
> users to use the __future__ import.
>
> There was never any intent to use __future__ for experimental
> features. If we want that maybe we could have from __experimental__
> import .

OK.  So what -is- the purpose of from __future__ import?


Re: [Python-Dev] LZMA compression support in 3.3

2011-08-27 Thread Antoine Pitrou
On Sun, 28 Aug 2011 01:52:51 +1000
Nick Coghlan  wrote:
> >
> > The plausible story being that we basically wrap an existing library?
> > I don't think PyPy et al have pure Python versions of the zlib or
> > OpenSSL, do they?
> >
> > If we start taking PEP 399 conformance to such levels, we might as well
> > stop developing CPython.
> 
> It's acceptable for the Python version to use ctypes in the case of
> wrapping an existing library, but the Python version should still
> exist.

I think you're taking this too seriously. Our extension modules (_bz2,
_ssl...) are *already* optional even on CPython. If the library or its
development headers are not available on the system, building these
extensions is simply skipped, and the test suite passes nonetheless.
The only required libraries for passing the tests being basically the
libc and the zlib.
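
The "optional extension" behaviour described above is visible from Python as an ImportError. A minimal sketch, assuming nothing beyond the standard importlib module (probe_optional is a made-up helper name, and the module list is only an example taken from the names in this message):

```python
import importlib


def probe_optional(names):
    """Return {module_name: bool} telling whether each import succeeds."""
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # On CPython this is what a skipped extension build looks like.
            status[name] = False
        else:
            status[name] = True
    return status


if __name__ == "__main__":
    for name, ok in probe_optional(["zlib", "_bz2", "_ssl"]).items():
        print(name, "available" if ok else "missing (extension not built)")
```

Code that wants to degrade gracefully can branch on such a probe instead of assuming every wrapper module was built.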

Regards

Antoine.


Re: [Python-Dev] Should we move to replace re with regex?

2011-08-27 Thread Brian Curtin
On Sat, Aug 27, 2011 at 11:48, Dan Stromberg  wrote:
>
>> No, this was not the intent of __future__. The intent is that a
>> feature is desirable but also backwards incompatible (e.g. introduces
>> a new keyword) so that for 1 (sometimes more) releases we require the
>> users to use the __future__ import.
>>
>> There was never any intent to use __future__ for experimental
>> features. If we want that maybe we could have from __experimental__
>> import .
>
> OK.  So what -is- the purpose of from __future__ import?
>

It's in the first paragraph.


Re: [Python-Dev] LZMA compression support in 3.3

2011-08-27 Thread Martin v. Löwis
>>> PEP 399 also comes into play - we need a pure Python version for PyPy
>>> et al (or a plausible story for why an exception should be granted).

No, we don't. We can grant an exception, which I'm very willing to do.
The PEP lists wrapping a specific C-based library as a plausible reason.

> It's acceptable for the Python version to use ctypes

Hmm. To me, *that's* unacceptable. In the specific case, having a
pure-Python implementation would be acceptable to me, but I'm skeptical
that anybody is willing to produce one.

Regards,
Martin


Re: [Python-Dev] Software Transactional Memory for Python

2011-08-27 Thread Charles-François Natali
Hi Armin,

> This is basically dangerous, because it corresponds to taking lock
> "GIL" and lock L, in that order, whereas the thread B takes lock L and
> plays around with lock "GIL" in the opposite order.  I think a
> reasonable solution to avoid deadlocks is simply not to use explicit
> locks inside "with atomic" blocks.

The problem is that many locks are actually acquired implicitly.
For example, `print` to a buffered stream will acquire the fileobject's mutex.
Also, even if the code inside the "with atomic" block doesn't directly
or indirectly acquire a lock, there's still the possibility of
asynchronous code that acquires locks being executed in the middle of
this block: for example, signal handlers are run on behalf of the main
thread from the main eval loop and in certain other places, and the GC
might kick in at any time.

> Generally speaking it can be regarded as wrong to do any action that
> causes an unbounded wait in a "with atomic" block,

Indeed.

cf


Re: [Python-Dev] Should we move to replace re with regex?

2011-08-27 Thread Martin v. Löwis
On 27.08.2011 12:10, Antoine Pitrou wrote:
> On Sat, 27 Aug 2011 08:02:31 +0200
> "Martin v. Löwis"  wrote:
>>> I'm not sure it's worth doing an extensive review of the code, a better
>>> approach might be to require extensive test coverage (and a review of
>>> tests).
>>
>> I think it's worth it. It's really bad if only one developer fully
>> understands the regex implementation.
> 
> Could such a review be the topic of an informational PEP?

Well, the reviewer would also have to dive into the code details,
e.g. through Rietveld. Of course, referencing the Rietveld issue in
the PEP might be appropriate.

A PEP should IMO only cover end-user aspects of the new re module.
Code organization is typically not in the PEP. To give a specific
example: you mentioned that there is (near) code duplication in
MRAB's module. As a reviewer, I would discuss whether this can be
eliminated - but not in the PEP.

Regards,
Martin


Re: [Python-Dev] LZMA compression support in 3.3

2011-08-27 Thread Antoine Pitrou
On Sat, 27 Aug 2011 18:50:40 +0200
Antoine Pitrou  wrote:

> On Sun, 28 Aug 2011 01:52:51 +1000
> Nick Coghlan  wrote:
> > >
> > > The plausible story being that we basically wrap an existing library?
> > > I don't think PyPy et al have pure Python versions of the zlib or
> > > OpenSSL, do they?
> > >
> > > If we start taking PEP 399 conformance to such levels, we might as well
> > > stop developing CPython.
> > 
> > It's acceptable for the Python version to use ctypes in the case of
> > wrapping an existing library, but the Python version should still
> > exist.
> 
> I think you're taking this too seriously. Our extension modules (_bz2,
> _ssl...) are *already* optional even on CPython. If the library or its
> development headers are not available on the system, building these
> extensions is simply skipped, and the test suite passes nonetheless.
> The only required libraries for passing the tests being basically the
> libc and the zlib.

...and, apparently, pyexpat...




Re: [Python-Dev] Should we move to replace re with regex?

2011-08-27 Thread Dan Stromberg
On Sat, Aug 27, 2011 at 9:53 AM, Brian Curtin wrote:

> On Sat, Aug 27, 2011 at 11:48, Dan Stromberg  wrote:
>>
>>> No, this was not the intent of __future__. The intent is that a
>>> feature is desirable but also backwards incompatible (e.g. introduces
>>> a new keyword) so that for 1 (sometimes more) releases we require the
>>> users to use the __future__ import.
>>>
>>> There was never any intent to use __future__ for experimental
>>> features. If we want that maybe we could have from __experimental__
>>> import .
>>
>> OK.  So what -is- the purpose of from __future__ import?
>
> It's in the first paragraph.
>

I disagree.  The first paragraph says this has something to do with new
keywords.  It doesn't appear to say what we expect users to -do- with it.
Both are important.

Is it "You'd better try this, because it's going in eventually.  If you
don't try it out before it becomes default behavior, you have no right to
complain"?

And if people do complain, what are python-dev's options?


Re: [Python-Dev] Should we move to replace re with regex?

2011-08-27 Thread Virgil Dupras
On 2011-08-27, at 2:20 PM, Dan Stromberg wrote:

> On Sat, Aug 27, 2011 at 9:53 AM, Brian Curtin  wrote:
>> On Sat, Aug 27, 2011 at 11:48, Dan Stromberg  wrote:
>>>> No, this was not the intent of __future__. The intent is that a
>>>> feature is desirable but also backwards incompatible (e.g. introduces
>>>> a new keyword) so that for 1 (sometimes more) releases we require the
>>>> users to use the __future__ import.
>>>>
>>>> There was never any intent to use __future__ for experimental
>>>> features. If we want that maybe we could have from __experimental__
>>>> import .
>>>
>>> OK.  So what -is- the purpose of from __future__ import?
>>
>> It's in the first paragraph.
>
> I disagree.  The first paragraph says this has something to do with new
> keywords.  It doesn't appear to say what we expect users to -do- with it.
> Both are important.
>
> Is it "You'd better try this, because it's going in eventually.  If you don't
> try it out before it becomes default behavior, you have no right to complain"?
>
> And if people do complain, what are python-dev's options?

__future__ imports have nothing to do with "trying stuff before it comes";
they have to do with backward compatibility. For example, "with_statement" was
a __future__ import because introducing the "with" keyword would break any code
using "with" as an identifier. I don't think that the goal of introducing "with"
as a future import was "we're gonna see how it pans out, and decide if we really
introduce it later".

__future__ means "It's coming, prepare your code".
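
The schedule behind "It's coming, prepare your code" is recorded in the __future__ module itself. A minimal sketch using only the documented getOptionalRelease() and getMandatoryRelease() methods; the source string and the "<sketch>" filename are made up for illustration:

```python
import __future__

feature = __future__.with_statement

# First release in which "from __future__ import with_statement" was accepted.
optional = feature.getOptionalRelease()
# Release in which the feature became the default, import or not.
mandatory = feature.getMandatoryRelease()

# On any interpreter past the mandatory release the import is still
# accepted, it simply has no further effect.
source = "from __future__ import with_statement\nresult = 'compiled'\n"
namespace = {}
exec(compile(source, "<sketch>", "exec"), namespace)

if __name__ == "__main__":
    print("optional in  %d.%d" % optional[:2])
    print("mandatory in %d.%d" % mandatory[:2])
    print(namespace["result"])
```

Both release tuples are fixed at the moment the feature is added, which matches the point made above: the decision to introduce "with" had already been taken when the future import appeared.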


Re: [Python-Dev] Should we move to replace re with regex?

2011-08-27 Thread Martin v. Löwis
> I disagree.  The first paragraph says this has something to do with new
> keywords.  It doesn't appear to say what we expect users to -do- with
> it.  Both are important.

Well, users can use the new features...

> Is it "You'd better try this, because it's going in eventually.  If you
> don't try it out before it becomes default behavior, you have no right
> to complain"?

No. It's "we have that feature which will be activated in a future
version. If you want to use it today, use the __future__ import. If
you don't want to use it (now or in the future), just don't."

> And if people do complain, what are python-dev's options?

That will depend on the complaint. If it's "I don't like the new
feature", then the obvious response is "don't use it, then".

Regards,
Martin


Re: [Python-Dev] LZMA compression support in 3.3

2011-08-27 Thread Terry Reedy

On 8/27/2011 9:47 AM, Nadeem Vawda wrote:

> I'd like to propose the addition of a new module in Python 3.3. The 'lzma'
> module will provide support for compression and decompression using the LZMA
> algorithm, and the .xz and .lzma file formats. The matter has already been
> discussed on the tracker, where there seems to be a consensus that this is
> a desirable feature. What are your thoughts?

As I read the discussion, the idea has been more or less accepted in
principle. However, the current patch is not and needs changes.

> The proposed module's API will be very similar to that of the bz2 module;
> the only differences will be additional keyword arguments to some functions,
> for specifying container formats and detailed compressor options.

I believe Antoine suggested a PEP. It should summarize the salient
points in the long tracker discussion into a coherent exposition and
flesh out the details implied above. (Perhaps they are already in the
proposed doc addition.)

> The implementation will also be similar to bz2 - basic compressor and
> decompressor classes written in C, with convenience functions and a file
> interface implemented on top of those in Python.

I would follow Martin's suggestions, including doing all i/o with the io
module and the following:

"So I would propose that a very thin C layer is created around the C
library that focuses on the actual algorithms, and that any higher
layers (in particular file formats) are done in Python."

If we minimize the C code we add and maximize what is done in Python,
that would maximize the ease of porting to other implementations. This
would conform to the spirit of PEP 399.


--
Terry Jan Reedy



Re: [Python-Dev] Should we move to replace re with regex?

2011-08-27 Thread Steven D'Aprano

Dan Stromberg wrote:
> On Sat, Aug 27, 2011 at 9:53 AM, Brian Curtin wrote:
>> On Sat, Aug 27, 2011 at 11:48, Dan Stromberg  wrote:
>>>> No, this was not the intent of __future__. The intent is that a
>>>> feature is desirable but also backwards incompatible (e.g. introduces
>>>> a new keyword) so that for 1 (sometimes more) releases we require the
>>>> users to use the __future__ import.
>>>>
>>>> There was never any intent to use __future__ for experimental
>>>> features. If we want that maybe we could have from __experimental__
>>>> import .
>>>
>>> OK.  So what -is- the purpose of from __future__ import?
>>
>> It's in the first paragraph.
>
> I disagree.  The first paragraph says this has something to do with new
> keywords.  It doesn't appear to say what we expect users to -do- with it.
> Both are important.


Have you read the PEP? I found it very helpful.

http://www.python.org/dev/peps/pep-0236/

The motivation given in the first paragraph is pretty clear to me: 
__future__ is machinery added to Python to aid the transition when a 
backwards incompatible change is made.


Perhaps it needs a note stating explicitly that it is not for trying out 
new features which may or may not be added at a later date. That may 
help prevent confusion in the, er, future.



[...]
> And if people do complain, what are python-dev's options?


The PEP includes a question very similar to that:


  Q: Going back to the nested_scopes example, what if release 2.2
     comes along and I still haven't changed my code?  How can I keep
     the 2.1 behavior then?

  A: By continuing to use 2.1, and not moving to 2.2 until you do
     change your code.  The purpose of future_statement is to make
     life easier for people who keep current with the latest release
     in a timely fashion.  We don't hate you if you don't, but your
     problems are much harder to solve, and somebody with those
     problems will need to write a PEP addressing them.
     future_statement is aimed at a different audience.


To me, it's quite clear: once a feature change hits __future__, it is 
already part of the language. It may be an optional part for at least 
one release, but removing it again will require the same deprecation 
process as removing any other language feature (see PEP 5 for more details).
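
The "optional for at least one release, then part of the language" life cycle can be read straight out of the __future__ module. A minimal sketch, assuming five-element version tuples in the style of sys.version_info; feature_status and its return strings are made up for illustration:

```python
import sys
import __future__


def feature_status(name, version=None):
    """Classify a __future__ feature for a given interpreter version.

    `version` is a five-element tuple such as (2, 1, 0, "final", 0);
    it defaults to the running interpreter.
    """
    if name not in __future__.all_feature_names:
        return "unknown"
    version = tuple(version or sys.version_info)
    feature = getattr(__future__, name)
    mandatory = feature.getMandatoryRelease()  # may be None
    if mandatory is not None and version >= mandatory:
        return "mandatory"
    if version >= feature.getOptionalRelease():
        return "optional"
    return "unavailable"


if __name__ == "__main__":
    for release in [(2, 0, 0, "final", 0),
                    (2, 1, 0, "final", 0),
                    (2, 2, 0, "final", 0)]:
        print(release[:2], feature_status("nested_scopes", release))
```

For the nested_scopes example from the PEP, this reports the feature as optional on 2.1 and mandatory on 2.2, which is the transition the quoted question and answer are about.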




--
Steven



Re: [Python-Dev] LZMA compression support in 3.3

2011-08-27 Thread Dan Stromberg
On Sat, Aug 27, 2011 at 9:04 AM, Nick Coghlan  wrote:

> On Sun, Aug 28, 2011 at 1:58 AM, Nadeem Vawda wrote:
> > On Sat, Aug 27, 2011 at 5:52 PM, Nick Coghlan wrote:
> >> It's acceptable for the Python version to use ctypes in the case of
> >> wrapping an existing library, but the Python version should still
> >> exist.
> >
> > I'm not too sure about that - PEP 399 explicitly says that using ctypes is
> > frowned upon, and doesn't mention anywhere that it should be used in this
> > sort of situation.
>
> Note to self: do not comment on python-dev at 2 am, as one's ability
> to read PEPs correctly apparently suffers :)
>
> Consider my comment withdrawn, you're quite right that PEP 399
> actually says this is precisely the case where an exemption is a
> reasonable idea. Although I believe it's likely that PyPy will wrap it
> with ctypes anyway :)
>

I'd like to better understand why ctypes is (sometimes) frowned upon.

Is it the brittleness?  Tendency to segfault?

If yes, is there a way of making ctypes less brittle - say, by carefully
matching it against a specific version of a .so/.dll before starting to make
heavy use of said .so/.dll?

FWIW, I have a partial implementation of a module that does xz from Python
using ctypes.  It only does in-memory compression and decompression (not
stream compression or decompression to or from a file), because that was all
I needed for my current project, but it runs on CPython 2.x, CPython 3.x,
and PyPy.  I don't think it runs on Jython, but I've not looked at that
carefully - my code falls back on subprocess if ctypes doesn't appear to be
all there.

It's at http://stromberg.dnsalias.org/svn/xz_mod/trunk/xz_mod.py


[Python-Dev] Add from __experimental__ import bla [was: Should we move to replace re with regex?]

2011-08-27 Thread Dj Gilcrease
In the thread about replacing re with regex someone mentioned adding
to __future__, which isn't a great idea as future APIs are already
solidified; they just live there to give developers time to adapt their
code. The idea of an __experimental__ area is good for any PEPs or
stdlib additions that are somewhat controversial (API isn't agreed on,
code may take a while to integrate properly, developer wants some time
to hash out any edge case bugs or API clarifications that may come up
in large scale testing, etc).

__experimental__ should emit a warning on import that says anything in
here may change or be removed at any time and should not be used in
stable code.

__experimental__ features should behave the same as __future__ in that
they can add new keywords or semantics to the existing language

__experimental__ features can move directly to the stdlib or builtins
if they do not add new keywords and/or are backwards compatible with
the feature they are replacing. Otherwise they move into __future__
for however many releases are deemed reasonable time for developers
to adapt their code.
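
The "warn on import" rule proposed above could look roughly like this. Everything in the sketch is hypothetical: no __experimental__ module exists, and ExperimentalWarning, import_experimental and the substring_search feature are names invented here purely to illustrate the idea with the standard warnings module:

```python
import warnings


class ExperimentalWarning(FutureWarning):
    """Hypothetical warning category; nothing like it exists in the stdlib."""


def _substring_search(pattern, text):
    """Toy stand-in for a provisional feature."""
    return pattern in text


# Hypothetical registry of provisional features (name -> implementation).
_EXPERIMENTAL = {"substring_search": _substring_search}


def import_experimental(name):
    """Return a provisional feature, warning that it may change or vanish."""
    warnings.warn(
        "%r is experimental: it may change or be removed at any time and "
        "should not be used in stable code" % name,
        ExperimentalWarning,
        stacklevel=2,
    )
    return _EXPERIMENTAL[name]


if __name__ == "__main__":
    search = import_experimental("substring_search")
    print(search("ab", "cab"))
```

A dedicated warning category would let projects turn the warning into an error in stable code with the usual warnings filters.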


Re: [Python-Dev] LZMA compression support in 3.3

2011-08-27 Thread Martin v. Löwis
> I'd like to better understand why ctypes is (sometimes) frowned upon.
> 
> Is it the brittleness?  Tendency to segfault?

That, and Python should work completely if ctypes is not available.

> FWIW, I have a partial implementation of a module that does xz from
> Python using ctypes.

So does it work on Sparc/Solaris? On OpenBSD? On ARM-Linux? Does it
work if the xz library is installed into /opt/sfw/xz?

Regards,
Martin


Re: [Python-Dev] LZMA compression support in 3.3

2011-08-27 Thread Nadeem Vawda
On Sat, Aug 27, 2011 at 9:47 PM, Terry Reedy  wrote:
> On 8/27/2011 9:47 AM, Nadeem Vawda wrote:
>> I'd like to propose the addition of a new module in Python 3.3. The 'lzma'
>> module will provide support for compression and decompression using the LZMA
>> algorithm, and the .xz and .lzma file formats. The matter has already been
>> discussed on the tracker, where there seems to be a consensus that this is
>> a desirable feature. What are your thoughts?
>
> As I read the discussion, the idea has been more or less accepted in
> principle. However, the current patch is not and needs changes.

Please note that the code I'm talking about is not the same as the patches by
Per Øyvind Karlsen that are attached to the tracker issue. I have been doing
a completely new implementation of the module, specifically to address the
concerns raised by Martin and Antoine.

(As for why I haven't posted my own changes yet - I'm currently an intern at
Google, and they want me to run my code by their open-source team before
releasing it into the wild. Sorry for the delay and the confusion.)


>> The proposed module's API will be very similar to that of the bz2 module;
>> the only differences will be additional keyword arguments to some
>> functions,
>> for specifying container formats and detailed compressor options.
>
> I believe Antoine suggested a PEP. It should summarize the salient points in
> the long tracker discussion into a coherent exposition and flesh out the
> details implied above. (Perhaps they are already in the proposed doc
> addition.)

I talked to Antoine about this on IRC; he didn't seem to think a PEP would be
necessary. But a summary of the discussion on the tracker issue might still
be a useful thing to have, given how long it's gotten.


>> The implementation will also be similar to bz2 - basic compressor and
>> decompressor classes written in C, with convenience functions and a file
>> interface implemented on top of those in Python.
>
> I would follow Martin's suggestions, including doing all i/o with the io
> module and the following:
> "So I would propose that a very thin C layer is created around the C
> library that focuses on the actual algorithms, and that any higher
> layers (in particular file formats) are done in Python."
>
> If we minimize the C code we add and maximize what is done in Python, that
> would maximize the ease of porting to other implementations. This would
> conform to the spirit of PEP 399.

As stated in my earlier response to Martin, I intend to do this. Aside from
I/O, though, there's not much that _can_ be done in Python - the rest is
basically just providing a thin wrapper for the C library.
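
The split described above (compression in a thin C-backed object, file handling in Python) can be illustrated with modules that already exist. A minimal sketch, assuming zlib.compressobj() as a stand-in for the future C-level compressor; CompressedWriter and roundtrip are hypothetical names and say nothing about the lzma module's eventual API:

```python
import io
import zlib


class CompressedWriter:
    """File-like writer: Python does the I/O, the C-backed object compresses."""

    def __init__(self, fileobj):
        self._fileobj = fileobj                # any binary file object from io
        self._compressor = zlib.compressobj()  # stand-in for a thin C wrapper

    def write(self, data):
        # The compressor never touches the file; it only transforms bytes.
        self._fileobj.write(self._compressor.compress(data))
        return len(data)

    def close(self):
        # Emit whatever the compressor still buffers; the underlying file
        # stays open and remains the caller's responsibility.
        self._fileobj.write(self._compressor.flush())


def roundtrip(payload):
    """Compress through the writer, then decompress to check the result."""
    buffer = io.BytesIO()
    writer = CompressedWriter(buffer)
    writer.write(payload)
    writer.close()
    return zlib.decompress(buffer.getvalue())


if __name__ == "__main__":
    data = b"python-dev " * 100
    print(roundtrip(data) == data)
```

Because the writer only needs write() on the target, it works with any object from the io module, which is the integration property the discussion asks for.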


On Sat, Aug 27, 2011 at 9:58 PM, Dan Stromberg  wrote:
> I'd like to better understand why ctypes is (sometimes) frowned upon.
>
> Is it the brittleness?  Tendency to segfault?

The problem (as I understand it) is that ABI changes in a library will
cause code that uses it via ctypes to break without warning. With an
extension module, you'll get a compile failure if you rely on things
that change in an incompatible way. With a ctypes wrapper, you just get
incorrect answers, or segfaults.
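
To make the point concrete: with ctypes, the function signature is restated by hand in Python and nothing checks it against the library's headers. A minimal sketch, assuming the platform C library can be located with ctypes.util.find_library("c") and falling back to the built-in len() when it cannot; load_strlen is a made-up helper name:

```python
import ctypes
import ctypes.util


def load_strlen():
    """Return a strlen() callable backed by libc, or a pure-Python fallback."""
    path = ctypes.util.find_library("c")
    if path is None:
        return len  # no C library located; use the Python equivalent
    try:
        libc = ctypes.CDLL(path)
        strlen = libc.strlen
    except (OSError, AttributeError):
        return len
    # These two lines are the hand-written "header". If the real prototype
    # ever differed, no compiler would complain; the call would simply
    # misbehave at run time.
    strlen.argtypes = [ctypes.c_char_p]
    strlen.restype = ctypes.c_size_t
    return strlen


c_strlen = load_strlen()

if __name__ == "__main__":
    print(c_strlen(b"python-dev"))
```

An extension module written in C gets the same declarations from the library's own header at build time, which is where the compile failure mentioned above comes from.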


> If yes, is there a way of making ctypes less brittle - say, by
> carefully matching it against a specific version of a .so/.dll before
> starting to make heavy use of said .so/.dll?

This might be feasible for a specific application running in a controlled
environment, but it seems impractical for something as widely-used as the
stdlib. Having to include a whitelist of acceptable library versions would
be a substantial maintenance burden, and (compatible) new versions would
not work until the library whitelist gets updated.
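
For comparison, the zlib module already exposes enough information for a coarse version check. A minimal sketch, assuming Python 3.3 or later (where zlib.ZLIB_RUNTIME_VERSION exists) and using zlib purely as a stand-in for a library wrapped through ctypes; it compares major versions only and is not the whitelist discussed above:

```python
import zlib


def major(version_string):
    """Return the leading numeric component of a dotted version string."""
    return int(version_string.split(".")[0])


def runtime_matches_build():
    """Check that the loaded zlib has the major version CPython was built with."""
    build = zlib.ZLIB_VERSION            # headers used when building the module
    runtime = zlib.ZLIB_RUNTIME_VERSION  # library actually loaded right now
    return major(build) == major(runtime)


if __name__ == "__main__":
    print(zlib.ZLIB_VERSION, zlib.ZLIB_RUNTIME_VERSION, runtime_matches_build())
```

Even this loose check hints at the maintenance question: someone has to decide which version differences are harmless, and that decision has to be revisited with every library release.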


Cheers,
Nadeem


Re: [Python-Dev] LZMA compression support in 3.3

2011-08-27 Thread Dan Stromberg
On Sat, Aug 27, 2011 at 1:21 PM, "Martin v. Löwis" wrote:

> > I'd like to better understand why ctypes is (sometimes) frowned upon.
> >
> > Is it the brittleness?  Tendency to segfault?
>
> That, and Python should work completely if ctypes is not available.
>

What are the most major platforms ctypes doesn't work on?

It seems like there should be some way of coming up with an xml file
describing the types of the various bits of data and formal arguments -
perhaps using gccxml or something like it.

> > FWIW, I have a partial implementation of a module that does xz from
> > Python using ctypes.
>
> So does it work on Sparc/Solaris? On OpenBSD? On ARM-Linux? Does it
> work if the xz library is installed into /opt/sfw/xz?
>
>

So far, I've only tried it on a couple of Linuxes and Cygwin.  I intend to
try it on a large number of *ix variants in the future, including OS/X and
Haiku.  I doubt I'll test OpenBSD, but I'm likely to test on FreeBSD and
Dragonfly again.

With regard to /opt/sfw/xz, if ctypes.util.find_library(library) is smart
enough to look there, then yes, xz_mod should find libxz there.

On Cygwin, ctypes.util.find_library() wasn't smart enough to find a Cygwin
DLL, so I coded around that.  But it finds the library OK on the Linuxes
I've tried so far.
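
The lookup strategy described here can be sketched as follows. This is not the code from xz_mod; it is a minimal illustration that assumes the shared library is named "lzma" and that a non-standard prefix would hold lib<name>.so under a lib directory. EXTRA_DIRS and locate_library are hypothetical names:

```python
import ctypes.util
import os

# Hypothetical extra directories, e.g. a non-standard prefix like /opt/sfw/xz.
EXTRA_DIRS = ["/opt/sfw/xz/lib"]


def locate_library(name, extra_dirs=EXTRA_DIRS):
    """Return a path or soname for lib<name>, or None if nothing is found."""
    found = ctypes.util.find_library(name)
    if found is not None:
        return found
    # find_library() only consults the platform's standard search
    # mechanisms, so unusual install prefixes need an explicit fallback.
    for directory in extra_dirs:
        candidate = os.path.join(directory, "lib%s.so" % name)
        if os.path.exists(candidate):
            return candidate
    return None


if __name__ == "__main__":
    print(locate_library("lzma"))
```

The hard-coded fallback list is exactly the kind of per-platform knowledge that a ctypes-based wrapper has to carry around and keep up to date.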

(This is part of a larger project, a backup program.  The backup program has
been tested on a large number of OS's, but I've not done another broad round
of testing yet since adding the ctypes+xz code)


Re: [Python-Dev] Add from __experimental__ import bla [was: Should we move to replace re with regex?]

2011-08-27 Thread Victor Stinner
On Saturday 27 August 2011 at 21:57:26, Dj Gilcrease wrote:
> The idea of a __experimental__ area is good for any PEPs or
> stdlib additions that are somewhat controversial (API isn't agreed on,
> code may take a while to integrate properly, developer wants some time
> to hash out any edge case bugs or API clarifications that may come up
> in large scale testing, etc).

__experimental__ already exists: it's the Python Package Index (PyPI)!

http://pypi.python.org/pypi

You can write Python extensions in C and distribute them on the PyPI. I did 
that when my patch to display the Python backtrace on a crash was "rejected" 
(not included in Python 3.2, just before the release). It was a great idea, 
because I had more time to change the API (read the history of the 
faulthandler module on PyPI: the API changed 5 times since the first public 
version on PyPI...) and the module is now available for Python 2.5 - 3.2, not 
only for Python 3.3.

Remember that the API of a module added to CPython is frozen. You will have to 
wait something like 18 months until the next CPython release to change 
anything (add a new function, remove an old/useless function, etc.). 
Seriously, it's not a good idea to add a young module into Python before its 
API is well defined and stable.

The Linux kernel has "staging" drivers. It's different because there is a new 
release of the Linux kernel every two months (instead of 18 months for 
CPython). The policy for the API is also different: the kernel has no stable 
API, whereas the Python API cannot be changed in a minor release (x.y.Z).

http://www.kroah.com/log/linux/stable_api_nonsense.html
http://www.mjmwired.net/kernel/Documentation/stable_api_nonsense.txt

Victor



Re: [Python-Dev] Add from __experimental__ import bla [was: Should we move to replace re with regex?]

2011-08-27 Thread exarkun

On 07:57 pm, digitalx...@gmail.com wrote:
> In the thread about replacing re with regex someone mentioned adding
> to __future__, which isn't a great idea as future APIs are already
> solidified; they just live there to give developers time to adapt their
> code. The idea of a __experimental__ area is good for any PEPs or
> stdlib additions that are somewhat controversial (API isn't agreed on,
> code may take a while to integrate properly, developer wants some time
> to hash out any edge case bugs or API clarifications that may come up
> in large scale testing, etc).
>
> __experimental__ should emit a warning on import that says anything in
> here may change or be removed at any time and should not be used in
> stable code.
>
> __experimental__ features should behave the same as __future__ in that
> they can add new keywords or semantics to the existing language.
>
> __experimental__ features can move directly to the stdlib or builtins
> if they do not add new keywords and/or are backwards compatible with
> the feature they are replacing. Otherwise they move into __future__
> for however many releases are deemed a reasonable time for developers
> to adapt their code.


Hi Dj,

As a developer of Python libraries and applications, I don't see how 
this would make my life easier.


A warning in a module docstring that a module may not be long-lived if 
it is not well received tells me just as much as a warning emitted at 
runtime.  And a warning emitted at runtime is likely to scare my users 
into thinking something is broken, leading to spurious or misleading bug 
reports.  There also does not appear to be general consensus that 
modules should be added to stdlib if they are not widely used and 
demanded, so I don't know when a module would be added to 
__experimental__, anyway.  The normal deprecation procedures (rarely 
used as they are) seem to cover this, anyway.
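
To illustrate the runtime-warning concern, here is a minimal sketch. The `ExperimentalWarning` category and the `warn_experimental` helper are hypothetical (neither exists in the stdlib); only the `warnings` machinery itself is real:

```python
import warnings


class ExperimentalWarning(FutureWarning):
    """Hypothetical warning category for provisional modules."""


def warn_experimental(module_name):
    # What an __experimental__ module might run at import time.
    warnings.warn(
        f"{module_name} is experimental and may change or be removed",
        ExperimentalWarning,
        stacklevel=2,
    )


# A library author who knows about the category can silence it...
with warnings.catch_warnings(record=True) as silenced:
    warnings.simplefilter("ignore", ExperimentalWarning)
    warn_experimental("regex")

# ...but otherwise the warning is reported; FutureWarning subclasses are
# shown to end users by default.
with warnings.catch_warnings(record=True) as reported:
    warnings.simplefilter("always")
    warn_experimental("regex")
```

Every application embedding such a library would need the first form to keep the message away from its users, which is the extra burden described above.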


Adding a new namespace separate from __future__ also just gives me 
another thing to remember.  Was the feature added to __experimental__ or 
__future__?  Also, it seems even less common that language features are 
added on an experimental basis.  When a language feature (new syntax or 
semantics) goes in to the language, it is there for a long, long time.


If new features are added first to __experimental__ and then to 
__future__ or the non-__experimental__ stdlib namespace, then I just 
have to update all my code to keep using it.  So I'm guaranteed extra 
work whether the feature is successful and is adopted or if it fails and 
is later removed.  I'd rather not have to do the extra work in the 
success case, at least, which is what the existing 
add-it-and-then-maybe-(but-probably-not-)deprecate-it approach gives me.


Jean-Paul



Re: [Python-Dev] LZMA compression support in 3.3

2011-08-27 Thread Nadeem Vawda
On Sat, Aug 27, 2011 at 10:41 PM, Dan Stromberg  wrote:
> It seems like there should be some way of coming up with an xml file
> describing the types of the various bits of data and formal arguments -
> perhaps using gccxml or something like it.

The problem is that you would need to do this check at runtime, every time
you load up the library - otherwise, what happens if the user upgrades
their installed copy of liblzma? And we can't expect users to have the
liblzma headers installed, so we'd have to try and figure out whether the
library was ABI-compatible from the shared object alone; I doubt that this
is even possible.
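
A sketch of the kind of load-time check under discussion, assuming liblzma's `lzma_version_number()` function and the packed layout from its `version.h` (major * 10000000 + minor * 10000 + patch * 10 + stability); the helper names are hypothetical. Note that a matching version number is only a hint and does not demonstrate ABI compatibility:

```python
import ctypes
import ctypes.util


def decode_lzma_version(number):
    """Split liblzma's packed version number into its four fields."""
    major = number // 10_000_000
    minor = (number // 10_000) % 1000
    patch = (number // 10) % 1000
    stability = number % 10  # 0 = alpha, 1 = beta, 2 = stable
    return major, minor, patch, stability


def installed_lzma_version():
    """Return the decoded version of the installed liblzma, or None."""
    path = ctypes.util.find_library("lzma")
    if path is None:
        return None
    try:
        lib = ctypes.CDLL(path)
    except OSError:
        return None
    # Prototype restated by hand: uint32_t lzma_version_number(void);
    lib.lzma_version_number.argtypes = []
    lib.lzma_version_number.restype = ctypes.c_uint32
    return decode_lzma_version(lib.lzma_version_number())


version = installed_lzma_version()
```

The check would have to run on every import, since the shared object can be replaced between two runs of the interpreter.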


Re: [Python-Dev] LZMA compression support in 3.3

2011-08-27 Thread Dan Stromberg
On Sat, Aug 27, 2011 at 2:38 PM, Nadeem Vawda wrote:

> On Sat, Aug 27, 2011 at 10:41 PM, Dan Stromberg 
> wrote:
> > It seems like there should be some way of coming up with an xml file
> > describing the types of the various bits of data and formal arguments -
> > perhaps using gccxml or something like it.
>
> The problem is that you would need to do this check at runtime, every time
> you load up the library - otherwise, what happens if the user upgrades
> their installed copy of liblzma? And we can't expect users to have the
> liblzma headers installed, so we'd have to try and figure out whether the
> library was ABI-compatible from the shared object alone; I doubt that this
> is even possible.
>

I was thinking about this as I was getting groceries a bit ago.

Why -can't- we expect the user to have liblzma headers installed?  Couldn't
it just be a dependency in the package management system?

BTW, gcc-xml seems to be only for C++ (?), but long ago, around the time
people were switching from K&R to ANSI C, there were programs like
"mkptypes" that could parse a .c/.h and output prototypes.  It seems we
could do something like this on module init.

IMO, we really, really need some common way of accessing C libraries that
works for all major Python variants.


Re: [Python-Dev] LZMA compression support in 3.3

2011-08-27 Thread Antoine Pitrou
On Sat, 27 Aug 2011 15:14:15 -0700
Dan Stromberg  wrote:

> On Sat, Aug 27, 2011 at 2:38 PM, Nadeem Vawda wrote:
> 
> > On Sat, Aug 27, 2011 at 10:41 PM, Dan Stromberg 
> > wrote:
> > > It seems like there should be some way of coming up with an xml file
> > > describing the types of the various bits of data and formal arguments -
> > > perhaps using gccxml or something like it.
> >
> > The problem is that you would need to do this check at runtime, every time
> > you load up the library - otherwise, what happens if the user upgrades
> > their installed copy of liblzma? And we can't expect users to have the
> > liblzma headers installed, so we'd have to try and figure out whether the
> > library was ABI-compatible from the shared object alone; I doubt that this
> > is even possible.
> >
> 
> I was thinking about this as I was getting groceries a bit ago.
> 
> Why -can't- we expect the user to have liblzma headers installed?  Couldn't
> it just be a dependency in the package management system?

Package managers, under Linux, often split development files (headers,
etc.) from runtime binaries.
Also, under Windows, most users don't have development stuff installed
at all.

Regards

Antoine.


Re: [Python-Dev] LZMA compression support in 3.3

2011-08-27 Thread Martin v. Löwis
> Why -can't- we expect the user to have liblzma headers installed? 
> Couldn't it just be a dependency in the package management system?

Please give it up. You just won't convince this list that ctypes
is a viable approach for the standard library.

Regards,
Martin


Re: [Python-Dev] LZMA compression support in 3.3

2011-08-27 Thread Dan Stromberg
On Sat, Aug 27, 2011 at 3:26 PM, Antoine Pitrou  wrote:

> On Sat, 27 Aug 2011 15:14:15 -0700
> Dan Stromberg  wrote:
>
> > On Sat, Aug 27, 2011 at 2:38 PM, Nadeem Vawda wrote:
> >
> > > On Sat, Aug 27, 2011 at 10:41 PM, Dan Stromberg
> > > wrote:
> > > > It seems like there should be some way of coming up with an xml file
> > > > describing the types of the various bits of data and formal arguments -
> > > > perhaps using gccxml or something like it.
> > >
> > > The problem is that you would need to do this check at runtime, every time
> > > you load up the library - otherwise, what happens if the user upgrades
> > > their installed copy of liblzma? And we can't expect users to have the
> > > liblzma headers installed, so we'd have to try and figure out whether the
> > > library was ABI-compatible from the shared object alone; I doubt that this
> > > is even possible.
> > >
> >
> > I was thinking about this as I was getting groceries a bit ago.
> >
> > Why -can't- we expect the user to have liblzma headers installed?  Couldn't
> > it just be a dependency in the package management system?
>
> Package managers, under Linux, often split development files (headers,
> etc.) from runtime binaries.
>

Well, uh, yeah.  Not sure what your point is.
1) We could easily work with the dev / nondev distinction by taking a
dependency on the -dev version of whatever we need, instead of the nondev
version.
2) It's a rather arbitrary distinction that's being drawn between dev and
nondev today.  There's no particular reason why the line couldn't be drawn
somewhere else.


> Also, under Windows, most users don't have development stuff installed
> at all.
>
Yes...  But if the nature of "what development stuff is" were to change,
they'd have different stuff.

Also, we wouldn't have to parse the .h's every time a module is loaded - we
could have a timestamp file (or database) indicating when we last parsed a
given .h.
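
A minimal sketch of the timestamp idea, assuming a JSON file as the "database"; the function name `header_changed` and the cache layout are hypothetical. It only addresses the cost of re-parsing and still requires the header to be present:

```python
import json
import os


def header_changed(header_path, cache_path):
    """Return True if header_path changed since the last recorded parse."""
    current = os.stat(header_path).st_mtime_ns
    try:
        with open(cache_path) as cache_file:
            cache = json.load(cache_file)
    except (OSError, ValueError):
        cache = {}  # missing or corrupt cache: treat everything as changed
    if cache.get(header_path) == current:
        return False
    # Record the new timestamp so the next call can skip the parse.
    cache[header_path] = current
    with open(cache_path, "w") as cache_file:
        json.dump(cache, cache_file)
    return True
```

A caller would re-run the header parser only when `header_changed()` returns True.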

Also, we could query the package management system for the version of lzma
that's currently installed on module init.

Also, we could include our own version of lzma.  Granted, this was a mess
when zlib needed to be patched, but even this one might be worth it for the
improved library unification across Python implementations.


Re: [Python-Dev] LZMA compression support in 3.3

2011-08-27 Thread Antoine Pitrou
On Sat, 27 Aug 2011 16:19:01 -0700
Dan Stromberg  wrote:
> 2) It's a rather arbitrary distinction that's being drawn between dev and
> nondev today.  There's no particular reason why the line couldn't be drawn
> somewhere else.

Sure. Now please convince Linux distributions first, because this
particular subthread is going nowhere.

Regards

Antoine.


Re: [Python-Dev] Should we move to replace re with regex?

2011-08-27 Thread Greg Ewing

Nick Coghlan wrote:
> The next step needed is for someone to volunteer to write and champion
> a PEP that:


Would it be feasible and desirable to modify regex so
that it *is* backwards-compatible with re, with a view
to making it a drop-in replacement at some point?

If not, the PEP should discuss this also.

--
Greg


Re: [Python-Dev] Should we move to replace re with regex?

2011-08-27 Thread Terry Reedy

On 8/27/2011 7:39 PM, Greg Ewing wrote:
> Nick Coghlan wrote:
>> The next step needed is for someone to volunteer to write and champion
>> a PEP that:
>
> Would it be feasible and desirable to modify regex so
> that it *is* backwards-compatible with re, with a view
> to making it a drop-in replacement at some point?
>
> If not, the PEP should discuss this also.


Many of the things regex does differently might be called either bug 
fixes or feature changes, depending on one's viewpoint. Regex should 
definitely not be 'bug-compatible'.


I think regex should be unicode-standard compliant as much as possible, 
and let the chips fall where they may. If so, it would be like the 
decimal module, which closely tracks the IEEE decimal standard, rather 
than the binary float standard. Regex is already much more compliant 
than re, as shown by Tom Christiansen. This is pretty obviously 
intentional on MB's part. It is also probably intentional that re *not* 
match today's Unicode TR18 specifications.


These are reasons why both Ezio and I suggested on the tracker adding 
regex without deleting re. (I personally would not mind just replacing 
re with regex, but then I have no legacy re code to break. So I am not 
suggesting that out of respect for those who do.)


--
Terry Jan Reedy



Re: [Python-Dev] LZMA compression support in 3.3

2011-08-27 Thread Dan Stromberg
On Sat, Aug 27, 2011 at 4:27 PM, Antoine Pitrou  wrote:

>
> Sure. Now please convince Linux distributions first, because this
> particular subthread is going nowhere.
>

I hope you're not a solipsist.

Anyway, if the mere -discussion- of embracing a standard and safe way of
making C libraries callable from all the major Python implementations is
"going nowhere" before the discussion has even gotten started, I fear for
Python's future.

Repeat aloud to yourself: Python != CPython.  Python != CPython.  Python !=
CPython.

Has this topic been discussed to death?  If so, then say so.  It's rude to
try to kill the thread summarily before it gets started, sans discussion,
sans explanation, sans commentary on whether new additions to the topic have
surfaced or not.


Re: [Python-Dev] LZMA compression support in 3.3

2011-08-27 Thread Dan Stromberg
On Sat, Aug 27, 2011 at 4:27 PM, Antoine Pitrou  wrote:

> On Sat, 27 Aug 2011 16:19:01 -0700
> Dan Stromberg  wrote:
> > 2) It's a rather arbitrary distinction that's being drawn between dev and
> > nondev today.  There's no particular reason why the line couldn't be drawn
> > somewhere else.
>
> Sure. Now please convince Linux distributions first, because this
> particular subthread is going nowhere.
>

Interesting.  You seem to want to throw an arbitrary barrier between Python,
the language, and accomplishing something important for said language.

Care to tell me why I'm wrong?  I'm all ears.

I'll note that you've deleted:

> 1) We could easily work with the dev / nondev distinction by
> taking a dependency on the -dev version of whatever we need,
> instead of the nondev version.

...which makes it more than apparent that we needn't convince Linux
distributors of #2, which you seem to prefer to focus on.

Why was it in your best interest to delete #1, without even commenting on
it?


Re: [Python-Dev] Should we move to replace re with regex?

2011-08-27 Thread Ezio Melotti
On Sun, Aug 28, 2011 at 3:48 AM, Terry Reedy  wrote:

>
> These are reasons why both Ezio and I suggested on the tracker adding regex
> without deleting re. (I personally would not mind just replacing re with
> regex, but then I have no legacy re code to break. So I am not suggesting
> that out of respect for those who do.)
>

I would actually prefer to replace re.

Before doing that we should make a list of all the differences between the
two modules (possibly in the PEP).  On the regex page on PyPI there's
already a list that can be used for this purpose [0].
For bug fixes it *shouldn't* be a problem if the behavior changes.  New
features shouldn't bring any backward-incompatible behavioral changes, and,
as far as I understand, Matthew introduced the NEW flag [1], to avoid
problems when they do.

I think re should be kept around only if there are too many
incompatibilities left and if they can't be fixed in regex.

Best Regards,
Ezio Melotti


[0]: http://pypi.python.org/pypi/regex/0.1.20110717
[1]: "The NEW flag turns on the new behaviour of this module, which can
differ from that of the 're' module, such as splitting on zero-width
matches, inline flags affecting only what follows, and being able to turn
inline flags off."


Re: [Python-Dev] Should we move to replace re with regex?

2011-08-27 Thread Guido van Rossum
On Sat, Aug 27, 2011 at 5:48 PM, Terry Reedy  wrote:
> Many of the things regex does differently might be called either bug fixes
> or feature changes, depending on one's viewpoint. Regex should definitely
> not be 'bug-compatible'.

Well, as you said, it depends on one's viewpoint. If there's a bug in
the treatment of non-BMP character ranges, that's a bug, and fixing it
shouldn't break anybody's code (unless it was worth breaking :-). But
if there's a change that e.g. (hypothetical example) makes a different
choice about how empty matches are treated in some edge case, and the
old behavior was properly documented, that's a feature change, and I'd
rather introduce a flag to select the new behavior (or, if we have to,
a flag to preserve the old behavior, if the new behavior is really
considered much better and much more useful).
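
The flag idea can be sketched with the zero-width-split difference that the regex page mentions. This is an illustration only: `split_with_flag` and its `new_behaviour` parameter are hypothetical and are not the API of either re or regex.

```python
import re


def split_with_flag(pattern, text, new_behaviour=False):
    """Split text on pattern; zero-width matches split only when opted in."""
    pieces = []
    last = 0
    for match in re.finditer(pattern, text):
        if match.start() == match.end() and not new_behaviour:
            continue  # documented old behaviour: ignore empty matches
        pieces.append(text[last:match.start()])
        last = match.end()
    pieces.append(text[last:])
    return pieces


old_style = split_with_flag(r"\b", "a b")                      # ['a b']
new_style = split_with_flag(r"\b", "a b", new_behaviour=True)  # ['', 'a', ' ', 'b', '']
```

Existing callers keep the documented result, and only code that passes the flag sees the change.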

> I think regex should be unicode-standard compliant as much as possible, and
> let the chips fall where they may.

In most cases the Unicode improvements in regex are not where it is
incompatible; e.g. adding \X and named ranges are fine new additions
and IIUC the syntax was carefully designed not to introduce any
incompatibilities (within the limitations of \-escapes).

It's the many other "improvements" to the regex module that sometimes
make it incompatible. There's a comprehensive list here:
http://pypi.python.org/pypi/regex . Somebody should just go over it
and for each difference make a recommendation for whether to treat
this as a bugfix, a compatible new feature, or an incompatibility that
requires some kind of flag. (We could have a single flag for all
incompatibilities, or several flags.)

> If so, it would be like the decimal
> module, which closely tracks the IEEE decimal standard, rather than the
> binary float standard.

Well, I would hope that for each "major" Python version (i.e. 3.2,
3.3, 3.4, ...) we would pick a specific version of the Unicode
standard and declare our desire to be compliant with that Unicode
standard version, and not switch allegiances in some bugfix version
(e.g. 3.2.3, 3.3.1, ...).
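
For reference, the Unicode version that a given interpreter was built against can be read from the stdlib's `unicodedata.unidata_version`. A short sketch (the variable names are just for illustration):

```python
import sys
import unicodedata

# Feature release of the running interpreter, e.g. "3.3".
python_release = "{}.{}".format(*sys.version_info[:2])
# Version of the Unicode Character Database compiled into this build.
unicode_release = unicodedata.unidata_version
print(f"Python {python_release} ships Unicode {unicode_release} character data")
```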

> Regex is already much more compliant than re, as shown by Tom Christiansen.

Nobody disagrees with this or thinks it's a bad thing. :-)

> This is pretty obviously intentional on MB's part.

That's also clear.

> It is also probably intentional that re *not* match today's Unicode
> TR18 specifications.

That I'm not so sure of. I think it's more the case that TR18 evolved
and that the re module didn't -- probably mostly because nobody had
the time and nobody was aware of the TR18 changes.

> These are reasons why both Ezio and I suggested on the tracker adding regex
> without deleting re. (I personally would not mind just replacing re with
> regex, but then I have no legacy re code to break. So I am not suggesting
> that out of respect for those who do.)

That option is definitely still on the table. At the very least a
thorough review of the stated differences between re and regex should
be done -- I trust that MR has been very thorough in his listing of
those differences. The issues regarding maintenance and stability of
MR's code can be solved in a number of ways -- if MR doesn't mind I
would certainly be willing to give him core committer access (though
I'd still recommend that he use his time primarily to train others in
maintaining this important code base).

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] LZMA compression support in 3.3

2011-08-27 Thread Guido van Rossum
On Sat, Aug 27, 2011 at 3:14 PM, Dan Stromberg  wrote:
> IMO, we really, really need some common way of accessing C libraries that
> works for all major Python variants.

We have one. It's called writing an extension module.

ctypes is a crutch because it doesn't realistically have access to the
header files. It's a fine crutch for PyPy, which doesn't have much of
an alternative. It's also a fine crutch for people who need something
to run *now*. It's a horrible strategy for the standard library.
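
The header-file point shows up even in a trivial ctypes call, where the prototype has to be restated by hand. A minimal sketch, assuming a C library that `ctypes.util.find_library("c")` can locate (which is not the case on every platform):

```python
import ctypes
import ctypes.util

length = None
libc_path = ctypes.util.find_library("c")
if libc_path:
    try:
        libc = ctypes.CDLL(libc_path)
    except OSError:
        libc = None
    if libc is not None:
        # Restated by hand from <string.h>: size_t strlen(const char *s);
        # Nothing checks these two lines against the real header.
        libc.strlen.argtypes = [ctypes.c_char_p]
        libc.strlen.restype = ctypes.c_size_t
        length = libc.strlen(b"liblzma")
```

An extension module gets the same declaration from the compiler, which is why a mismatch there is caught at build time rather than at run time.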

If you have a better proposal please do write it up. But so far you
are mostly exposing your ignorance and insisting dramatically that you
be educated.

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] Should we move to replace re with regex?

2011-08-27 Thread Ezio Melotti
On Sat, Aug 27, 2011 at 4:56 AM, Antoine Pitrou  wrote:

> On Sat, 27 Aug 2011 04:37:21 +0300
> Ezio Melotti  wrote:
> >
> > I'm not sure it's worth doing an extensive review of the code, a better
> > approach might be to require extensive test coverage  (and a review of
> > tests).  If the code seems well written, commented, documented (I think
> > proper rst documentation is still missing),
>
> Isn't this precisely what a review is supposed to assess?
>

This can be done without actually knowing and understanding every single
function in the module (I got the impression that someone wants this kind of
review, correct me if I'm wrong).


>
> > We will get familiar with the code once we start contributing
> > to it and fixing bugs, as it already happens with most of the other
> > modules.
>
> I'm not sure it's a good idea for a module with more than 1 lines
> of C code (and 4000 lines of pure Python code). This is several times
> the size of multiprocessing. The C code looks very cleanly written, but
> it's still a big chunk of algorithmically sophisticated code.
>

Even unicodeobject.c is 10k+ lines of C code and I got familiar with (parts
of) it just by fixing bugs in specific functions.
I took a look at the regex code and it seems clear, with enough comments and
several small functions that are easy to follow and understand.
multiprocessing requires good knowledge of a number of concepts and
platform-specific issues that make it more difficult to understand and
maintain (but maybe regex-related concepts seem easier to me because I'm
already familiar with them).

I think it would be good to:
  1) have some document that explains the general design and main (internal)
functions of the module (e.g. a PEP);
  2) make a review on rietveld (possibly only of the diff with re, to limit
the review to the new code only), so that people can ask questions, discuss
and understand the code;
  3) possibly update the document/PEP with the outcome of the rietveld
review(s) and/or address the issues discussed (if any);
  4) add documentation for the module and the (public) functions in
Doc/library (this should be done anyway).

This will ensure that the general quality of the code is good, and when
someone actually has to work on the code, there's enough documentation to
make it possible.

Best Regards,
Ezio Melotti


>
> Another "interesting" question is whether it's easy to port to the PEP
> 393 string representation, if it gets accepted.
>
> Regards
>
> Antoine.
>
>


Re: [Python-Dev] Should we move to replace re with regex?

2011-08-27 Thread Guido van Rossum
On Sat, Aug 27, 2011 at 8:59 PM, Ezio Melotti  wrote:
> On Sat, Aug 27, 2011 at 4:56 AM, Antoine Pitrou  wrote:
>>
>> On Sat, 27 Aug 2011 04:37:21 +0300
>> Ezio Melotti  wrote:
>> >
>> > I'm not sure it's worth doing an extensive review of the code, a better
>> > approach might be to require extensive test coverage  (and a review of
>> > tests).  If the code seems well written, commented, documented (I think
>> > proper rst documentation is still missing),
>>
>> Isn't this precisely what a review is supposed to assess?
>
> This can be done without actually knowing and understanding every single
> function in the module (I got the impression that someone wants this kind of
> review, correct me if I'm wrong).

Wasn't me. I've long given up expecting to understand every line of
code in CPython. I'm happy if the code is written in a way that makes
it possible to read and understand it as the need arises.

>> > We will get familiar with the code once we start contributing
>> > to it and fixing bugs, as it already happens with most of the other
>> > modules.
>>
>> I'm not sure it's a good idea for a module with more than 1 lines
>> of C code (and 4000 lines of pure Python code). This is several times
>> the size of multiprocessing. The C code looks very cleanly written, but
>> it's still a big chunk of algorithmically sophisticated code.
>
> Even unicodeobject.c is 10k+ lines of C code and I got familiar with (parts
> of) it just by fixing bugs in specific functions.
> I took a look at the regex code and it seems clear, with enough comments and
> several small functions that are easy to follow and understand.
> multiprocessing requires good knowledge of a number of concepts and
> platform-specific issues that make it more difficult to understand and
> maintain (but maybe regex-related concepts seem easier to me because I'm
> already familiar with them).

Are you volunteering? (Even if you don't want to be the only
maintainer, it still sounds like you'd be a good co-maintainer of the
regex module.)

> I think it would be good to:
>   1) have some document that explains the general design and main (internal)
> functions of the module (e.g. a PEP);

I don't think that such a document needs to be a PEP; PEPs are usually
intended where there is significant discussion expected, not just to
explain things. A README file or a Wiki page would be fine, as long as
it's sufficiently comprehensive.

>   2) make a review on rietveld (possibly only of the diff with re, to limit
> the review to the new code only), so that people can ask questions, discuss
> and understand the code;

That would be an interesting exercise indeed.

>   3) possibly update the document/PEP with the outcome of the rietveld
> review(s) and/or address the issues discussed (if any);

Yeah, of course.

>   4) add documentation for the module and the (public) functions in
> Doc/library (this should be done anyway).

Does regex have a significant public C interface? (_sre.c doesn't.)
Does it have a Python-level interface beyond what re.py offers (apart
from the obvious new flags and new regex syntax/semantics)?

> This will ensure that the general quality of the code is good, and when
> someone actually has to work on the code, there's enough documentation to
> make it possible.

That sounds like a good description of a process that could lead to
acceptance of regex as a re replacement.

>> Another "interesting" question is whether it's easy to port to the PEP
>> 393 string representation, if it gets accepted.

It's very likely that PEP 393 is accepted. So likely, in fact, that I
would recommend that you start porting regex to PEP 393 now. The
experience would benefit both your understanding of the regex module
and the quality of the PEP and its implementation.
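
For anyone starting on such a port: as I read PEP 393, the storage width of a string is chosen from its highest code point, which is where the three representations come from. A rough Python model of that choice (the function name is hypothetical; the real selection happens in C):

```python
def pep393_char_width(text):
    """Bytes per character a PEP 393 string would use for `text`."""
    highest = max(map(ord, text), default=0)
    if highest < 0x100:
        return 1  # Latin-1 range (includes pure ASCII)
    if highest < 0x10000:
        return 2  # rest of the Basic Multilingual Plane
    return 4      # code points outside the BMP


widths = [pep393_char_width(s) for s in ("abc", "\u20ac", "\U0001F600")]
```

A matcher ported to PEP 393 would need a code path for each of the three widths.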

I like what I hear here!

-- 
--Guido van Rossum (python.org/~guido)


[Python-Dev] Ctypes and the stdlib (was Re: LZMA compression support in 3.3)

2011-08-27 Thread Terry Reedy
Dan, I once had more or less the same opinion/question as you with 
regard to ctypes, but I now see at least 3 problems.


1) It seems hard to write it correctly. There are currently 47 open 
ctypes issues, with 9 being feature requests, leaving 38 
behavior-related issues. Tom Heller has not been able to work on it 
since the beginning of 2010 and has formally withdrawn as maintainer. No 
one else that I know of has taken his place.


2) It is not trivial to use it correctly. I think it needs a SWIG-like 
companion script that can write at least first-pass ctypes code from the 
.h header files. Or maybe it could/should use header info at runtime 
(with the .h bundled with a module).


3) It seems to be slower than compiled C extension wrappers. That, at 
least, was the discovery of someone who re-wrote pygame using ctypes. 
(The hope was that using ctypes would aid porting to 3.x, but the time 
penalty was apparently too much for time-critical code.)
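
The speed point in 3) can be checked locally with a rough timing sketch. It compares the builtin `abs` (implemented in C inside CPython) with the C library's `abs` called through ctypes, assuming `ctypes.util.find_library("c")` can locate a C library; the helper name is hypothetical and the numbers will vary by machine:

```python
import ctypes
import ctypes.util
import timeit


def time_calls(func, number=100_000):
    """Seconds taken to call func(-5) `number` times."""
    return timeit.timeit(lambda: func(-5), number=number)


builtin_seconds = time_calls(abs)

ctypes_seconds = None
libc_path = ctypes.util.find_library("c")
if libc_path:
    try:
        libc = ctypes.CDLL(libc_path)
    except OSError:
        libc = None
    if libc is not None:
        # Prototype restated by hand: int abs(int);
        libc.abs.argtypes = [ctypes.c_int]
        libc.abs.restype = ctypes.c_int
        ctypes_seconds = time_calls(libc.abs)

print("builtin:", builtin_seconds, "ctypes:", ctypes_seconds)
```

This measures per-call overhead only, so it says little about wrappers that spend most of their time inside the library.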


If you want to see more use of ctypes in the Python community (though 
not necessarily immediately in the stdlib), feel free to work on any one 
of these problems.


A fourth problem is that people capable of working on ctypes are also 
capable of writing C extensions, and most prefer that. Or some work on 
Cython, which is a third solution.


--
Terry Jan Reedy



Re: [Python-Dev] Should we move to replace re with regex?

2011-08-27 Thread Terry Reedy

On 8/27/2011 11:54 PM, Guido van Rossum wrote:


>> If so, it would be like the decimal
>> module, which closely tracks the IEEE decimal standard, rather than the
>> binary float standard.
>
> Well, I would hope that for each "major" Python version (i.e. 3.2,
> 3.3, 3.4, ...) we would pick a specific version of the Unicode
> standard and declare our desire to be compliant with that Unicode
> standard version, and not switch allegiances in some bugfix version
> (e.g. 3.2.3, 3.3.1, ...).


Definitely. The unicode version would have to be frozen with beta 1 if 
not before. (I am quite sure the decimal module also freezes the IEEE 
standard version *it* follows for each Python version.)
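
For anyone who wants to check which Unicode version a given interpreter was built against, a small sketch follows. It relies only on unicodedata.unidata_version from the standard library; the variable names are just for illustration.

```python
import sys
import unicodedata

# Version of the Unicode Character Database compiled into this interpreter.
ucd_version = unicodedata.unidata_version

# The x.y language version this interpreter implements.
python_series = "%d.%d" % sys.version_info[:2]

print("Python", python_series, "uses Unicode", ucd_version)
```

Running this under two x.y.z releases of the same x.y series is one way to see whether the database version stayed fixed between them.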


In my view, x.y is a version of the Python language while the x.y.z 
CPython releases are progressively better implementations of that one 
language, starting with x.y.0. This is the main reason I suggested that 
the first CPython release for the 3.3 language be called 3.3.0, as it 
now is. In this view, there is no question of an x.y.z+1 release 
changing the definition of the x.y language.


--
Terry Jan Reedy



Re: [Python-Dev] LZMA compression support in 3.3

2011-08-27 Thread Dan Stromberg
On Sat, Aug 27, 2011 at 8:57 PM, Guido van Rossum wrote:

> On Sat, Aug 27, 2011 at 3:14 PM, Dan Stromberg wrote:
> > IMO, we really, really need some common way of accessing C libraries that
> > works for all major Python variants.
>
> We have one. It's called writing an extension module.
>

And yet Cext's are full of CPython-isms.

I've said in the past that Python has been lucky in that it had only a
single implementation for a long time, but still managed to escape becoming
too defined by the idiosyncrasies of that implementation - that's quite
impressive, and is probably our best indication that Python has had
leadership with foresight.  In the language proper, I'd say I still believe
this, but Cext's are sadly not a good example.


> ctypes is a crutch because it doesn't realistically have access to the
> header files.


Well, actually, header files are pretty easy to come by.  I bet you've
installed them yourself many times.  In fact, you've probably even
automatically brought some of them in via a package management system of one
form or another without getting your hands dirty.

As a thought experiment, imagine having a ctypes configuration system that
looks around a computer for .h's and .so's (etc) with even 25% of the effort
expended by GNU autoconf.  Instead of building the results into a bunch of
.o's, the results are saved in a .ct file or something.  If you build in
some reasonable default locations to look in, plus the equivalent of some
-I's and -L's (and maybe -rpath's) as needed, you probably end up with a
pretty comparable system.
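
A very rough sketch of the probing half of that thought experiment, under heavy assumptions: it only locates shared libraries with ctypes.util.find_library and does not look at headers at all. The function names, the library list, and the idea of storing the result as JSON in a ".ct" file are all invented for illustration.

```python
import ctypes.util
import json


def probe_libraries(names):
    """Map each short library name to what find_library reports, or None."""
    return {name: ctypes.util.find_library(name) for name in names}


def save_probe(results, path):
    """Persist the probe results; this ".ct" file format is hypothetical."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(results, handle, indent=2, sort_keys=True)


# Hypothetical list of libraries a package might want to wrap.
results = probe_libraries(["c", "m", "z", "lzma"])
print(json.dumps(results, indent=2, sort_keys=True))
```

A real tool would also need the equivalent of -I and -L search paths and some way to read prototypes out of the .h files, which this sketch does not attempt.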

(typedef's might be a harder problem - that's particularly worth discussing,
IMO - your chance to nip this in the bud with a reasoned explanation why
they can't be handled well!)

> It's a fine crutch for PyPy, which doesn't have much of
> an alternative.


Wait - a second ago I thought I was to believe that C extension modules were
the one true way of interfacing with C code across all major
implementations?  Are we perhaps saying that CPython is "the" major
implementation, and that we want it to stay that way?

I personally feel that PyPy has arrived as a major implementation.  The
backup program I've been writing in my spare time runs great on PyPy (and
the CPythons from 2.5.x, and pretty well on Jython).  And PyPy has been
maturing very rapidly ('just wish they'd do 3.x!).

> It's also a fine crutch for people who need something
> to run *now*. It's a horrible strategy for the standard library.
>

I guess I'm coming to see this as dogma.

If ctypes is augmented with type information and/or version information and
where to find things, wouldn't it become safe and convenient?  Or do you
have other concerns?

Make a list of things that can go wrong with ctypes modules.  Now make a
list of things that can go wrong with C extension modules.  Aren't they really
pretty similar - missing .so, .so in a weird place, and especially: .so with
a changed interface?  C really isn't a very safe language - not like
http://en.wikipedia.org/wiki/Turing_%28programming_language%29 or
something.  Perhaps it's a little easier to mess things up with ctypes today
(a recompile doesn't fix, or at least detect, as many problems), but isn't
it at least worth thinking about how that situation could be improved?
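
As a sketch of which of those failures ctypes can at least report today, the helper below (a hypothetical name, not part of ctypes) turns a missing library and a missing symbol into ordinary return values. It assumes CPython, since it uses ctypes.pythonapi as a library that is certain to be loaded.

```python
import ctypes


def load_function(library, symbol):
    """Return (function, None) on success or (None, reason) on failure.

    `library` is a name or path for ctypes.CDLL, or an already loaded CDLL.
    """
    if isinstance(library, str):
        try:
            library = ctypes.CDLL(library)
        except OSError as exc:
            # Missing .so, or a .so that is not on the loader's search path.
            return None, "cannot load library: %s" % exc
    try:
        return getattr(library, symbol), None
    except AttributeError:
        # The library loaded, but this symbol is not exported by it.
        return None, "symbol %r not found" % symbol


func, reason = load_function("libdoes_not_exist_example.so", "anything")
print(func, reason)
func, reason = load_function(ctypes.pythonapi, "no_such_symbol_example")
print(func, reason)
```

What this cannot catch is the third case: a symbol that still exists but whose argument or return types changed. Nothing at load time compares the hand-written declarations with the library, which is the gap a recompile against headers closes for C extensions.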

> If you have a better proposal please do write it up. But so far you
> are mostly exposing your ignorance and insisting dramatically that you
> be educated.
>

I'm not sure why you're trying to avoid having a discussion.  I think it's
premature to dive into a proposal before getting other people's thoughts.
Frankly, 100 people tend to think better than one - at least, if the 100
people feel like they can talk.

I'm -not- convinced ctypes are the way forward.  I just want to talk about
it - for now.  ctypes have some significant advantages - if we can find a
way to eliminate and/or ameliorate their disadvantages, they might be quite
a bit nicer than Cext's.


Re: [Python-Dev] LZMA compression support in 3.3

2011-08-27 Thread Martin v. Löwis
> I just want to talk about it - for now.

python-ideas is a better place to just talk than python-dev.

Regards,
Martin


Re: [Python-Dev] LZMA compression support in 3.3

2011-08-27 Thread Stefan Behnel

Dan Stromberg, 27.08.2011 21:58:
> On Sat, Aug 27, 2011 at 9:04 AM, Nick Coghlan wrote:
>> On Sun, Aug 28, 2011 at 1:58 AM, Nadeem Vawda wrote:
>>> On Sat, Aug 27, 2011 at 5:52 PM, Nick Coghlan wrote:
>>>> It's acceptable for the Python version to use ctypes in the case of
>>>> wrapping an existing library, but the Python version should still
>>>> exist.
>>>
>>> I'm not too sure about that - PEP 399 explicitly says that using ctypes
>>> is frowned upon, and doesn't mention anywhere that it should be used in
>>> this sort of situation.
>>
>> Note to self: do not comment on python-dev at 2 am, as one's ability
>> to read PEPs correctly apparently suffers :)
>>
>> Consider my comment withdrawn, you're quite right that PEP 399
>> actually says this is precisely the case where an exemption is a
>> reasonable idea. Although I believe it's likely that PyPy will wrap it
>> with ctypes anyway :)
>
> I'd like to better understand why ctypes is (sometimes) frowned upon.
>
> Is it the brittleness?  Tendency to segfault?


Maybe unwieldy code and slow execution on CPython?

Note that there's a ctypes backend for Cython being written as part of a 
GSoC, so it should eventually become possible to write C library wrappers 
in Cython and have it generate a ctypes version to run on PyPy. That, 
together with the IronPython backend that is on its way, would give you a 
way to write fast wrappers for at least three of the four major Python 
implementations, without sacrificing readability or speed in one of them.


Stefan
