Re: [Python-Dev] Adding NewType() to PEP 484

2016-05-28 Thread André Malo
* Steven D'Aprano wrote:

> On Fri, May 27, 2016 at 04:01:11PM -0700, Guido van Rossum wrote:
> > Also -- the most important thing. :-) What to call these things? We're
> > pretty much settled on the semantics and how to create them (A =
> > NewType('A', int)) but what should we call types like A when we're
> > talking about them? "New types" sounds awkward.
>
> TypeAlias? Because A is an alias for int?

I like the view C takes on this: typedef.

Cheers,
-- 
Whoever is unwilling to share their knowledge probably has too little of it.
  -- Unknown
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Can I make marshal.dumps() slower but stabler?

2018-07-12 Thread André Malo
* INADA Naoki wrote:

> Is there any real application for which marshal.dumps() performance is
> critical?

I use it to spool big chunks of data to disk, precisely because it's faster
than pickle.

Cheers,
-- 
"Gates's behavior had proven to me that I did not need to count on him and
his two companions" -- Karl May, "Winnetou III"


Re: [Python-Dev] Can I make marshal.dumps() slower but stabler?

2018-07-13 Thread André Malo
On Thursday, 12 July 2018 22:09:41 CEST Antoine Pitrou wrote:
> On Thu, 12 Jul 2018 22:03:30 +0200, André Malo wrote:
> > * INADA Naoki wrote:
> > > Is there any real application for which marshal.dumps() performance is
> > > critical?
> > 
> > I'm using it for spooling big chunks of data on disk, exactly for the
> > reason that it's faster than pickle.
> 
> Which kind of data is that?

Basically iterators of builtin objects (dicts or tuples of strings or
numbers). Typically one unit or "row" per dumps() call: the rows are written
one after the other, and marshal.load() can read them back in the same
manner. They're certainly never the same objects (except maybe for dict keys,
which might be interned).
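
A minimal sketch of that spooling pattern, assuming a binary file object. The
helper names spool_rows and read_rows are made up for illustration and are not
part of marshal:

```python
import io
import marshal


def spool_rows(rows, fp):
    """Write each row as its own marshal record, one after the other."""
    for row in rows:
        marshal.dump(row, fp)


def read_rows(fp):
    """Yield rows back until marshal reports the end of the file."""
    while True:
        try:
            yield marshal.load(fp)
        except EOFError:
            return


# Usage: an in-memory buffer stands in for the spool file on disk.
buf = io.BytesIO()
spool_rows([{"id": 1, "name": "a"}, ("x", 2.5), {"id": 2}], buf)
buf.seek(0)
rows = list(read_rows(buf))
```

Note that this treats every EOFError as the normal end of the spool, so a
record cut off in the middle is silently dropped.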

Cheers,
nd




Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-13 Thread André Malo
Victor Stinner wrote:

> Replacing macros with functions has little impact on backward
> compatibility. Most C extensions should still work if macros become
> functions.

As long as they are recompiled. However, they will lose a lot of performance. 
Both these points have been mentioned somewhere, I'm certain, but it cannot be 
stressed enough, IMHO.

> 
> I'm not sure yet how far we should go towards a perfect API which
> doesn't leak everything. We have to move slowly, and make sure that we
> don't break major C extensions. We need to write tools to fully
> automate the conversion. If it's not possible, maybe the whole project
> will fail.

I wonder how you propose to measure "major". I believe every C extension
that is public and running in production somewhere is major enough.

Maybe "ease of fixing"? Lines of code?

Cheers,
-- 
> Puzzling over what an anthroposophist has to do with submission...

[...] This word offers so many openings for a spelling flame, and you
don't grant us a single one. -- Jean Claude and David Kastrup in dtl




Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-14 Thread André Malo
On Tuesday, 13 November 2018 21:59:14 CET Victor Stinner wrote:
> On Tue, 13 Nov 2018 at 20:32, André Malo wrote:
> > As long as they are recompiled. However, they will lose a lot of
> > performance. Both these points have been mentioned somewhere, I'm
> > certain, but it cannot be stressed enough, IMHO.
> 
> Somewhere is here:
> https://pythoncapi.readthedocs.io/performance.html

> > I'm wondering, how you suggest to measure "major". I believe, every C
> > extension, which is public and running in production somewhere, is major
> > enough.
> 
> My plan is to select something like the top five most popular C
> extensions based on PyPI download statistics. I cannot test
> everything, I have to put practical limits.

You shouldn't. Chances are that you don't even know them well enough to do
that. A scalable approach would be to talk to the projects and let them do it
instead. No?

Cheers,
-- 
package Hacker::Perl::Another::Just;print
qq~@{[reverse split/::/ =>__PACKAGE__]}~;

#  André Malo  #  http://www.perlig.de  #




Re: [Python-Dev] PEP 594: Removing dead batteries from the standard library

2019-05-21 Thread André Malo
On Monday, 20 May 2019 23:27:49 CEST Antoine Pitrou wrote:
> NNTP is still quite used (often through GMane, but probably not only) so
> I'd question the removal of nntplib.
> 
> cgitb used to be used by some Web frameworks in order to format
> exceptions.  Perhaps one should check if that's still the case.

I concur with both of those.
There's software in production using both. (That doesn't mean it's on PyPI or
even free software.)

What would be the maintenance burden of those modules anyway? (At least for
nntp, I guess it's not going to change.)

nd
-- 
package Hacker::Perl::Another::Just;print
qq~@{[reverse split/::/ =>__PACKAGE__]}~;

#  André Malo  #  http://pub.perlig.de  #




Re: [Python-Dev] PEP 594: Removing dead batteries from the standard library

2019-05-21 Thread André Malo
On Tuesday, 21 May 2019 13:24:34 CEST Victor Stinner wrote:
> On Tue, 21 May 2019 at 13:18, André Malo wrote:
> > There's software in production using both. (It doesn't mean it's on pypi
> > or even free software).
> > 
> > What would be the maintenance burden of those modules anyway? (at least
> > for nntp, I guess it's not gonna change).
> 
> The maintenance burden is real even if it's not visible. For example,
> test_nntplib is frequently causing issues on our CI:
> 
> https://bugs.python.org/issue19756
> https://bugs.python.org/issue19613
> https://bugs.python.org/issue31850
> 
> It has been failing frequently since 2013, and nobody has managed to come
> up with a fix in 6 years.
> 
> There are 11 open issues with "nntp" in their title (8 with exactly
> "nntplib" in their title).
> 
> test_nntplib uses the public server news.trigofacile.com which is
> operated by Julien ÉLIE. Two years ago, Julien asked me if there is
> any plan to support the NNTP "COMPRESS" command.

So what I hear is that this battery is definitely not dead, which is what
the PEP is all about. It's just half charged (or discharged, depending on
your POV), so to speak.

Substitute: "none" should read "PyPI" then?

nd
-- 
"Solid and comprehensive book"
  -- from a review

<http://pub.perlig.de/books.html#apache2>




Re: [Python-Dev] PEP 594: Removing dead batteries from the standard library

2019-05-21 Thread André Malo
On Tuesday, 21 May 2019 13:50:30 CEST Victor Stinner wrote:

> Well, that makes sense. But so, what is the metric to decide if a
> module is "widely used" or not?

Yes. That's exactly the question that pops up right now. I think that's the
main issue with including batteries in general (that's not news though :-).
And the problem I see there is: There *is* no valid answer.

(Sorry if I seem to be just annoying. That's not intended; I'm just not
bringing good news.)

nd
-- 
package Hacker::Perl::Another::Just;print
qq~@{[reverse split/::/ =>__PACKAGE__]}~;

#  André Malo  #  http://www.perlig.de  #




Re: [Python-Dev] PEP 594: Removing dead batteries from the standard library

2019-05-21 Thread André Malo
On Tuesday, 21 May 2019 13:46:34 CEST Christian Heimes wrote:
> On 21/05/2019 13.08, André Malo wrote:
> > On Monday, 20 May 2019 23:27:49 CEST Antoine Pitrou wrote:
> >> NNTP is still quite used (often through GMane, but probably not only)
> >> so I'd question the removal of nntplib.
> >> 
> >> cgitb used to be used by some Web frameworks in order to format
> >> exceptions.  Perhaps one should check if that's still the case.
> > 
> > I concur with both of those.
> > There's software in production using both. (It doesn't mean it's on pypi
> > or even free software).
> 
> There is always somebody who uses a feature. This argument blocks any
> innovation or cleanup. Victor just reminded me of https://xkcd.com/1172/.
> 
> * The removed modules will be available through PyPI.
> * You don't have to start to worry until Python 3.10 is released in over 3
>   years from now.
> * The modules are fully supported in Python 3.8 and 3.9.
>   Python 3.9 will reach EOL late 2026 or early 2027.

Correct. However, that's a valid argument for the whole stdlib.

I agree with Victor (in the other branch) that we should name the PEP for
what it means.
It effectively boils down to: "We don't want to maintain these modules
anymore; volunteers, step forward within the next 3 years". That would
definitely draw a clear line and cut short a lot of discussions (like this
one).
And it would be perfectly fine, for me at least.

nd
-- 
package Hacker::Perl::Another::Just;print
qq~@{[reverse split/::/ =>__PACKAGE__]}~;

#  André Malo  #  http://www.perlig.de  #




Re: [Python-Dev] Lambda [was Re: PEP 8 modernisation]

2013-08-01 Thread André Malo
* Stephen J. Turnbull wrote:

> Chris Angelico writes:
>  > On Thu, Aug 1, 2013 at 5:58 PM, Alexander Shorin wrote:
>  > > fun = lambda i: i[1]
>  > > for key, items in groupby(sorted(items, key=fun), key=fun):
>  > >   print(key, ':', list(items))
>  >
>  > I'd do a direct translation to def here:
>  >
>  > def fun(i): return i[1]
>  > for key, items in groupby(sorted(items, key=fun), key=fun):
>  >   print(key, ':', list(items))
>
> As long as it's about readability, why not make it readable?
>
> def second(pair): return pair[1]
> for key, items in groupby(sorted(items, key=second), key=second):
>   print(key, ':', list(items))
>
> I realize it's somewhat unfair (for several reasons) to compare that
> to Alexander's "fun = lambda i: i[1]", but I can't help feeling that
> in another sense it is fair.

This is drifting off-topic somewhat, but "second" is probably a bad name
here. If the key changes, you have to rename it in several places (or worse,
you DON'T rename it, and then the readability is gone).
I usually use a name with "key" in it, describing what it's for, not how
it's done. The short distance between the definition and its use supports
that, too.
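
A small sketch of that naming style applied to the example above. The name
group_key and the wrapper grouped() are only illustrative choices:

```python
from itertools import groupby


def group_key(item):
    """Key used for both sorting and grouping; today it is the second field."""
    return item[1]


def grouped(items):
    """Group items by group_key, keeping the original order inside each group."""
    ordered = sorted(items, key=group_key)
    return {key: list(group) for key, group in groupby(ordered, key=group_key)}


result = grouped([("a", 1), ("b", 2), ("c", 1)])
```

If the grouping criterion later moves to another field, only the body of
group_key changes; the name stays accurate at every call site.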

nd
-- 
"Gates's behavior had proven to me that I did not need to count on him and
his two companions" -- Karl May, "Winnetou III"



Re: [Python-Dev] When to use EOFError?

2016-06-22 Thread André Malo
* Serhiy Storchaka wrote:

> There is a design question. If you read a file in some format or with some
> protocol, and the data ends unexpectedly, when should you use the general
> EOFError exception and when a format/protocol-specific exception?
>
> For example, when loading truncated pickle data, an unpickler can raise
> EOFError, UnpicklingError, ValueError or AttributeError. It is possible
> to avoid ValueError or AttributeError, but what exception should be
> raised instead, EOFError or UnpicklingError? Maybe convert all EOFError
> to UnpicklingError? Or all UnpicklingError caused by unexpectedly ended
> input to EOFError? Or raise EOFError if the input ends after a
> completed opcode, and UnpicklingError if it contains a truncated opcode?

I often concatenate multiple pickles into one file. When reading them, it 
works like this:

try:
    while True:
        yield pickle.load(fp)
except EOFError:
    pass

In this case the truncation is not really unexpected. Maybe it should 
distinguish between truncated-in-the-middle and truncated-because-empty.

(Same goes for marshal)
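
One way a reader could make that distinction on its own, as a sketch only: it
assumes a seekable binary file, and both iter_pickles and TruncatedSpoolError
are hypothetical names, not part of the pickle module.

```python
import io
import pickle


class TruncatedSpoolError(Exception):
    """Hypothetical error: the data ended in the middle of a record."""


def iter_pickles(fp):
    """Yield objects from a seekable stream of concatenated pickles."""
    while True:
        first = fp.read(1)
        if not first:
            return  # clean end: no further record starts here
        fp.seek(-1, io.SEEK_CUR)  # put the peeked byte back
        try:
            yield pickle.load(fp)
        except (EOFError, pickle.UnpicklingError) as exc:
            # A record started but could not be read completely.
            raise TruncatedSpoolError("record cut off mid-stream") from exc


# Usage: two records written back to back, then read again.
spool = io.BytesIO()
for row in [{"a": 1}, ("b", 2)]:
    pickle.dump(row, spool)
data = spool.getvalue()
items = list(iter_pickles(io.BytesIO(data)))
```

The peek separates "nothing left" from "something started", which is the
distinction asked for above; which exception a truncated record actually
triggers inside pickle is the open question of this thread.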

Cheers,
-- 
Real programmers confuse Christmas and Halloween because
DEC 25 = OCT 31.  -- Unknown

  (found in ssl_engine_mutex.c)


Re: [Python-Dev] When to use EOFError?

2016-06-26 Thread André Malo
* Serhiy Storchaka wrote:

> On 22.06.16 19:22, André Malo wrote:
> > I often concatenate multiple pickles into one file. When reading them,
> > it works like this:
> >
> > try:
> >     while True:
> >         yield pickle.load(fp)
> > except EOFError:
> >     pass
> >
> > In this case the truncation is not really unexpected. Maybe it should
> > distinguish between truncated-in-the-middle and
> > truncated-because-empty.
> >
> > (Same goes for marshal)
>
> This is an interesting application, but it works only for non-truncated
> data. If the data is truncated, you just lose the last item without notice.

Yes (as said). In my case it's typically not a problem, because I write them
myself right before reading them. It's basically about spooling data to
disk in order to keep it out of RAM.
However, because of the truncation issue, it would be nice to have a
distinction between no-data and truncated-data.

Cheers,
-- 
Winnetou's legacy: <http://pub.perlig.de/books.html#apache2>


Re: [Python-Dev] Imports with underscores

2017-01-09 Thread André Malo
* Steve Holden wrote:

> One of my developers recently submitted a pull request including a number
> of lines like
>
> import os as _os
>
> When I asked him why, he suggested a) this would improve encapsulation,
> and b) the practice was supported in the stdlib. Further investigation
> reveals that some modules (e.g. argparse, crypt, difflib, random) do use
> this technique, but it is far from universal.
>
> So I thought it would be useful to get input from current devs about the
> value of this practice, since to me it seems somewhat anti-pythonic. What
> advantages does it confer?

For me (in favor of underscored imports), the following points apply:

- The imports are usually not part of the exported API. (If they are, I
  specifically do not underscore them.)

- __all__ was referenced in other answers. However, it only protects against
  a single use case (import *). It does not help you directly with shells
  (dir(), ipython tab expansion (?)), and it's easy to ignore if you look at
  the source code itself (because, let's face it, documentation often
  sucks).

- __all__ again: it's tedious and error-prone to maintain. I often found
  places in my own code where it was plain wrong. (pylint helps these days
  against wrongness, but not against incompleteness.)

- In my code (inside the module), I usually know exactly which variable is
  a module and which is not (by the underscores).

- Also in my code: from time to time modules steal good names for local
  variables. Underscoring solved this problem for me as well.

Cheers,
nd
-- 
die (eval q-qq:Just Another Perl Hacker
:-)

# André Malo, <http://pub.perlig.de/> #


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-13 Thread André Malo
Steve Dower wrote:
> On 11Apr2020 0025, Antoine Pitrou wrote:
> > On Fri, 10 Apr 2020 23:33:28 +0100
> > 
> > Steve Dower  wrote:
> >> On 10Apr2020 2055, Antoine Pitrou wrote:
> >>> On Fri, 10 Apr 2020 19:20:00 +0200
> >>> 
> >>> Victor Stinner  wrote:
> >>>> Note: Cython and cffi should be preferred to write new C extensions.
> >>>> This PEP is about existing C extensions which cannot be rewritten with
> >>>> Cython.
> >>> 
> >>> Using Cython does not make the C API irrelevant.  In some
> >>> applications, the C API has to be low-level enough for performance.
> >>> Whether the application is written in Cython or not.
> >> 
> >> It does to the code author.
> >> 
> >> The point here is that we want authors who insist on coding against the
> >> C API to be aware that they have fewer compatibility guarantees [...]
> > 
> > Yeah, you missed the point of my comment here.  Cython *does* call into
> > the C API, and it's quite insistent on performance optimizations too.
> > Saying "just use Cython" doesn't make the C API unimportant - it just
> > hides it from your own sight.
> 
> It centralises the change. I have no problem giving Cython access to
> things that we discourage every developer from using, provided they
> remain responsive to change and use the special access responsibly (e.g.
> by not touching reserved fields at all).

It appears to me that this whole line of argument contradicts the purpose
of the whole idea. What am I missing?

For one thing, if you open up APIs for Cython, they're open for everybody
(Cython being "just" another C extension).
More to the point: The ABIs have the same problem as they have now, regardless
of how responsive the Cython developers are. Once you have compiled the
extension, you're using the ABI and are supposedly not required to recompile
to stay compatible.

So, what I'm getting at is: Either you open up to everybody or to nobody. In C
there's not really an in-between.

Cheers,
nd

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/S2BRL2GJGRNK5WXCSZPNLLMN4LGA5KTN/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-14 Thread André Malo
Steve Dower wrote:

> On a policy level, we don't make changes that would break users of the C 
> API. Because we can't track everyone who's using it, we have to assume 
> that everything is used and any change will cause breakage.
> 
> To make sure it's possible to keep developing CPython, we declare parts 
> of the API off limits (typically by prepending them with an underscore). 
> If you use these, and you break, we're sorry but we aren't going to fix it.
> 
> This line of discussion is basically saying that we would designate a 
> broader section of the API that is off limits, most likely the parts 
> that are only useful for increased performance (rather than increased 
> functionality). We would then specifically include the Cython 
> team/volunteers in discussions about how to manage changes to these 
> parts of the API to avoid breaking them, and possibly do simultaneous 
> releases to account for changes so that their users have more time to 
> rebuild.
> 
> Effectively, when we change our APIs, we would break everyone except 
> Cython because we've worked with them to avoid the breakage. Anyone else 
> using it has to make their own effort to follow CPython development and 
> detect any breakage themselves (just like today).
> 
> So probably the part you're missing is where we would give ourselves 
> permission to break more APIs in a release, while simultaneously 
> encouraging people to use Cython as an isolation layer from those breaks.

The encouraging part is not working for me :-) And seriously, my gut tells me
we're split 50/50 here. People usually write C for a reason, and Cython is
not C. For, let's say, half of the cases that's fine: speeding up inner loops
and all that, without touching the C level at all. The other half wants to
solve different issues.

I think it does not serve well as a policy for CPython. Since we're talking
hypotheticals right now: if Cython vanishes tomorrow, we're kind of left
empty-handed. That kind of runtime, if considered part of the compatibility
"promise", should be provided by the core itself, no?
A good way to test that promise (or other implications like performance) might
also be to rewrite the standard library extensions in Cython and see where it
leads.

I personally see myself using the Python-provided runtime (types, methods,
GC) out of convenience (it's there, so why not use it). The vision of the
future outlined here can easily lead to backing off from that, rebuilding
all those things, and really only keeping touchpoints with Python when it
comes to interfacing with Python itself. It's probably even desirable that
way. But it's definitely more work (for an extension author).

As a closing word, I don't mind either way. IOW I'm not complaining. I'm just
putting more opinion from the "outside" into the ring. Thanks for listening
:-)

Cheers,
nd

Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ZDPNR3PO4RVQ3EHQQLEFEKUVBF72A2MX/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-14 Thread André Malo
Stefan Behnel wrote:
> André Malo wrote on 14.04.20 at 13:39:
> 
> > I think, it does not serve well as a policy for CPython. Since we're
> > talking hypotheticals right now, if Cython vanishes tomorrow, we're
> > kind of left empty handed. Such kind of a runtime, if considered part of
> > the compatibility "promise", should be provided by the core itself, no?
> 
> 
> There was some discussion a while ago about integrating a stripped-down
> variant of Cython into CPython's stdlib. I was arguing against that because
> the selling point of Cython is really what it is, and stripping that down
> wouldn't lead to something equally helpful for users.
> 
> I think it's good to have separate projects (and, in fact, it's more than
> one) deal with this need.
> 
> In the end, it's an external tool, [...]

Thank you, that is my point exactly. It's the same "external" as everything 
else. I'm still trying to understand where to separate the different sets of 
"external".

> 
> > A good way to test that promise (or other implications like performance)
> > might also be to rewrite the standard library extensions in Cython and
> > see where it leads.
> 
> 
> Not sure I understand what you're saying here. stdlib extension modules are
> currently written in C, with a bit of code generation. How is that
> different? 

They are C extensions like the ones everybody could write. They should use the
same APIs. What I'm saying is that it would be a good test of whether the APIs
are good enough (for everybody else). If, say, Cython is recommended, some
attempt should be made to achieve the same results with Cython. Or with some
other set of APIs which is considered for "the public".

I don't think the current stdlib modules restrict themselves to a limited
API. The distinction between "inside" and "outside" bothers me.


> > I personally see myself using the python-provided runtime (types, methods,
> > GC), out of convenience (it's there, so why not use it). The vision of
> > the future outlined here can easily lead to backing off from that and
> > rebuilding all those things and really only keep touchpoints with python
> > when it comes to interfacing with python itself. It's probably even
> > desirable that way
> 
> That's actually not an uncommon thing to do. Some packages really only use
> Cython or pybind11 to wrap their otherwise native C or C++ code. It's a
> choice given specific organisational/project/developer constraints, and
> choices are good.

Agreed. Nevertheless, the choices are going to be limited by extra 
constraints.

Cheers,
nd

Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/U7KHC7KV5GOOQ4ST5HI3MZKAW4CMRJ6S/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-14 Thread André Malo
Steve Dower wrote:
> On 14Apr2020 1557, André Malo wrote:
> 
> > Stefan Behnel wrote:
> > 
> >> André Malo wrote on 14.04.20 at 13:39:
> >> 
> >>> A good way to test that promise (or other implications like
> >>> performance) might also be to rewrite the standard library
> >>> extensions in Cython and see where it leads.
> >> 
> >> Not sure I understand what you're saying here. stdlib extension modules
> >> are currently written in C, with a bit of code generation. How is that
> >> different?
> > 
> > They are C extensions like the ones everybody could write. They should use
> > the same APIs. What I'm saying is, that it would be a good test if the
> > APIs are good enough (for everybody else). If, say, Cython is
> > recommended, some attempt should be made to achieve the same results with
> > Cython. Or some other sets of APIs which are considered for "the public".
> > 
> > I don't think, the current stdlib modules restrict themselves to a
> > limited API. The distinction between "inside" and "outside" bothers me.
> 
> 
> It should not bother you. The standard library is not a testing ground 
> for the public API - it's a layer to make those APIs available to users 
> in a reliable, compatible format. Think of it like your C runtime, which 
> uses a lot of system calls that have changed far more often than libc.

I can agree up to a certain level. There are extensions and there are 
extensions, see below.

> 
> We can change the interface between the runtime and the included modules 
> as frequently as we like, because it's private. And we do change them, 
> and the changes go unnoticed because we adapt both sides of the contract 
> at once. For example, we recently changed the calling conventions for 
> certain functions, which didn't break anyone because we updated the 
> callers as well. And we completely reimplemented stat() emulation on 
> Windows recently, which wasn't incompatible because the public part of 
> the API didn't change (except to have fewer false errors).
> 
> Modules that are part of the core runtime deliberately use private APIs 
> so that other extension modules don't have to. It's not any sort of 
> unfair advantage - it's a deliberate aspect of the software's design.

Ah, hmm, maybe I was not clear enough. I was talking about extensions like
itertools or datetime, not core builtins like sys or the type system. I think
there's a difference. People use the former in particular as a template for
how things are done "correctly".

I agree, it's easy enough to change everything at once, assuming a good test
suite :-)

Cheers,
nd

Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ZEYZX7PRCACRZEZIHU35CLWIPHQBALDV/


Re: [Python-Dev] PEP 389: argparse - new command line parsing module

2009-10-03 Thread André Malo
* Steven D'Aprano wrote:

> You don't need a comment warning that you are catching SystemExit
> because parse_args raises SystemExit, any more than you need a comment
> saying that you are catching ValueError because some function raises
> ValueError. The fact that you are catching an exception implies that
> the function might raise that exception. A comment like:
>
> "Catching SystemExit because parse_args() throws SystemExit on parser
> errors."
>
> is up there with comments like this:
>
> x += 1  # Add 1 to x.

It's semantically different. You usually don't catch SystemExit directly,
because you want your programs to be stopped. Additionally, a library
exiting your program is badly designed, as it's unexpected. That's why such a
comment is useful.

Here's what I'd do: I'd subclass SystemExit in this case and raise the
subclass from argparse. That way all parties here should be satisfied. (I
do the same all the time in my signal handlers. That's another reason I'd
rather not catch SystemExit directly as well :-)
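
A sketch of that idea done on the user side. argparse itself raises a plain
SystemExit; ParserExit and the Parser subclass below are hypothetical names,
and the sketch relies only on the documented ArgumentParser.exit() hook:

```python
import argparse
import sys


class ParserExit(SystemExit):
    """Hypothetical subclass marking exits requested by the parser."""


class Parser(argparse.ArgumentParser):
    def exit(self, status=0, message=None):
        # Same effect as the stock exit(), but with a distinguishable type.
        if message:
            sys.stderr.write(message)
        raise ParserExit(status)


def parse(argv):
    """Return the parsed namespace, or the exit status on a parser error."""
    parser = Parser(prog="demo")
    parser.add_argument("--count", type=int, default=1)
    try:
        return parser.parse_args(argv)
    except ParserExit as exc:
        # Only parser-initiated exits land here; a plain SystemExit raised
        # elsewhere (e.g. by a signal handler) passes through untouched.
        return exc.code
```

Code that does not care still sees an ordinary SystemExit, because ParserExit
is a subclass of it.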

nd
-- 
"Comprehensive work (also for those switching from Apache 1.3)"
  -- from a review




Re: [Python-Dev] Removal of Win32 ANSI API

2010-11-11 Thread André Malo
On Thursday 11 November 2010 20:50:35 Martin v. Löwis wrote:
> > Even if I hate the MBCS encoding, because it replaces undecodable
> > characters by similar glyphs by default, I'm not certain that it is a
> > good idea to drop the bytes API. Can it be a problem to port programs
> > from Python2 to Python3? Do major Python2 programs/libraries rely on the
> > bytes API?
>
> I don't actually know for a fact, but I expect that the answer is "no".
>
> The question is: where do file names typically come from? My guess
> is that they come from
> a) hard-coded strings in the source code
> b) command line arguments/environment variables

[...]

> In case b), they will be Unicode strings in Python 3.

But not necessarily with Unicode semantics, if I understand the discussions
about the environment topic correctly.

Additionally:

d) Over a socket (like the HTTP protocol) -> Bytes.

nd


Re: [Python-Dev] datetime module enhancements

2007-03-09 Thread André Malo
* Brett Cannon wrote:

> On 3/9/07, Christian Heimes <[EMAIL PROTECTED]> wrote:
> >  * What do you think about including PyTz in the Python core? PyTz is
> > really, REALLY useful when one has to deal with time zones.
> > http://pytz.sourceforge.net/
>
> What is wrong with datetime's tzinfo objects?

There aren't any. pytz fills that gap.

But I think pytz has no place in the stdlib anyway, because it has a
quite different release cycle (a new release every time the timezone
tables change).
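
To illustrate the gap: the datetime module defines tzinfo as an abstract
base class, so any concrete zone has to be written by hand or taken from
a package such as pytz. A minimal sketch, assuming a hypothetical
FixedOffset class with no DST rules at all (real zones need the timezone
tables mentioned above):

```python
from datetime import datetime, timedelta, tzinfo


class FixedOffset(tzinfo):
    """Hypothetical fixed-offset zone; it knows nothing about DST."""

    def __init__(self, minutes, name):
        self._offset = timedelta(minutes=minutes)
        self._name = name

    def utcoffset(self, dt):
        return self._offset

    def dst(self, dt):
        # No daylight saving rules in this sketch.
        return timedelta(0)

    def tzname(self, dt):
        return self._name


cet = FixedOffset(60, "CET")
stamp = datetime(2007, 3, 9, 12, 0, tzinfo=cet)
```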

nd
-- 
"Das Verhalten von Gates hatte mir bewiesen, dass ich auf ihn und seine
beiden Gefährten nicht zu zählen brauchte" -- Karl May, "Winnetou III"

Im Westen was neues: 


Re: [Python-Dev] deprecate commands.getstatus()

2007-03-22 Thread André Malo
* Titus Brown wrote:

> On Thu, Mar 22, 2007 at 02:47:58PM -0700, Guido van Rossum wrote:
> -> On 3/22/07, Michael Foord <[EMAIL PROTECTED]> wrote:
> -> > Guido van Rossum wrote:
> -> > > Sure. os.fork() and the os.exec*() family can stay. But os.spawn*(),
> -> > > that abomination invented by Microsoft? I also hear no opposition
> -> > > against killing os.system() and os.popen()
> -> >
> -> > Except that 'os.system' is really easy to use and I use it rarely enough
> -> > that I *always* have to RTFM for subprocess which makes you jump through
> -> > a few more (albeit simple) hoops.
> ->
> -> So let's add subprocess.system() which takes care of the hoops (but
> -> still allows you more flexibility through optional keyword
> -> parameters).
>
> How would this differ from subprocess.call()?
>
>   http://docs.python.org/lib/node530.html

It doesn't implement the system() spec:


nd
-- 
Winnetous Erbe: 


Re: [Python-Dev] Remove HTTP 0.9 support

2010-12-16 Thread André Malo
* Antoine Pitrou wrote:

> Hello,
>
> I would like to remove HTTP 0.9 support from http.client and
> http.server. I've opened an issue at http://bugs.python.org/issue10711
> for that. Would anyone think it's a bad idea?
>
> (HTTP 1.0 was devised in 1996)

HTTP/0.9 support is still recommended (RFC 2616 is from 1999, but still 
current).

I'm wondering why you would consider touching that at all. Is it broken?
Does it stand in the way of anything? If not, why throw away a feature?

nd
-- 
Already I've seen people (really!) write web URLs in the form:
http:\\some.site.somewhere
[...] How soon until greengrocers start writing "apples $1\pound"
or something?   -- Joona I Palaste in clc


Re: [Python-Dev] Remove HTTP 0.9 support

2010-12-16 Thread André Malo
On Thursday 16 December 2010 15:23:05 Antoine Pitrou wrote:
> On Thu, 16 Dec 2010 07:42:08 +0100
>
> André Malo  wrote:
> > * Antoine Pitrou wrote:
> > > Hello,
> > >
> > > I would like to remove HTTP 0.9 support from http.client and
> > > http.server. I've opened an issue at http://bugs.python.org/issue10711
> > > for that. Would anyone think it's a bad idea?
> > >
> > > (HTTP 1.0 was devised in 1996)
> >
> > HTTP/0.9 support is still recommended (RFC 2616 is from 1999, but still
> > current).
> >
> > I'm wondering, why you would consider touching that at all. Is it broken?
> > Does it stand in the way of anything? If not, why throw away a feature?
>
> Well, it complicates maintenance and makes fixing issues such as
> http://bugs.python.org/issue6791 less likely.

I'd vote for removing it from the client code and keeping it in the server.

> Note that the patch still accepts servers and clients which advertise
> themselves as 0.9 (using "HTTP/0.9" as a version string).

HTTP/0.9 doesn't *have* a version string.

GET /foo

is an HTTP/0.9 request.

GET /foo HTTP/0.9

isn't actually (it's a paradox, alright ;). It simply isn't a valid HTTP
request, which would demand a 505 response.
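
The distinction can be sketched as a small request-line classifier. The
function name and the returned strings are made up for illustration, and
the 505 answer for "HTTP/0.9" follows the reading given in this mail, not
necessarily what http.server does:

```python
SUPPORTED = ("HTTP/1.0", "HTTP/1.1")


def classify_request_line(line):
    """Classify a request line the way this mail reads the protocol."""
    parts = line.split()
    if len(parts) == 2 and parts[0] == "GET":
        # No version token at all: that is what an HTTP/0.9 request is.
        return "HTTP/0.9 simple request"
    if len(parts) == 3:
        version = parts[2]
        if version in SUPPORTED:
            return "%s request" % version
        if version.startswith("HTTP/"):
            # Includes the self-contradicting "HTTP/0.9" version string.
            return "505 HTTP Version Not Supported"
    return "400 Bad Request"
```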

nd


Re: [Python-Dev] Remove HTTP 0.9 support

2010-12-16 Thread André Malo
* Fred Drake wrote:

> On Thu, Dec 16, 2010 at 10:52 AM, André Malo  wrote:
> > I'd vote for removing it from the client code and keeping it in the
> > server.
>
> If it must be maintained anywhere, it should be in the client,
> according to the basic principle of "accept what you can, generate
> carefully."

*scratching head* That's exactly why I would keep support in the server.

nd
-- 
package Hacker::Perl::Another::Just;print
qq~@{[reverse split/::/ =>__PACKAGE__]}~;

#  André Malo  #  http://www.perlig.de  #


Re: [Python-Dev] Removing the GIL (Me, not you!)

2007-09-13 Thread André Malo
* Christian Heimes wrote: 

> Pardon my ignorance but why does Python do reference counting for truly
> global and static objects like None, True, False, small and cached
> integers, sys and other builtins? If I understand it correctly these
> objects are never garbaged collected (at least they shouldn't) until the
> interpreter exits. Wouldn't it decrease the overhead and increase speed
> when Py_INCREF and Py_DECREF are NOOPs for static and immutable objects?

Checking what kind of object you have takes time, too. Right now, just
counting up or down is most likely faster than doing that check on every
refcount operation.

nd


Re: [Python-Dev] Should we do away with unbound methods in Py3k?

2007-11-24 Thread André Malo
* Greg Ewing wrote:

> Phillip J. Eby wrote:
> > class MoneyField(Field):
> > # does need staticmethod because two_decimal_places
> > # doesn't take a self
> > converter = staticmethod(two_decimal_places)
>
> Okay, I see what you mean now. But you could just as well wrap
> it in a function that takes self and discards it, 

I always thought that this is exactly what staticmethod does.

> so I still 
> don't think staticmethod is essential in the absence of
> unbound methods.

Actually I don't see why those issues were bound together in the first 
place.
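
The staticmethod point above can be sketched as follows. Field,
MoneyField and two_decimal_places come from the quoted example, but their
bodies here are assumptions made for illustration; WrappedMoneyField is a
hypothetical name for the "wrap it in a function" variant:

```python
def two_decimal_places(value):
    return "%.2f" % value


class Field(object):
    converter = None

    def convert(self, value):
        # Looked up through the instance, so a plain function stored
        # on the class would receive self as its first argument.
        return self.converter(value)


class MoneyField(Field):
    # staticmethod suppresses the binding; no self is passed.
    converter = staticmethod(two_decimal_places)


class WrappedMoneyField(Field):
    # The hand-written equivalent: accept self and discard it.
    def converter(self, value):
        return two_decimal_places(value)
```

Both variants behave the same from the caller's side, which is the
sense in which the wrapper is "exactly what staticmethod does".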

nd
-- 
Winnetous Erbe: 


Re: [Python-Dev] Backporting PEP 3101 to 2.6

2008-02-16 Thread André Malo
* Nick Coghlan wrote:

> Eric Smith wrote:
> > The bad error message is a result of __format__ passing on unicode to
> > strftime.
> >
> > There are, of course, various ugly ways to work around this involving
> > nested format calls.
>
> I don't know if this fits your definition of "ugly workaround", but what
> if datetime.__format__ did something like:
>
>def __format__(self, spec):
>  encoding = None
>  if isinstance(spec, unicode):
>  encoding = 'utf-8'
>  spec = spec.encode(encoding)
>  result = strftime(spec, self)
>  if encoding is not None:
>  result = result.decode(encoding)
>  return result

Note that hardcoding utf-8 is a bad guess here as strftime(3) emits locale 
strings, so decoding will easily fail.

I guess a clean and complete solution (besides re-implementing the whole
thing) would be to resolve each single format directive with strftime,
decode the result according to the locale and re-assemble the result
string piece by piece. Doh!
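
A rough sketch of the piece-by-piece approach, written for Python 3,
where time.strftime already returns text. It therefore shows only the
split-and-reassemble part; the per-piece locale decoding described above
would be needed on top of this in 2.x. The function name is made up, and
multi-character directives (such as %E or %O modifiers) are ignored:

```python
import re
import time

# A directive is a percent sign plus one character.
_DIRECTIVE = re.compile(r"%.")


def piecewise_strftime(fmt, t):
    """Format each directive on its own; pass literal text through."""
    pieces = []
    pos = 0
    for match in _DIRECTIVE.finditer(fmt):
        pieces.append(fmt[pos:match.start()])  # literal text, untouched
        pieces.append(time.strftime(match.group(), t))
        pos = match.end()
    pieces.append(fmt[pos:])
    return "".join(pieces)
```

The literal parts of the format string never go through strftime, so
they cannot be damaged by an encoding mismatch.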

nd
-- 
> [...] weiß jemand zufällig, was der Tag DIV ausgeschrieben bedeutet?
DIVerses. Benannt nach all dem unstrukturierten Zeug, was die Leute da
so reinpacken und dann absolut positionieren ...
   -- Florian Hartig und Lars Kasper in dciwam


Re: [Python-Dev] Backporting PEP 3101 to 2.6

2008-02-17 Thread André Malo
* Eric Smith wrote:

> André Malo wrote:
> > I guess, a clean and complete solution (besides re-implementing the
> > whole thing) would be to resolve each single format character with
> > strftime, decode according to the locale and re-assemble the result
> > string piece by piece. Doh!
>
> That's along the lines of what I was thinking.  strftime already does
> some of this to support %[zZ].
>
> But now that I look at time.strftime in py3k, it's converting the entire
> unicode string to a char string with PyUnicode_AsString, then converting
> back with PyUnicode_Decode.

Looks wrong to me, too... :-)

nd
-- 
$_=q?tvc!uif)%*|#Bopuifs!A`#~tvc!Xibu)%*|qsjou#Kvtu!A`#~tvc!KBQI!)*|~
tvc!ifmm)%*|#Qfsm!A`#~tvc!jt)%*|(Ibdlfs(~  # What the hell is JAPH? ;
@_=split/\s\s+#/;$_=(join''=>map{chr(ord(  # André Malo ;
$_)-1)}split//=>$_[0]).$_[1];s s.*s$_see;  #  http://www.perlig.de/ ;


Re: [Python-Dev] How best to handle the docs for a renamed module?

2008-05-16 Thread André Malo
* M.-A. Lemburg wrote: 


> On 2008-05-12 04:34, Brett Cannon wrote:
> > For the sake of argument, let's consider the Queue module. It is now
> > named queue. For 2.6 I plan on having both Queue and queue listed in
> > the index, with Queue deprecated with instructions to use the new
> > name.
> >
> > But what to do about all the references. Should we leave them pointing
> > at Queue to lessen confusion for people who read about some module on
> > some other site that isn't using the new name, or update everything in
> > 2.6 to use the new name?
>
> How hard would it be to add a redirects from the old pages to the
> new ones ?
>
> mod_rewrite does wonders - well, provided you find the right patterns...

The "pattern" can be a simple text file maintained in subversion::

  oldurl newurl
  ...

And then you can use RewriteMap to get that into Apache.

nd


Re: [Python-Dev] Possible bug in re module?

2008-05-20 Thread André Malo
* Dmitry Vasiliev wrote:

> I've just found a strange re behavior:
>  >>> import re
>  >>> re.sub("(?:ab|b|a)", "+", "cbacbabcabc")
>
> 'c++c++c+c'
>
>  >>> re.sub("(?:ab|b|a){2}", "+", "cbacbabcabc")
>
> 'c+c+c+c'
>
> In the last case the |-separated expressions don't seem to be tried from
> left to right. Is it a bug or just me?

Looks fine to me.

(Although, if I understand that correctly, the user list would be more
appropriate).
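
Why the result is fine can be seen by listing what each match of the
{2} pattern actually consumed. The alternatives are tried from left to
right on every repetition, but the engine backtracks when a later
repetition cannot match:

```python
import re

text = "cbacbabcabc"

single = re.sub("(?:ab|b|a)", "+", text)     # 'c++c++c+c'
double = re.sub("(?:ab|b|a){2}", "+", text)  # 'c+c+c+c'

# The substrings consumed by the {2} pattern, in order.
spans = [m.group() for m in re.finditer("(?:ab|b|a){2}", text)]
# ['ba', 'bab', 'ab']
```

In the last match "ab" is tried first, but then nothing is left for the
second repetition, so the engine falls back to "a" followed by "b".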

nd
-- 
"Das Verhalten von Gates hatte mir bewiesen, dass ich auf ihn und seine
beiden Gefährten nicht zu zählen brauchte" -- Karl May, "Winnetou III"

Im Westen was neues: 


Re: [Python-Dev] Proposal: add odict to collections

2008-06-14 Thread André Malo
* Armin Ronacher wrote:

> Some reasons why ordered dicts are a useful feature:
>
>   - in XML/HTML processing it's often desired to keep the attributes of
> an tag ordered during processing.  So that input ordering is the
> same as the output ordering.
>
>   - Form data transmitted via HTTP is usually ordered by the position
> of the input/textarea/select field in the HTML document.  That
> information is currently lost in most Python web applications /
> frameworks.
>
>   - Easier transition of code from Ruby/PHP which have sorted
> associative arrays / hashmaps.
>
>   - Having an ordered dict in the standard library would allow other
> libraries support them.  For example a PHP serializer could return
> odicts rather than dicts which drops the ordering information.
> XML libraries such as etree could add support for it when creating
> elements or return attribute dicts.

I find this collection of cases pretty weak as an argument for implementing 
that in the stdlib. A lot of special purpose types would fit into such 
reasoning, but do you want to have all of them maintained here?

nd
-- 
Da fällt mir ein, wieso gibt es eigentlich in Unicode kein
"i" mit einem Herzchen als Tüpfelchen? Das wär sooo süüss!

 -- Björn Höhrmann in darw


Re: [Python-Dev] Proposal: add odict to collections

2008-06-15 Thread André Malo
* Guido van Rossum wrote:

> On Sat, Jun 14, 2008 at 4:57 PM, André Malo <[EMAIL PROTECTED]> wrote:
> > * Armin Ronacher wrote:
> >> Some reasons why ordered dicts are a useful feature:
> >>
> >>   - in XML/HTML processing it's often desired to keep the attributes
> >> of an tag ordered during processing.  So that input ordering is the
> >> same as the output ordering.
> >>
> >>   - Form data transmitted via HTTP is usually ordered by the position
> >> of the input/textarea/select field in the HTML document.  That
> >> information is currently lost in most Python web applications /
> >> frameworks.
> >>
> >>   - Easier transition of code from Ruby/PHP which have sorted
> >> associative arrays / hashmaps.
> >>
> >>   - Having an ordered dict in the standard library would allow other
> >> libraries support them.  For example a PHP serializer could return
> >> odicts rather than dicts which drops the ordering information.
> >> XML libraries such as etree could add support for it when creating
> >> elements or return attribute dicts.
> >
> > I find this collection of cases pretty weak as an argument for
> > implementing that in the stdlib. A lot of special purpose types would
> > fit into such reasoning, but do you want to have all of them maintained
> > here?
>
> No, but an ordered dict happens to be a *very* common thing to need,
> for a variety of reasons. So I'm +0.5 on adding this to the
> collections module. However someone needs to contribute working code.
> It would also be useful to verify that it actually fulfills the needs
> of some actual use case. Perhaps looking at how Django uses its
> version would be helpful.

FWIW, I work a lot in the contexts described above and have never needed
ordered dicts so far (what do I have to do in order to need them?). I've
found myself implementing, for example, multi-value dicts instead, several
times.
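
For readers who have not met one: a multi-value dict keeps every value
submitted under a key, as HTML forms and query strings allow. A minimal
sketch; the class and method names are hypothetical and this is not the
implementation referred to above:

```python
class MultiValueDict(object):
    """Hypothetical mapping that keeps all values added per key."""

    def __init__(self):
        self._data = {}

    def add(self, key, value):
        # Append instead of overwriting an earlier value.
        self._data.setdefault(key, []).append(value)

    def get(self, key, default=None):
        # First value wins for callers that expect a single one.
        values = self._data.get(key)
        return values[0] if values else default

    def getlist(self, key):
        # Return a copy so callers cannot mutate the stored list.
        return list(self._data.get(key, ()))
```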

nd
-- 
Real programmers confuse Christmas and Halloween because
DEC 25 = OCT 31.  -- Unknown

  (found in ssl_engine_mutex.c)


Re: [Python-Dev] urllib.quote and unquote - Unicode issues

2008-07-12 Thread André Malo
* Matt Giuca wrote:

> Well from what I've seen, the only time Latin-1 naturally appears on the
> net is when you have a web page in Latin-1 (either explicit or inferred;
> and note that a browser like Firefox will infer Latin-1 if it sees only
> ASCII characters) with a form in it. Submitting the form, the browser
> will use Latin-1 to percent-encode the query string.

This POV is way too browser-centric...

> So if you write a web app and you don't have any non-ASCII characters or
> mention the charset, chances are you'll get Latin-1. But I would argue
> you're leaving things to chance and you deserve to get funny behaviour.
> If you do any of the following:
>
>- Use a non-ASCII character, encoded as UTF-8 on the page.
>- Send a Content-Type: ; charset=utf-8.
>- In HTML, set a  />.
>- In the form itself, set .
>
> then the browser will encode the form data as UTF-8. And most "proper"
> web pages should get themselves explicitly served as UTF-8.

... because

1) URL encoding is not limited to web forms at all

2) The web form encoding depends on the browser settings as well (for 
example, try playing around with the internet explorer settings regarding 
query encoding)

3) The process submitting the form may not be a browser at all

4) The web form may not be under your own control (Search engine forms are a 
common example here, e.g. "put this google form snippet onto your webpage")

5) Different cultures do not necessarily choose between latin-1 and utf-8.
They deal more with things like, say, KOI8-R or Big5.

etc.
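
Point 5 can be made concrete with a short sketch. It uses the encoding
parameter of urllib.parse.quote and unquote as found in current
Python 3, which postdates this thread; the sample word is arbitrary:

```python
from urllib.parse import quote, unquote

word = "\u043f\u0440\u0438\u0432\u0435\u0442"  # a Russian word

as_utf8 = quote(word, encoding="utf-8")
as_koi8 = quote(word, encoding="koi8-r")

# The same characters yield different octets, hence different URLs.
same_url = (as_utf8 == as_koi8)

# Decoding with the encoding that was used restores the word ...
restored = unquote(as_koi8, encoding="koi8-r")

# ... while assuming UTF-8 for KOI8-R octets does not.
guessed = unquote(as_koi8)
```

A receiver that sees only the percent-encoded octets has no way to tell
which of the two encodings the sender had in mind.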

Besides all that, and without any offense: "most proper" and "should do" and
the implication that all web browsers behave the same way are not a good
position to argue from when talking about implementing a standard ;)

nd
-- 
Wenn nur Ingenieure mit Diplom programmieren würden, hätten wir
wahrscheinlich weniger schlechte Software.
Wir hätten allerdings auch weniger gute Software.
   -- Felix von Leitner in dasr


Re: [Python-Dev] urllib.quote and unquote - Unicode issues

2008-07-13 Thread André Malo
* Matt Giuca wrote:

> > This POV is way too browser-centric...
>
> This is but one example. Note that I found web forms to be the least
> clear-cut example of choosing an encoding. Most of the time applications
> seem to be using UTF-8, and all the standards I have read are moving
> towards specifying UTF-8 (from being unspecified). I've never seen a
> standard specify or even recommend Latin-1.

Ahem. The HTTP standard does ;-)

> Where web forms are concerned, basically setting the form accept-charset
> or the page charset is the *maximum amount* of control you have over the
> encoding. As you say, it can be encoded by another page or the user can
> override their settings. Then what can you do as the server? Nothing ...

Guessing works pretty well in most of the cases.

> Exactly. This is exactly my point - Latin-1 is arbitrary from a standards
> point of view. It's just one of the many legacy encodings we'd like to
> forget. The UTFs are the only options which support all languages, and
> UTF-8 is the only ASCII-compatible (and therefore URI-compatible)
> encoding. So we should aim to support that as the default.

Latin-1 is not exactly arbitrary. Besides being a charset, it maps
one-to-one to octet values; hence it's commonly used to encode octets and
is therefore a better fallback than any other encoding.
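
The one-to-one property is easy to check. A small sketch in Python 3
syntax, with variable names chosen only for illustration:

```python
all_octets = bytes(range(256))

# Every octet decodes to the code point with the same number ...
as_text = all_octets.decode("latin-1")
code_points = [ord(ch) for ch in as_text]

# ... and encoding gives the original octets back, without loss.
round_trip = as_text.encode("latin-1")

# UTF-8, in contrast, rejects arbitrary octet sequences.
try:
    all_octets.decode("utf-8")
    utf8_accepts_everything = True
except UnicodeDecodeError:
    utf8_accepts_everything = False
```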

> I agree. However if there *was* a proper standard we wouldn't have to
> argue! "Most proper" and "should do" is the most confident we can be when
> dealing with this standard, as there is no correct encoding.

Well, the standard says there are octets to be encoded. I find that proper
enough.

> Does anyone have a suggestion which will be more compatible with the rest
> of the world than allowing the user to select an encoding, and defaulting
> to "utf-8"?

Default to latin-1 for decoding and utf-8 for encoding. This might be 
confusing though, so maybe you've asked the wrong question ;)

nd
-- 
Real programmers confuse Christmas and Halloween because
DEC 25 = OCT 31.  -- Unknown

  (found in ssl_engine_mutex.c)


Re: [Python-Dev] urllib.quote and unquote - Unicode issues

2008-07-30 Thread André Malo
[I was pretty busy these days, so sorry for jumping in late again]

* Matt Giuca wrote: 

> 1. Leave it as it is. quote is Latin-1 if range(0,256), fallback to
> UTF-8. unquote is Latin-1.
> In favour: Anybody who doesn't reply to this thread
> Pros: Already implemented; some existing code depends upon ord values
> of string being the same as they were for byte strings; possible to
> hack around it.
> Cons: unquote is not inverse of quote; quote behaviour
> internally-inconsistent; garbage when unquoting UTF-8-encoded URIs.

> 2. Default to UTF-8.
> In favour: Matt Giuca, Brett Cannon, Jeroen Ruigrok van der Werven
> Pros: Fully working and tested solution is implemented; recommended by
> RFC 3986 for all future schemes; recommended by W3C for use with HTML;
> UTF-8 used by all major browsers; supports all characters; most
> existing code compatible by default; unquote is inverse of quote.
> Cons: By default, URIs may have invalid octet sequences (not possible
> to reverse).

Con: URI encoding does not encode characters.

>
> 3. quote default to UTF-8, unquote default to Latin-1.
> In favour: André Malo
> Pros: quote able to handle all characters; unquote able to handle all
> sequences. Cons: unquote is not inverse of quote; totally inconsistent.

I'm not in favour of that. I merely answered a question there ;)

I'm actually in favour of encoding bytes only, in both directions. A useful
extension would be *another* function which wraps quote/unquote and encodes
and decodes characters.
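
A sketch of that layering, using quote_from_bytes and unquote_to_bytes
from urllib.parse as they exist in current Python 3 (they were still
under discussion when this was written). The wrapper names quote_text and
unquote_text are hypothetical, and the encoding is deliberately a
required argument rather than a default:

```python
from urllib.parse import quote_from_bytes, unquote_to_bytes


def quote_text(text, encoding):
    """Hypothetical wrapper: characters -> octets -> percent-encoding."""
    return quote_from_bytes(text.encode(encoding))


def unquote_text(quoted, encoding):
    """Hypothetical wrapper: percent-encoding -> octets -> characters."""
    return unquote_to_bytes(quoted).decode(encoding)


# The byte-level functions round-trip any octet sequence, including
# ones that are not valid text in any particular encoding.
raw = b"\xff\x00abc"
quoted = quote_from_bytes(raw)
```

With this split the byte-level pair stays an exact inverse, and the
choice of character encoding is made explicitly by the caller.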


> 4. quote accepts either bytes or str, unquote default to outputting
> bytes unless given an encoding argument.
> In favour: Bill Janssen
> Pros: Technically does what the spec says, which is treat it as an
> octet encoding.
> Cons: unquote will break most existing code; almost 100% of the time
> people will want it as a string.
>
> 
>
> I'll just comment on #4 since I haven't already. Let's talk about
> quote and unquote separately. For quote, I'm all for letting it accept
> a bytes as well as a str. That doesn't break anything or surprise
> anyone.
>
> For unquote, I think it will break a lot and surprise everyone. I
> think that while this may be "purely" the best option, it's pretty
> silly. I reckon the vast majority of users will be surprised when they
> see it spitting out a bytes object, and all that most people will do
> is decode it as UTF-8. Besides, while you're reading the RFCs as "URLs
> specify a method for encoding octet sequences", I'm reading them as
> "URLs specify a method for encoding strings, and leave the character
> encoding unspecified." The second reading supports the idea that
> unquote outputs a str.
>
> I'm also recommending we add unquote_to_bytes to do what you suggest
> unquote should do. (So either way we'll get both versions of unquote;
> I'm just suggesting the one called "unquote" do the thing everybody
> expects). But that's less of a priority so I want to commit these
> urgent fixes first.
>
> I'm basically saying just two things: 1. The standards are undefined;

That's still disputed...

> 2. Therefore we should pick the most useful and/or intuitive default.
> IMHO choosing UTF-8 *is* the most useful AND intuitive, and will be
> more so in the future when more technologies are hard-coded as UTF-8
> (which this RFC recommends they do in the future).

See my suggestion above.

nd


Re: [Python-Dev] urllib.quote and unquote - Unicode issues

2008-08-06 Thread André Malo
* Bill Janssen wrote: 


> > I'm far less concerned about
> > the decision with regards to unquote_to_bytes/quote_from_bytes, as
> > those are new features which can wait.
>
> Forgive me, but those are the *old* features, which must be there.

This whole discussion is going in circles too much, I think. Maybe it should
be turned into a PEP?

nd


Re: [Python-Dev] urllib.quote and unquote - Unicode issues

2008-08-06 Thread André Malo
* Matt Giuca wrote: 

> > This whole discussion circles too much, I think. Maybe it should be
> > pepped?
>
> The issue isn't circular. It's been patched and tested, then a whole lot
> of people agreed including Guido. Then you and Bill wanted the bytes
> functionality back. So I wrote that in there too, and Bill at least said
> that was sufficient.
>
> On Thu, Jul 31, 2008, Bill Janssen wrote:
> > But:  OK, OK, I yield.  Though I still think this is a bad idea, I'll
> > shut up if we can also add "unquote_as_bytes" which returns a byte
> > sequence instead of a string.  I'll just change my code to use that.
>
> We've reached, to quote Guido, "as close as consensus as we can get on
> this issue".

There are a lot of quotes around. Including "After the most recent flurry of 
discussion I've lost track of what's the right thing to do."
But I don't talk for other people.

> There is a bug in Python. I've proposed a working fix, and nobody else
> has.

Well, you proposed a patch ;-)
It may fix things, but it will break a lot. While this was denied over and
over again, it's still gonna happen, because the axioms still do not account
for reality.

> I made all the changes the community suggested. 

I don't think so.

> What more needs to be discussed here?

Huh? You feel the discussion is over? Then why are there still open
questions? I admit, a lot of discussion is triggered by the assessments
you're stating in your posts. Don't take it as a personal offense, it's a
simple observation. A lot of statements were made and nobody even
bothered to substantiate them. A PEP could fix that.

But it's a lost issue now. Nobody comes up with an alternative (for various 
reasons, I suppose). So go ahead, EOD from my side.

nd


Re: [Python-Dev] Attribute error: providing type name

2008-11-30 Thread André Malo
* Christian Heimes wrote:

> Adam Olsen wrote:
> > I'm sure you'll get support for this, unless it's a really
> > inconvenient spot that requires a gross hack to print the type name.
> > Post a patch on the bug tracker.
>
> So far I can see only one argument against the proposed idea: doc tests.
>   The modified exception message would break existing doc tests.

As the exception text is officially not part of the API, I'd say, let them.

nd
-- 
Winnetous Erbe: 


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-04 Thread André Malo
* Adam Olsen wrote: 


> On Thu, Dec 4, 2008 at 1:02 PM, Toshio Kuratomi <[EMAIL PROTECTED]> wrote:
> > I opened up bug http://bugs.python.org/issue4006 a while ago and it was
> > suggested in the report that it's not a bug but a feature and so I
> > should come here to see about getting the feature changed :-)
> >
> > I have a specific problem with os.environ and a somewhat less important
> > architectural issue with the unicode/bytes handling in certain os.*
> > modules.  I'll start with the important one:
> >
> > Currently in python3 there's no way to get at environment variables
> > that are not encoded in the system default encoding.  My understanding
> > is that this isn't a problem on Windows systems but on *nix this is a
> > huge problem.  environment variables on *nix are a sequence of non-null
> > bytes.  These bytes are almost always "characters" but they do not have
> > to be.  Further, there is nothing that requires that the characters be
> > in the same encoding; some of the characters could be in the UTF-8
> > character set while others are in latin-1, shift-jis, or big-5.
>
> Multiple encoding environments are best described as "batshit insane".
>  It's impossible to handle any of it correctly *as text*, which is why
> UTF-8 is becoming a universal standard.  For everybody's sanity python
> should continue to push it.

Here's an example which will become popular soon, I guess: CGI scripts and,
of course, WSGI applications. All those get their environment in an unknown
encoding. In the worst case one can blow up the application by simply
sending strange header lines over the wire. But there's more: consider
running the server in the C locale, then probably even a single 8-bit char
might break something (?).

> However, some pragmatism is also possible.  Many uses of PATH may
> allow it to be treated as black-box bytes, rather than text.  The
> minimal solution I see is to make os.getenv() and os.putenv() switch
> to byte modes when given byte arguments, as os.listdir() does.  This
> use case doesn't require the ability to iterate over all environment
> variables, as os.environb would allow.
>
> I do wonder if controlling the environment given to a subprocess
> requires os.environb, but it may be too obscure to really matter.

IMHO, environment variables are not text. They are bytes by definition and
should be treated as such.
I know, Windows has Unicode-enabled env vars on demand, but those cause
nothing but trouble over in Apache's httpd (when passing
them to CGI scripts, oh well...).
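
A sketch of what treating them as bytes can look like. It relies on the
"surrogateescape" error handler, which was added to Python 3 after this
thread (as was os.environb, the bytes view of the environment on POSIX
systems). The helper names are hypothetical, and UTF-8 is merely an
assumed locale encoding:

```python
def env_bytes_to_text(raw, encoding="utf-8"):
    # Undecodable octets become lone surrogates instead of an error.
    return raw.decode(encoding, "surrogateescape")


def env_text_to_bytes(text, encoding="utf-8"):
    # The inverse step restores the original octets exactly.
    return text.encode(encoding, "surrogateescape")


# A value that is Latin-1, not UTF-8, as an attacker or a foreign
# system might supply it.
raw_value = b"caf\xe9"
as_text = env_bytes_to_text(raw_value)
restored = env_text_to_bytes(as_text)
```

The text form is only a carrier here; the octets survive the round trip
even though they were never valid in the assumed encoding.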

nd


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-04 Thread André Malo
* Adam Olsen wrote: 

> On Thu, Dec 4, 2008 at 2:09 PM, André Malo <[EMAIL PROTECTED]> wrote:

> > Here's an example which will become popular soon, I guess: CGI scripts
> > and, of course WSGI applications. All those get their environment in an
> > unknown encoding. In the worst case one can blow up the application by
> > simply sending strange header lines over the wire. But there's more:
> > consider running the server in C locale, then probably even a single 8
> > bit char might break something (?).
>
> I think that's an argument that the framework should reencode all
> input text into the correct system encoding before passing it on to
> the CGI script or WSGI app.  If the framework doesn't have a clear way
> to determine the client's encoding then it's all just gibberish
> anyway.  A HTTP 400 or 500 range error code is appropriate here.

Duh.
See, you're already mixing different encodings and creating issues here!
You're talking about the client encoding (whatever that is) and the correct
system encoding (whatever that is, too) in the same paragraph and assuming
they are the same or compatible.

There are several points here:

- there is no clear way to get a single client encoding for the whole HTTP 
  transaction (headers + body), because *there is none*. If the whole 
  header set matches the same encoding, it's more or less luck.

- there is no correct system encoding either. As said, I prefer running my 
  servers in C locale, so it's all ascii. In fact, it shouldn't matter. The 
  locale should not have anything to do with an application called over the 
  network.

- A 400 or 500 response for a header containing something like my name is 
  not appropriate.

- Octets in HTTP headers are allowed. And they are what they are -
  octets. The interpretation has to be left to the application, not the 
  framework.
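
To illustrate the last point, here is a minimal sketch of an application that recovers the header octets and decides on the interpretation itself. It assumes a PEP 3333 style WSGI environ (which postdates this thread), where header values are str objects that map one-to-one onto the original bytes via ISO-8859-1. The helper names and the HTTP_X_AUTHOR key are made up for the example.

```python
def header_octets(environ, key):
    """Return the raw octets of a header value, or None if it is absent.

    Assumption: a PEP 3333 style environ, where each character of the str
    value corresponds to exactly one original byte (ISO-8859-1).
    """
    value = environ.get(key)
    if value is None:
        return None
    return value.encode("latin-1")


def header_text(environ, key, encodings=("utf-8",)):
    """Decode a header the way this (hypothetical) application prefers."""
    raw = header_octets(environ, key)
    if raw is None:
        return None
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte to a code point, so this fallback cannot fail.
    return raw.decode("latin-1")


# The same name, sent once as UTF-8 and once as ISO-8859-1.
utf8_environ = {"HTTP_X_AUTHOR": "André".encode("utf-8").decode("latin-1")}
latin1_environ = {"HTTP_X_AUTHOR": "André".encode("latin-1").decode("latin-1")}
```

Neither request needs a 400 here; the application simply tries the encodings it considers plausible, in its own order.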


>
> >> However, some pragmatism is also possible.  Many uses of PATH may
> >> allow it to be treated as black-box bytes, rather than text.  The
> >> minimal solution I see is to make os.getenv() and os.putenv() switch
> >> to byte modes when given byte arguments, as os.listdir() does.  This
> >> use case doesn't require the ability to iterate over all environment
> >> variables, as os.environb would allow.
> >>
> >> I do wonder if controlling the environment given to a subprocess
> >> requires os.environb, but it may be too obscure to really matter.
> >
> > IMHO, environment variables are not text. They are bytes by definition
> > and should be treated as such.
> > I know, Windows has Unicode-enabled env vars on demand, but
> > there's only trouble with those over there in Apache's httpd (when
> > passing them to CGI scripts, oh well...).
>
> Environment variables have textual names, are set via text, frequently

Well, think about my example again. The friendly way to maintain them is not 
the issue. The problems arise at least when the variables are set by an 
attacker.

> contain textual file names or paths, and my shell (bash in
> gnome-terminal on ubuntu) lets me put unicode text in just fine.  The
> underlying APIs may use bytes, but they're *intended* to be encoded
> text.

Yes, encoded text == bytes. No, they're intended to be c-strings. And well,  
even if we assume that they should contain text (as in encoded unicode), 
their meaning is application specific and so is the encoding (even if it's 
mixed).

What I'm saying is: I don't see much use for unicode APIs for the 
environment at all, because I don't know what's in there before inspecting 
them. And apparently the only reliable way to inspect them is via a byte 
oriented API.
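
A rough sketch of what such byte-oriented inspection could look like, assuming a later Python 3 release that provides os.environb and os.supports_bytes_environ (neither existed when this was written). The helper names and the EXAMPLE_GREETING variable are invented for illustration.

```python
import os


def env_bytes(name):
    """Return the raw value of an environment variable as bytes, or None."""
    if os.supports_bytes_environ:
        # POSIX: os.environb exposes the values without decoding them.
        return os.environb.get(os.fsencode(name))
    # Elsewhere (e.g. Windows) only the str view exists; encode it back.
    value = os.environ.get(name)
    return None if value is None else os.fsencode(value)


def env_text(name, encoding="utf-8"):
    """Decode one variable with the encoding this application expects."""
    raw = env_bytes(name)
    if raw is None:
        return None
    # Undecodable bytes stay visible as escapes instead of raising.
    return raw.decode(encoding, errors="backslashreplace")


os.environ["EXAMPLE_GREETING"] = "hello"  # hypothetical variable name
greeting_raw = env_bytes("EXAMPLE_GREETING")
greeting_text = env_text("EXAMPLE_GREETING")
```

The point of the split is that the encoding is an argument chosen per variable by the application, not something derived from the locale.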

nd


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-06 Thread André Malo
* Nick Coghlan wrote:

> Toshio Kuratomi wrote:
> > Note 2: If there isn't a parallel API on all platforms, for instance,
> > Guido's proposal to not have os.environb on Windows, then you'll still
> > have to have a platform specific check. (Likely you should try to
> > access os.environb in this instance and if it doesn't exist, use
> > os.environ instead... and remember that you need to either change
> > os.environ's data into str type or change os.environb's data into byte
> > type.)
>
> Note that this is why I personally think the binary API variants
> *should* exist on Windows, just with the sense of the system encoding
> flipped around.
>
> That is, on *nix:
> - underlying OS API uses bytes
> - binary API just passes values straight through
> - Unicode API uses the system encoding to encode Unicode names and
> values to be passed to the OS API and to decode bytes names and values
> received from the OS API
>
> While on Windows:
> - underlying OS API uses Unicode
> - Unicode API just passes values straight through
> - binary API uses the system encoding to decode bytes names and values
> to be passed to the OS API and to encode Unicode names and values
> received from the OS API

Now that is somewhat strange. That way you'll have two unreliable APIs and 
need to switch depending on the platform again.

nd
-- 
+[>++<-]>++>++[><-]>++.<[>++<-]>+++.--.
+.<
<.>[><-]>---.<+++[><-]>+.+.
+.<+++[><-]>.---.<+++[><-]>
+.<<.>+[>---<-]>+.<+[><-]>+.<+++[><-]>+.--.<<.>++
[>--<-]>.<+[>+<-]>.++..--.<+++[><-]>+.


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-09 Thread André Malo
* M.-A. Lemburg wrote: 


> On 2008-12-09 09:41, Anders J. Munch wrote:
> > On Sun, Dec 7, 2008 at 3:53 PM, Terry Reedy <[EMAIL PROTECTED]> wrote:
>  try:
>      files = os.listdir(somedir, errors='strict')
>  except OSError as e:
>      log()
>      files = os.listdir(somedir)
> >
> > Instead of a codecs error handler name, how about a callback for
> > converting bytes to str?
> >
> > os.listdir(somedir, decoder=bytes.decode)
> > os.listdir(somedir, decoder=lambda b: b.decode(preferredencoding,
> >     errors='xmlcharrefreplace'))
> > os.listdir(somedir, decoder=repr)
> >
> > ISTM that would be simpler and more flexible than going over the
> > codecs registry.  One caveat though is that there's no obvious way of
> > telling listdir to skip a name.  But if the default behaviour for
> > decoder=None is to skip with a warning, then the need to explicitly
> > ask for files to be skipped would be small.
> >
> > Terry's example would then be:
>  try:
>      files = os.listdir(somedir, decoder=bytes.decode)
>  except UnicodeDecodeError as e:
>      log()
>      files = os.listdir(somedir)
>
> Well, this is not too far away from just putting the whole decoding
> logic into the application directly:
>
> files = [filename.decode(filesystemencoding, errors='warnreplace')
>  for filename in os.listdir(dir)]
>
> (or os.listdirb() if that's where the discussion is heading)
>
> ... and that also tells us something about this discussion: we're
> trying to come up with some magic to work around writing two
> lines of Python code.
>
> I'd just have all the os APIs return bytes and leave whatever
> conversion to Unicode might be necessary to a higher level API.

[...]

What I'm saying ;-)

+1.
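
For what it's worth, the "bytes from the os API, decoding in the application" approach can be sketched like this. It relies on os.listdir() returning bytes when given a bytes path; the function name listdir_text and the choice of the 'replace' error handler are assumptions made for the example, not part of any proposal in this thread.

```python
import os
import sys
import tempfile


def listdir_text(path, errors="replace"):
    """List a directory as bytes, then decode each name in the application."""
    encoding = sys.getfilesystemencoding()
    raw_names = os.listdir(os.fsencode(path))  # bytes in, bytes out
    return [name.decode(encoding, errors) for name in raw_names]


# Small demonstration on a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "example.txt"), "w"):
        pass
    names = listdir_text(tmp)
```

An application that would rather skip or log undecodable names can swap the list comprehension for a loop without any change to the os layer.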

nd


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-12 Thread André Malo
* Adam Olsen wrote: 

> UTF-8 in percent encodings is becoming a defacto standard.  Otherwise
> the browser has to display the percent escapes in the address bar,
> rather than the intended text.

Duh! The address bar should contain the URL, which *is* the intended text. 
The escapes are there for a reason. If I pass some octets using percent 
escapes via the query string or request body, it's not text, not even 
intended. It's still a collection of octets. Translating them back (and 
forth when I press enter in the address bar) is a pretty ambiguous 
operation and therefore pretty wrong.

The defacto standard does not exist. There's a real one instead: RFC 2396.
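
A small illustration of the ambiguity, using urllib.parse.unquote_to_bytes from the Python 3 standard library. The escaped value is a made-up example; the two decodings are just two of many possible guesses.

```python
from urllib.parse import unquote_to_bytes

escaped = "caf%C3%A9"               # hypothetical query value
octets = unquote_to_bytes(escaped)  # just bytes, no encoding implied

# Two equally "valid" interpretations of the very same octets.
as_utf8 = octets.decode("utf-8")
as_latin1 = octets.decode("latin-1")
```

Both decodes succeed, and nothing in the escapes themselves says which one the sender meant.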

nd


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-12 Thread André Malo
* Adam Olsen wrote:

> On Fri, Dec 12, 2008 at 2:11 AM, André Malo  wrote:
> > * Adam Olsen wrote:
> >> UTF-8 in percent encodings is becoming a defacto standard.  Otherwise
> >> the browser has to display the percent escapes in the address bar,
> >> rather than the intended text.
> >
> > Duh! The address bar should contain the URL, which *is* the intended
> > text. The escapes are there for a reason. If I pass some octets using
> > percent escapes via the query string or request body, it's not text,
> > not even intended. It's still a collection of octets. Translating them
> > back (and forth when I press enter in the address bar) is a pretty
> > ambiguous operation and therefore pretty wrong.
> >
> > The defacto standard does not exist. There's a real one instead: RFC
> > 2396.
>
> All the heaps of people using non-english wikipedia sites might
> disagree with you.  There's only, what, a few *million* pages that
> would be affected?

I'm not sure what you're trying to pull here. Is that supposed to be an 
argument? There's no page affected at all. It's a browser UI issue, not a 
page issue.

And even if it were interesting at all how the URL escapes are displayed in 
the address bar, those millions of people would favour KOI8-R or Big5 
over UTF-8 if you asked them.

Which leads to the exact point: The browser cannot know, nor should it even. 
It's opaque. The only entity which needs to understand the encoding of URL 
percent escapes in query or request body is the *server* selecting the 
resource.

But I'm sure I'm not telling you any news here.

nd
-- 
"Das Verhalten von Gates hatte mir bewiesen, dass ich auf ihn und seine
beiden Gefährten nicht zu zählen brauchte" -- Karl May, "Winnetou III"

Im Westen was neues: <http://pub.perlig.de/books.html#apache2>


Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-12 Thread André Malo
* Adam Olsen wrote:

> On Fri, Dec 12, 2008 at 9:47 PM, André Malo  wrote:
> > * Adam Olsen wrote:
> >> On Fri, Dec 12, 2008 at 2:11 AM, André Malo  wrote:
> >> > * Adam Olsen wrote:
> >> >> UTF-8 in percent encodings is becoming a defacto standard. 
> >> >> Otherwise the browser has to display the percent escapes in the
> >> >> address bar, rather than the intended text.
> >> >
> >> > Duh! The address bar should contain the URL, which *is* the intended
> >> > text. The escapes are there for a reason. If I pass some octets
> >> > using percent escapes via the query string or request body, it's not
> >> > text, not even intended. It's still a collection of octets.
> >> > Translating them back (and forth when I press enter in the address
> >> > bar) is a pretty ambiguous operation and therefore pretty wrong.
> >> >
> >> > The defacto standard does not exist. There's a real one instead: RFC
> >> > 2396.
> >>
> >> All the heaps of people using non-english wikipedia sites might
> >> disagree with you.  There's only, what, a few *million* pages that
> >> would be affected?
> >
> > I'm not sure what you're trying to pull here. Is that supposed to be an
> > argument? There's no page affected at all. It's a browser UI issue, not
> > a page issue.
> >
> > And even if it were interesting at all how the URL escapes are
> > displayed in the address bar, those millions of people would favour
> > KOI8-R or Big5 over UTF-8 if you asked them.
> >
> > Which leads to the exact point: The browser cannot know, nor should it
> > even. It's opaque. The only entity which needs to understand the
> > encoding of URL percent escapes in query or request body is the
> > *server* selecting the resource.
> >
> > But I'm sure I'm not telling you any news here.
>
> You're arguing that text should be an opaque entity..

No, actually I'm not. I'm arguing that escapes are opaque.

> We've wasted enough of everybody's time on this already, I'm not going
> to continue on this thread. 

Agreed.

nd
-- 
Da fällt mir ein, wieso gibt es eigentlich in Unicode kein
"i" mit einem Herzchen als Tüpfelchen? Das wär sooo süüss!

 -- Björn Höhrmann in darw


Re: [Python-Dev] Add a frozendict builtin type

2012-03-01 Thread André Malo
On Wednesday 29 February 2012 20:17:05 Raymond Hettinger wrote:
> On Feb 27, 2012, at 10:53 AM, Victor Stinner wrote:
> > A frozendict type is a common request from users and there are various
> > implementations.
>
> ISTM, this request is never from someone who has a use case.
> Instead, it almost always comes from "completers", people
> who see that we have a frozenset type and think the core devs
> missed the ObviousThingToDo(tm).  Frozendicts are trivial to
> implement, so that is why there are various implementations
> (i.e. the implementations are more fun to write than they are to use).
>
> The frozenset type covers a niche case that is nice-to-have but
> *rarely* used.  Many experienced Python users simply forget
> that we have a frozenset type.  We don't get bug reports or
> feature requests about the type.  When I do Python consulting
> work, I never see it in a client's codebase.  It does occasionally
> get discussed in questions on StackOverflow but rarely gets
> offered as an answer (typically on variants of the "how do you
> make a set-of-sets" question).  If Google's codesearch were still
> alive, we could add another datapoint showing how infrequently
> this type is used.


Here are my real-world use cases. Not for security, but for safety and 
performance reasons (I've built my own RODict and ROList modeled after 
dictproxy):

- Global, but immutable containers, e.g. as class members

- Caching. My data container objects (say, resultsets from a db or something) 
  usually inherit from list or dict (sometimes also set) and are cached 
  heavily. In order to ensure that they are not modified (accidentally), I 
  have two choices: deepcopy or immutability. deepcopy is so expensive that 
  it's often cheaper to just leave out the cache. So I use immutability. (oh 
  well, the objects are further restricted with __slots__)

I agree, these are not general purpose issues, but they are not *rare*, I'd 
think.
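
For illustration only, a read-only dict along these lines might look like the sketch below. This is not the actual RODict mentioned above (that one is modeled after dictproxy); it is a plain dict subclass whose mutating methods raise, and the Settings class is a made-up example of the "class member" use case.

```python
class RODict(dict):
    """A dict whose mutating methods raise; a safety net, not a sandbox."""

    __slots__ = ()  # no per-instance __dict__ to stash attributes in

    def _readonly(self, *args, **kwargs):
        raise TypeError("RODict is read-only")

    __setitem__ = __delitem__ = _readonly
    pop = popitem = clear = setdefault = update = _readonly
    __ior__ = _readonly  # also block the in-place merge operator


class Settings:
    # Global, but immutable container as a class member.
    DEFAULTS = RODict(blah=1, blub=2)
```

Since it still is a regular dict underneath, dict.__setitem__(Settings.DEFAULTS, ...) gets around it; it only guards against accidents.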

nd


Re: [Python-Dev] Add a frozendict builtin type

2012-03-01 Thread André Malo
On Thursday 01 March 2012 14:07:10 Victor Stinner wrote:
> > Here are my real-world use cases. Not for security, but for safety and
> > performance reasons (I've built my own RODict and ROList modeled after
> > dictproxy):
> >
> > - Global, but immutable containers, e.g. as class members
>
> I attached type_final.patch to the issue #14162 to demonstrate how
> frozendict can be used to implement a "read-only" type. Last version:
> http://bugs.python.org/file24696/type_final.patch

Oh, hmm. I rather meant something like that:

"""
class Foo:
    some_mapping = frozendict(
        blah=1, blub=2
    )

or as a variant:

def zonk(some_default=frozendict(...)):
    ...

or simply a global object:

baz = frozendict(some_immutable_mapping)
"""

I'm not sure about your final types. I'm using __slots__ = () for such things 
(?)

nd


Re: [Python-Dev] Add a frozendict builtin type

2012-03-01 Thread André Malo
On Thursday 01 March 2012 15:17:35 Serhiy Storchaka wrote:
> 01.03.12 11:29, André Malo wrote:
> > - Caching. My data container objects (say, resultsets from a db or
> > something) usually inherit from list or dict (sometimes also set) and are
> > cached heavily. In order to ensure that they are not modified
> > (accidentally), I have two choices: deepcopy or immutability. deepcopy is
> > so expensive that it's often cheaper to just leave out the cache. So I
> > use immutability. (oh well, the objects are further restricted with
> > __slots__)
>
> This is the first rational use of frozendict that I see. However, a deep
> copy is still necessary to create the frozendict. For this case, I
> believe, would be better to "freeze" dict inplace and then copy-on-write
> it.

In my case it's actually a half one. The data mostly comes from memcache ;) 
I'm populating the object and then I'm done with it. People wanting to modify 
it, need to copy it, yes. OTOH usually a shallow copy is enough (here).

Funnily my ROList actually provides a "sorted" method instead of "sort" in 
order to create a sorted copy of the list.
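
Roughly like this, as a simplified sketch rather than the real class (the set of blocked methods here is an assumption):

```python
class ROList(list):
    """Read-only list offering sorted() (a copy) instead of sort()."""

    __slots__ = ()

    def _readonly(self, *args, **kwargs):
        raise TypeError("ROList is read-only")

    append = extend = insert = remove = pop = clear = _readonly
    reverse = sort = _readonly
    __setitem__ = __delitem__ = __iadd__ = __imul__ = _readonly

    def sorted(self, key=None, reverse=False):
        """Return a sorted copy; the original order stays untouched."""
        # Inside the method body, sorted() refers to the builtin.
        return type(self)(sorted(self, key=key, reverse=reverse))


rows = ROList([3, 1, 2])
ordered = rows.sorted()
```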

nd


Re: [Python-Dev] Add a frozendict builtin type

2012-03-01 Thread André Malo
On Thursday 01 March 2012 15:54:01 Victor Stinner wrote:

> > I'm not sure about your final types. I'm using __slots__ = () for such
> > things
>
> You can still replace an attribute value if a class defines __slots__:
> >>> class A:
> ...   __slots__=('x',)
> ...   x = 1
> ...
> >>> A.x=2
> >>> A.x
> 2

Ah, ok, I missed that. It should be fixable with a metaclass. Not very nicely, 
though.
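
A sketch of what I mean, in Python 3 syntax. The FinalMeta name is made up, and this is only one way to do it:

```python
class FinalMeta(type):
    """Metaclass that refuses to rebind or delete class attributes."""

    def __setattr__(cls, name, value):
        raise AttributeError(
            f"cannot set {name!r} on final class {cls.__name__}"
        )

    def __delattr__(cls, name):
        raise AttributeError(
            f"cannot delete {name!r} on final class {cls.__name__}"
        )


class A(metaclass=FinalMeta):
    __slots__ = ()  # instances get no __dict__ either
    x = 1
```

The "not very nicely" part: type.__setattr__(A, 'x', 2) still goes through, so again this only catches accidents.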

nd


Re: [Python-Dev] Add a frozendict builtin type

2012-03-01 Thread André Malo
* Serhiy Storchaka wrote:

> 01.03.12 16:47, André Malo wrote:
> > On Thursday 01 March 2012 15:17:35 Serhiy Storchaka wrote:
> >> This is the first rational use of frozendict that I see. However, a
> >> deep copy is still necessary to create the frozendict. For this case,
> >> I believe, would be better to "freeze" dict inplace and then
> >> copy-on-write it.
> >
> > In my case it's actually a half one. The data mostly comes from
> > memcache ;) I'm populating the object and then I'm done with it. People
> > wanting to modify it, need to copy it, yes. OTOH usually a shallow copy
> > is enough (here).
>
> What if people modify dicts in deep?

That's the "here" part. They can't [1]. These objects are typically ROLists 
of RODicts. Maybe nested deeper, but all RO* or other immutable types.

I cheated, by deepcopying always in the cache, but defining __deepcopy__ for 
those RO* objects as "return self".
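
Stripped down to the trick itself, it looks about like this. The RODict here is only a stand-in with the mutation guards left out, and cache/cache_get are hypothetical names:

```python
import copy


class RODict(dict):
    """Stand-in for a read-only dict (mutation guards omitted here)."""

    __slots__ = ()

    def __deepcopy__(self, memo):
        # Immutable by convention, so sharing the instance is safe.
        return self


cache = {}


def cache_get(key):
    """Always deepcopy on the way out; RO* objects make that a no-op."""
    return copy.deepcopy(cache[key])


cache["frozen"] = RODict(id=1)
cache["plain"] = {"id": 1}
```

So the cache code stays uniform, and only the mutable entries pay for the copy.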

nd

[1] Well, an attacker could, because it's still based on regular dicts and 
lists. But that's why it's not a security feature, but a safety net (here).
-- 
"Solides und umfangreiches Buch"
  -- aus einer Rezension

<http://pub.perlig.de/books.html#apache2>


Re: [Python-Dev] Add a frozendict builtin type

2012-03-01 Thread André Malo
* Guido van Rossum wrote:

> On Thu, Mar 1, 2012 at 9:44 AM, Victor Stinner wrote:
> > frozendict would help pysandbox but also any security Python module,
> > not security, but also (many) other use cases ;-)
>
> Well, let's focus on the other use cases, because to me the sandbox
> use case is too controversial (never mind how confident you are :-).
>
> I like thinking through the cache use case a bit more, since this is a
> common pattern. But I think it would be sufficient there to prevent
> accidental modification, so it should be sufficient to have a dict
> subclass that overrides the various mutating methods: __setitem__,
> __delitem__, pop(), popitem(), clear(), setdefault(), update().

For the caching part, simply making the dictproxy type public would already 
help a lot.

> What other use cases are there?

dicts as keys or as set members. I do run into this from time to time and 
always end up with tuple(sorted(items())) or something like that.
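
Spelled out, the workaround is something like the following. It assumes the keys are mutually orderable and the values hashable; as_key is a made-up helper name.

```python
def as_key(mapping):
    """Turn a dict into a hashable, order-independent key."""
    return tuple(sorted(mapping.items()))


seen = set()
seen.add(as_key({"b": 2, "a": 1}))
seen.add(as_key({"a": 1, "b": 2}))  # same content, different insert order
```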

nd
-- 
s  s^saoaaaoaaoaaaom  a  alataa  aaoat  a  a
a maoaa a laoata  a  oia a o  a m a  o  alaoooat aaool aaoaa
matooololaaatoto  aaa o a  o ms;s;\s;s;g;y;s;:;s;y#mailto: #
 \51/\134\137| http://www.perlig.de #;print;# > n...@perlig.de


Re: [Python-Dev] Making builtins more efficient

2006-03-09 Thread André Malo
* Paul Moore wrote:

> If it *is* possible, I'd say it's worth implementing at least a
> warning sooner rather than later - the practice seems questionable at
> best, and any progress towards outlawing it would help in work on
> optimising builtins.



FWIW, this practice is very handy for unit tests. For example, I often 
shadow builtins like ``file`` for my tests with a mock placed from the 
outside into the global namespace.
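
A minimal sketch of that technique, written for Python 3, where ``open`` takes the place of ``file``. The function under test and the fake are both invented for the example.

```python
import io


def read_first_line(path):
    # `open` is looked up in the module globals first, then in builtins.
    with open(path) as handle:
        return handle.readline().rstrip("\n")


def fake_open(path, *args, **kwargs):
    """Hypothetical mock: canned content instead of touching the disk."""
    return io.StringIO("mocked line\nsecond line\n")


# Place the mock into the module's global namespace from the outside.
read_first_line.__globals__["open"] = fake_open
try:
    result = read_first_line("/no/such/file")
finally:
    # Remove the shadow again so the real builtin is visible afterwards.
    del read_first_line.__globals__["open"]
```

This only works as long as the builtin lookup goes through the module globals at call time, which is exactly what an optimisation of builtins would have to preserve or outlaw.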

nd


-- 
"Das Verhalten von Gates hatte mir bewiesen, dass ich auf ihn und seine
beiden Gefährten nicht zu zählen brauchte" -- Karl May, "Winnetou III"

Im Westen was neues: 


Re: [Python-Dev] Dropping __init__.py requirement for subpackages

2006-04-26 Thread André Malo
* Guido van Rossum wrote:

> So I have a very simple proposal: keep the __init__.py requirement for
> top-level packages, but drop it for subpackages. This should be a
> small change. I'm hesitant to propose *anything* new for Python 2.5,
> so I'm proposing it for 2.6; if Neal and Anthony think this would be
> okay to add to 2.5, they can do so.

Not that it would count in any way, but I'd prefer to keep it. How would I 
mark a subdirectory as "not-a-package" otherwise?

echo "raise ImportError" >__init__.py

?

nd
-- 
Das, was ich nicht kenne, spielt stückzahlmäßig *keine* Rolle.

   -- Helmut Schellong in dclc
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Dropping __init__.py requirement for subpackages

2006-04-26 Thread André Malo
* Guido van Rossum wrote:

> On 4/26/06, André Malo <[EMAIL PROTECTED]> wrote:
> > * Guido van Rossum wrote:
> > > So I have a very simple proposal: keep the __init__.py requirement
> > > for top-level packages, but drop it for subpackages. This should be
> > > a small change. I'm hesitant to propose *anything* new for Python
> > > 2.5, so I'm proposing it for 2.6; if Neal and Anthony think this
> > > would be okay to add to 2.5, they can do so.
> >
> > Not that it would count in any way, but I'd prefer to keep it. How
> > would I mark a subdirectory as "not-a-package" otherwise?
>
> What's the use case for that? Have you run into this requirement? And
> even if you did, was there a requirement that the subdirectory's name
> be the same as a standard library module? If the subdirectory's name
> is not constrained, the easiest way to mark it as a non-package is to
> put a hyphen or dot in its name; if you can't do that, at least name
> it something that you don't need to import.

Actually I have no problem with the change from inside Python, but I do from 
the POV of tools that walk through the directories, collecting/separating 
Python packages and/or supplemental data directories. It's an explicit vs. 
implicit issue, where implicit would mean "kind of heuristics" from now on. 
IMHO it's going to break existing stuff [1] and should at least not be done 
in such a rush.

nd

[1] Well, it does break some of mine ;-)
-- 
Da fällt mir ein, wieso gibt es eigentlich in Unicode kein
"i" mit einem Herzchen als Tüpfelchen? Das wär sooo süüss!

 -- Björn Höhrmann in darw


Re: [Python-Dev] Dropping __init__.py requirement for subpackages

2006-04-26 Thread André Malo
* Guido van Rossum wrote:

[me]
> > Actually I have no problems with the change from inside python, but
> > from the POV of tools, which walk through the directories,
> > collecting/separating python packages and/or supplemental data
> > directories. It's an explicit vs. implicit issue, where implicit would
> > mean "kind of heuristics" from now on. IMHO it's going to break
> > existing stuff [1] and should at least not be done in such a rush.
> >
> > nd
> >
> > [1] Well, it does break some of mine ;-)

[Guido]
> Can you elaborate? You could always keep the __init__.py files, you
> know...

Okay, here's an example. It's about a nonexistent __init__.py, though ;-)
I have a test system which collects the test suites from one or more 
packages automatically by walking through the tree. Now there are 
subdirectories which explicitly are not packages (no __init__.py), but do 
contain some python files (helper scripts, spawned for particular tests). 
The test collector currently doesn't consider these subdirectories at all, 
but in future it would need to (because it should search like Python itself).
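
To make the collector's current rule concrete, here is a simplified sketch (not my actual test system; collect_packages and the directory layout are made up). It treats a directory as a package only if it contains an __init__.py and does not descend into anything else.

```python
import os
import tempfile


def collect_packages(root):
    """Return package directories under root, relative to root."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        if "__init__.py" not in filenames:
            dirnames[:] = []  # explicit non-package: do not descend
            continue
        found.append(os.path.relpath(dirpath, root))
        dirnames.sort()  # deterministic traversal order
    return found


# Layout: pkg/ and pkg/tests/ are packages, pkg/tests/helpers/ is not.
with tempfile.TemporaryDirectory() as tmp:
    root = os.path.join(tmp, "pkg")
    os.makedirs(os.path.join(root, "tests", "helpers"))
    for relative in ("__init__.py", "tests/__init__.py",
                     "tests/helpers/script.py"):
        with open(os.path.join(root, *relative.split("/")), "w"):
            pass
    packages = collect_packages(root)
```

With the proposed change, the helpers directory would have to be considered too, and the explicit marker turns into guesswork.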

Another point is that one can even hide supplementary packages within such a 
subdirectory. It's only visible to scripts inside the dir (I admit, that 
the latter is not a real usecase, just a thought that came up while writing 
this up).

Anyway: sure, one could tweak the naming - just not for existing a.k.a. 
already released stuff. It's not very nice to force that, too ;-)

nd
-- 
If God intended people to be naked, they would be born that way.
  -- Oscar Wilde