Re: [Python-Dev] Use python -s as default shebang for system python executables/daemons

2015-03-18 Thread Toshio Kuratomi
On Wed, Mar 18, 2015 at 12:22:03PM -0400, Barry Warsaw wrote:
> On Mar 18, 2015, at 03:46 PM, Orion Poplawski wrote:
> 
> >We're starting a discussion in Fedora about setting the default shebang for
> >system python executables and/or daemons to python -s or python -Es (or ?).
> 
> We've talked about this in Debian/Ubuntu land and the general consensus is
> that for Python 2, use -Es and for Python 3 use -I (which implies -Es).  I'm
> not sure we're consistent yet in making sure our build tools install these
> switches in our shebangs, but I'm hoping after Jessie we can make some
> progress on that.
> 
Interesting.  I've been cautiously in favor of -s in Fedora, but the more I've
thought about it the less I've liked -E.  It just seems like PYTHONPATH is
analogous to LD_LIBRARY_PATH for C programs and PATH for shell scripting.
We leave both of those available for local admins and users to affect the
behaviour of programs if they need to.
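For instance, the parallel looks something like this in practice (a sketch
with made-up paths):

  # A C program picks up a locally patched shared library:
  $ export LD_LIBRARY_PATH=/home/admin/patched-libs
  $ some-c-program

  # A Python program picks up a locally patched module the same way,
  # unless -E in its shebang tells it to ignore PYTHONPATH:
  $ export PYTHONPATH=/home/admin/patched-modules
  $ some-python-script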

Was there some discussion of -E specifically in Debian where it was
consciously decided that PYTHONPATH was not analogous, or that the
benefit/risk tradeoff was different from that of those other env vars?

-Toshio




Re: [Python-Dev] Use python -s as default shebang for system python executables/daemons

2015-03-19 Thread Toshio Kuratomi
On Wed, Mar 18, 2015 at 2:56 PM, Barry Warsaw  wrote:
> On Mar 18, 2015, at 02:44 PM, Toshio Kuratomi wrote:
>
>>Interesting.  I've been cautiously in favor of -s in Fedora, but the more I've
>>thought about it the less I've liked -E.  It just seems like PYTHONPATH is
>>analogous to LD_LIBRARY_PATH for C programs and PATH for shell scripting.
>>We leave both of those available for local admins and users to affect the
>>behaviour of programs if they need to.
>
> It is, and it isn't.  It's different because you can always explicitly
> override the shebang line if needed.  So if a local admin really needed to
> override $PYTHONPATH (though I can't come up with a use case right now), they
> could just:
>
> $ python3 -s /usr/bin/foo
>
I could see that as a difference.  However, the environment variables
give users the ability to change things globally whereas overriding
the shebang line is case-by-case, so it's not a complete replacement
for the functionality.

LD_LIBRARY_PATH can be used for things like logging all calls to a
specific function, applying a bugfix to a library when you don't have
root on the box, evaluating how a potential replacement for a system
library will affect the whole system, and other things that are
supposed to affect a large number of the files on the OS.  PYTHONPATH
can be used for the same purposes as long as -E is not embedded into
the shebang lines. (I suppose you could write a "python" wrapper
script that discards -E... but you'd need root on the box to install
your wrapper [since system packages are encouraged to use the full
path to python rather than env python] and the change would affect
everyone on the box rather than just the person setting the env var).
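The effect of embedding -E is easy to see from a shell (a quick sketch; the
directory name is made up):

  $ export PYTHONPATH=/home/me/patched-libs
  $ python -c 'import sys; print("/home/me/patched-libs" in sys.path)'
  True
  $ python -E -c 'import sys; print("/home/me/patched-libs" in sys.path)'
  False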

Using -E by default for all system applications would prevent that.
The benefit of -E is that it isolates the effects of PYTHONPATH to
only specific programs.  However, that benefit doesn't seem as
striking as it first appears (or at least, as it first appeared to me
:-).  Unix env vars have their own method of isolation: if the env var
is marked for export then it is sent to child processes.  If it is not
marked for export then it only affects the current process.  So it
seems like -E isn't adding something new; it's just protecting users
from themselves.  That seems contrary to "consenting adults" (although
distributions are separate entities from python-dev ;-).
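In other words, the user already gets to choose the scope (again a sketch
with a made-up path):

  # Affects just this one invocation; nothing is exported:
  $ PYTHONPATH=/home/me/patched-libs ./some-python-app

  # Affects every Python program started from this shell:
  $ export PYTHONPATH=/home/me/patched-libs
  $ ./some-python-app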

What makes -s different from -E?  If you think about it in the context
of users being able to set PYTHONPATH to add libraries that can
override system packages then I think specifying -s for system
packages establishes a default behaviour: The user can override the
defaults but only if they change the environment.  Without -s, this
expectation is violated for libraries in the user site directory.
With -s, the user would have to add the user site directory to
PYTHONPATH if they want the libraries there to override system
packages.
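A sketch of what that default looks like, with a made-up module name and
illustrative paths:

  # Without -s, a copy in the user site directory silently wins:
  $ python -c 'import foo; print(foo.__file__)'
  /home/me/.local/lib/python2.7/site-packages/foo.py

  # With -s, the system copy wins unless the user opts back in:
  $ python -s -c 'import foo; print(foo.__file__)'
  /usr/lib/python2.7/site-packages/foo.py
  $ PYTHONPATH=~/.local/lib/python2.7/site-packages \
      python -s -c 'import foo; print(foo.__file__)'
  /home/me/.local/lib/python2.7/site-packages/foo.py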

So I guess I'm still leaning towards -E being the wrong choice for
Fedora, but Fedora lives within a broader ecosystem of python-providing
distributions.  So I'm interested in seeing whether Debian thought
about these aspects when they decided on using -E, whether that would
change anyone's mind, and also what other distributions think about
adding -s and/or -E to their packaged applications' shebang lines.

-Toshio


Re: [Python-Dev] Use python -s as default shebang for system python executables/daemons

2015-03-19 Thread Toshio Kuratomi
I think I've found the Debian discussion (October 2012):

http://comments.gmane.org/gmane.linux.debian.devel.python/8188

Lack of PYTHONWARNINGS was brought up late in the discussion thread,
but I think the point that when a particular user sets an environment
variable they want it to apply to all scripts they run got somewhat lost
in the followups (it wasn't directly addressed or mentioned again).

-Toshio

On Thu, Mar 19, 2015 at 12:27 PM, Barry Warsaw  wrote:
> On Mar 19, 2015, at 11:15 AM, Toshio Kuratomi wrote:
>
>>I could see that as a difference.  However, the environment variables
>>give users the ability to change things globally whereas overriding
>>the shebang line is case-by-case, so it's not a complete replacement
>>for the functionality.
>
> You make some good points.  I guess it's a trade-off between flexibility and a
> known secure execution environment.  I'm not sure there's a right answer;
> different admins might have valid different opinions.
>
> Cheers,
> -Barry
>


Re: [Python-Dev] Use python -s as default shebang for system python executables/daemons

2015-03-23 Thread Toshio Kuratomi
On Mar 19, 2015 3:27 PM, "Victor Stinner"  wrote:
>
> 2015-03-19 21:47 GMT+01:00 Toshio Kuratomi :
> > I think I've found the Debian discussion (October 2012):
> >
> > http://comments.gmane.org/gmane.linux.debian.devel.python/8188
> >
> > Lack of PYTHONWARNINGS was brought up late in the discussion thread
>
> Maybe we need to modify -E or add a new option to only ignore PYTHONPATH.
>
I think PYTHONPATH is still useful on its own.

Building off Nick's idea of a system python vs a python for users to use, a
more useful modification would be the ability to specify SPYTHONPATH (and
other env vars) to go along with /usr/bin/spython.  That way the user
maintains the capability to override specific libraries globally, just like
with LD_LIBRARY_PATH, PATH, and similar, but you won't accidentally
configure your own python to use one set of paths for your five python apps
and then have that leak over and affect system tools.
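To make that concrete (purely hypothetical names -- neither /usr/bin/spython
nor SPYTHONPATH exists today):

  # System tools would be pinned to the system-only interpreter:
  #!/usr/bin/spython

  # ...which would honour only its own override variable, so a user's
  # PYTHONPATH for their own apps could never leak into system tools:
  $ export PYTHONPATH=/home/me/my-app-libs      # seen by /usr/bin/python
  $ export SPYTHONPATH=/home/admin/site-fixes   # seen by /usr/bin/spython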

-Toshio


Re: [Python-Dev] Use python -s as default shebang for system python executables/daemons

2015-03-23 Thread Toshio Kuratomi
On Mon, Mar 23, 2015 at 03:30:23PM +0100, Antoine Pitrou wrote:
> On Mon, 23 Mar 2015 07:22:56 -0700
> Toshio Kuratomi  wrote:
> > 
> > Building off Nick's idea of a system python vs a python for users to use, a
> > more useful modification would be the ability to specify SPYTHONPATH (and
> > other env vars) to go along with /usr/bin/spython.  That way the user
> > maintains the capability to override specific libraries globally, just like
> > with LD_LIBRARY_PATH, PATH, and similar, but you won't accidentally
> > configure your own python to use one set of paths for your five python apps
> > and then have that leak over and affect system tools.
> 
> I really think Donald has a good point when he suggests a specific
> virtualenv for system programs using Python.
> 
The isolation is what we're seeking, but I think the amount of work required
and the added complexity for the distributions will make it hard to get
distributions to sign up for that.

If someone had the time to write a front end to install packages into
a single "system-wide isolation unit" whose backend was a virtualenv we
might be able to get distributions on-board with using that.

The front end would need to install software so that you can still invoke
/usr/bin/system-application and "system-application" would take care of
activating the virtualenv.  It would need to be about as simple to build
as the present python2 setup.py build/install with the flexibility in
options that the distros need to install into FHS approved paths.  Some
things like man pages, locale files, config files, and possibly other data
files might need to be installed outside of the virtualenv directory.  Many
setup.py's already punt on some of those, though, letting the user choose
to install them manually.  So this might be similar.  It would need to be able
to handle 32bit and 64bit versions of the same library installed on the same
system.  It would need to be able to handle different versions of the same
library installed on the same system (as few of those as possible but it's
an unfortunate cornercase that can't be entirely ignored even for just
system packages).  It would need a mode where it doesn't use the network at
all, operating only with the packages and sources that are present on the
box.
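As a very rough sketch of just the "activation" half (hypothetical paths and
package name; all the hard work is in the build/install tooling, not here):

  #!/usr/lib/system-venv/bin/python
  # Hypothetical /usr/bin/system-application launcher.  Running under the
  # virtualenv's own interpreter is the only "activation" a script needs;
  # sys.path then points into /usr/lib/system-venv rather than the global
  # site-packages.
  import sys
  from system_application.cli import main   # hypothetical package
  sys.exit(main())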

And remember these two things: (1) We'd be asking the distros to do
a tremendous amount of work changing their packages to install into
a virtualenv instead of the python setup.py way that is well documented and
everyone's been using for ages.  It'll be a tough sell even with good
tooling.  (2) This theoretical front-end would have to appeal to the distro
maintainers, so there would have to be a lot of discussion to understand what
capabilities the distro maintainers would need from it.

-Toshio




Re: [Python-Dev] Use python -s as default shebang for system python executables/daemons

2015-03-23 Thread Toshio Kuratomi
On Mon, Mar 23, 2015 at 04:14:52PM +0100, Antoine Pitrou wrote:
> On Mon, 23 Mar 2015 08:06:13 -0700
> Toshio Kuratomi  wrote:
> > > 
> > > I really think Donald has a good point when he suggests a specific
> > > virtualenv for system programs using Python.
> > > 
> > The isolation is what we're seeking, but I think the amount of work required
> > and the added complexity for the distributions will make it hard to get
> > distributions to sign up for that.
> > 
> > If someone had the time to write a front end to install packages into
> > a single "system-wide isolation unit" whose backend was a virtualenv we
> > might be able to get distributions on-board with using that.
> 
> I don't think we're asking distributions anything. We're suggesting a
> possible path, but it's not python-dev's job to dictate distributions
> how they should package Python.
> 
> The virtualenv solution has the virtue that any improvement we might
> put in it to help system packagers would automatically benefit everyone.
> A specific "system Python" would not.
> 
> > The front end would need to install software so that you can still invoke
> > /usr/bin/system-application and "system-application" would take care of
> > activating the virtualenv.  It would need to be about as simple to build
> > as the present python2 setup.py build/install with the flexibility in
> > options that the distros need to install into FHS approved paths.  Some
> > things like man pages, locale files, config files, and possibly other data
> > files might need to be installed outside of the virtualenv directory.
> 
> Well, I don't understand what difference a virtualenv would make.
> Using a virtualenv amounts to invoking a different interpreter path.
> The rest of the filesystem (man pages locations, etc.) is still
> accessible in the same way. But I may miss something :-)
> 
I think people who are saying "the system should just use
virtualenv" aren't realizing all of the reasons that's not as simple as it
sounds for distributions to implement.  Thus the work required to implement
alternate solutions like a system python may seem smaller to the distros
unless those issues are at least partially addressed at the virtualenv and
python-packaging level.

-Toshio




Re: [Python-Dev] [Distutils] Python 3.x Adoption for PyPI and PyPI Download Numbers

2015-04-21 Thread Toshio Kuratomi
On Tue, Apr 21, 2015 at 01:54:55PM -0400, Donald Stufft wrote:
> 
> Anyways, I'll have access to the data set for another day or two before I
> shut down the (expensive) server that I have to use to crunch the numbers so 
> if
> there's anything anyone else wants to see before I shut it down, speak up 
> soon.
> 
Where are curl and wget getting categorized in the User Agent graphs?

Just morbidly curious as to whether they're in with Browser and therefore
mostly unused or Unknown and therefore only slightly less unused ;-)

-Toshio




Re: [Python-Dev] Proposal: go back to enabling DeprecationWarning by default

2017-11-07 Thread Toshio Kuratomi
On Nov 7, 2017 5:47 AM, "Paul Moore"  wrote:

On 7 November 2017 at 13:35, Philipp A.  wrote:
> Sorry, I still don’t understand how any of this is a problem.
>
> If you’re an application developer, google “python disable
> DeprecationWarning” and paste the code you found, so your users don’t see
> the warnings.
> If you’re a library developer, and a library you depend on raises
> DeprecationWarnings without it being your fault, file an issue/bug there.
>
> For super-increased convenience in case 2., we could also add a
convenience
> API that blocks deprecation warnings raised from certain module or its
> submodules.
> Best, Philipp

If you're a user and your application developer didn't do (1) or a
library developer developing one of the libraries your application
developer chose to use didn't do (2), you're hosed. If you're a user
who works in an environment where moving to a new version of the
application is administratively complex, you're hosed.

As I say, the proposal prioritises developer convenience over end user
experience.


I don't agree with this characterisation.  Even if we assume a user isn't
going to fix a DeprecationWarning they still benefit: (1) if they're a
sysadmin it will warn them that they need to be careful when upgrading a
dependency. (2) if the developer never hears about the DeprecationWarning
then it is ultimately the user who suffers when the tool they depend on
breaks without warning, so seeing and reporting the DeprecationWarning helps
the end user. (3) if DeprecationWarnings are allowed to linger through
multiple releases, it may tell the user about the quality of the software
they're using.

More information is helpful to end users.  Developers are actually the ones
that it inconveniences, as we'll be the ones grumbling when an end user who
hasn't evaluated the deprecation cycles of upstream projects as we have
demands immediate changes for deprecations that are still years away from
causing problems.  But unlike end users, we do have the ability to solve
that by turning those warnings off in our code if we've done our due
diligence (or even if we haven't done our due diligence).

-Toshio


Re: [Python-Dev] PEP 540: Add a new UTF-8 mode (v3)

2017-12-10 Thread Toshio Kuratomi
On Dec 9, 2017 8:53 PM, "INADA Naoki"  wrote:

> Earlier versions of PEP 538 thus included "en_US.UTF-8" on the
> candidate target locale list, but that turned out to cause assorted
> problems due to the "C -> en_US" part of the coercion.

Hm, but PEP 538 says:

> this PEP instead proposes to extend the "surrogateescape" default for
stdin and stderr error handling to also apply to the three potential
coercion target locales.

https://www.python.org/dev/peps/pep-0538/#defaulting-to-
surrogateescape-error-handling-on-the-standard-io-streams

I don't think en_US.UTF-8 should use surrogateescape error handler.


Could you explain why not?  UTF-8 seems like the common thread for using
surrogateescape, so I'm not sure what would make en_US.UTF-8 different from
C.UTF-8.

-Toshio


Re: [Python-Dev] Deprecate PEP 370 Per user site-packages directory?

2018-01-13 Thread Toshio Kuratomi
On Jan 13, 2018 9:08 AM, "Christian Heimes"  wrote:

Hi,

PEP 370 [1] was my first PEP that got accepted. I created it exactly one
decade and two days ago for Python 2.6 and 3.0.


I didn't know I had you to thank for this!  Thanks Christian!  This is one
of the best features of the python software packaging ecosystem!  I almost
exclusively install into user site packages these days.  It lets me pull in
the latest version of software when I want it for everyday use and revert
to what my system shipped with if the updates break something.  It's let me
install libraries ported to python3 before my distro got around to packaging
the updates.  It's let me perform an install, when I want to test my packages
as my users might use them, without touching the system dirs.  It's been a
godsend!


Fast forward 10 years...

Nowadays Python has venv in the standard library. The user-specific
site-packages directory is no longer that useful. I would even say it's
causing more trouble than it's worth. For example it's common for system
script to use "#!/usr/bin/python3" shebang without -s or -I option.


With great power comes great responsibility...

Sure, installing something into user site packages can break system
scripts.  But it can also fix them.  I can recall breaking system scripts
twice by installing something into user site packages (both times, the
tracebacks rapidly led me to the reason that the scripts were failing).
As a counterpoint to that, I can recall *fixing* problems in system scripts
by installing newer libraries into site packages twice in the last two
months.  (I've also fixed system software by installing into user and then
modifying that version but I do that less frequently... Perhaps only a
couple times a year...)

Removing the user site packages also doesn't prevent people from making
local changes that break system scripts (removing the pre-configuration of
user site packages does not stop honoring usage of PYTHONPATH); it only
makes people work a little harder to place their overridden packages into a
location that python will find and leads to nonstandard locations for these
overrides.  This will make it harder for people to troubleshoot the problems
other people may be having.  We'll lose "do you have any libraries
in .local in your tracebacks?" as an easy first troubleshooting step.
Without the user site packages standard we'll be back to trying to
determine which directories are official for the user's install and then
finding any local directories that their site may have defined for
overrides.

I propose to deprecate the feature and remove it in Python 4.0.


Although I don't like the idea of system scripts adding -s and -I, because
it prevents me from fixing them for my own use by installing just a newer
or modified library into user site packages (similar to how C programs can
use overridden libraries via LD_LIBRARY_PATH), it seems that if you want to
prevent users from choosing to use their own libraries with system scripts,
the right thing to do is to get changes into setuptools and distutils that
allow adding those flags to the scripts they install.  Those flags will do
a much more thorough job of preventing this usage than removing user site
packages can.
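For reference, the kind of local fix I'm talking about looks like this
(module and paths chosen just for illustration):

  $ pip install --user requests      # newer copy goes into ~/.local
  $ python3 -c 'import requests; print(requests.__file__)'
  /home/me/.local/lib/python3.6/site-packages/requests/__init__.py

  # A system script whose shebang carries -s (or -I) never sees that fix:
  $ python3 -s -c 'import requests; print(requests.__file__)'
  /usr/lib/python3.6/site-packages/requests/__init__.py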

-Toshio


Re: [Python-Dev] A fast startup patch (was: Python startup time)

2018-05-05 Thread Toshio Kuratomi
On Fri, May 4, 2018, 7:00 PM Nathaniel Smith  wrote:

> What are the obstacles to including "preloaded" objects in regular .pyc
> files, so that everyone can take advantage of this without rebuilding the
> interpreter?
>

Would this make .pyc files arch specific?

-Toshio


Re: [Python-Dev] A fast startup patch (was: Python startup time)

2018-05-05 Thread Toshio Kuratomi
On Sat, May 5, 2018, 10:40 AM Eric Fahlgren  wrote:

> On Sat, May 5, 2018 at 10:30 AM, Toshio Kuratomi 
> wrote:
>
>> On Fri, May 4, 2018, 7:00 PM Nathaniel Smith  wrote:
>>
>>> What are the obstacles to including "preloaded" objects in regular .pyc
>>> files, so that everyone can take advantage of this without rebuilding the
>>> interpreter?
>>>
>>
>> Would this make .pyc files arch specific?
>>
>
> Or have parallel "pyh" (Python "heap") files, that are architecture
> specific... (But that would cost more stat calls.)
>

I ask because arch-specific byte code files are a big change in consumers'
expectations.  It's not necessarily a bad change, but it should be
communicated to downstreams so they can decide how to adjust to it.

Linux distros which ship byte code files will need to build them for each
arch, for instance.  People who ship just the byte code as an obfuscation
of the source code will need to decide whether to ship packages for each
arch they care about or change how they distribute.

-Toshio



Re: [Python-Dev] 2.7 is here until 2020, please don't call it a waste.

2015-05-30 Thread Toshio Kuratomi
On May 30, 2015 1:56 AM, "Nick Coghlan"  wrote:
>
> Being ready, willing and able to handle the kind of situation created
> by the Python 2->3 community transition is a large part of what it
> means to offer commercial support for community driven open source
> projects, as it buys customers' time for either migration technologies
> to mature to a point where the cost of migration drops dramatically,
> for the newer version of a platform to move far enough ahead of the
> legacy version for there to be a clear and compelling business case
> for forward porting existing software, or (as is the case we're aiming
> to engineer for Python), both.
>
Earlier, you said that it had been a surprise that people were against this
change.  I'd just point out that the reason is bound up in what you say
here.  Porting performance features from python 3 to python 2 has the
disadvantage of cutting into a compelling business case for users to move
forward to python 3. [1]  So doing this has a cost to python 3 adoption.
But, the question is whether there is a benefit that outweighs that cost.
I think seeing more steady, reliable contributors to python core is a very
large payment.  Sure, for now that payment is aimed at extending the legs
on the legacy version of python but at some point in the future python 2's
legs will be well and truly exhausted.  When that happens both the
developers who have gained the skill of contributing to cpython and the
companies who have invested money in training people to be cpython
contributors will have to decide whether to give up on all of that or
continue to utilize those skills and investments by bettering python 3.
I'd hope that we can prove ourselves a welcoming enough community that
they'd choose to stay.

-Toshio

[1] In fact, performance differences are a rather safe way to build
compelling business cases for forward porting.  Safe because it is a
difference (unlike API and feature differences) that will not negatively
affect your ability to incrementally move your code to python 3.


[Python-Dev] Fwd: Request for pronouncement on PEP 493 (HTTPS verification backport guidance)

2015-11-24 Thread Toshio Kuratomi
On Nov 24, 2015 6:28 AM, "Laura Creighton"  wrote:
>
> In a message of Tue, 24 Nov 2015 14:05:53 +, Paul Moore writes:
> >Simply adding "people who have no control over their broken
> >infrastructure" with a note that this PEP helps them, would be
> >sufficient here (and actually helps the case for the PEP, so why not?
> >;-))
>
> But does it help them?  Or does it increase the power of those who
> hand out certificates and who are intensely security conscious over
> those who would like to get some work done this afternoon?
>
My reading is that it will help more people, but locked-down environments can
still trump their users if they wish.

If a distribution wishes to give users of older python versions the option
of verifying certificates then they will need to backport changes
authorized by previous PEPs.  By themselves, those changes would make it so
environment owners and application authors are in complete control.  If an
application is coded to do cert verification and the remote end has
certificates that aren't recognized as valid on the client end then the
user would have to change the client application code to be able to use it
in their environment (or figure out how to get the CA for the remote end
into their local certificate store... in extreme cases, this might be
impossible - the CA cert has been lost or belongs to another company).

This PEP tells distributions how they might give the client end a bit more
power when they backport.  The settings file allows the client to toggle
verification site wide.  The environment variable allows clients to toggle
it per application invocation.  Both of these situations are better for a
client than having the backport and nothing else.  Both of these can be
shut down by an environment owner with sufficient authority to limit what's
running on the client (I'm not sure of the scope of the environment owner's
powers here, so I thought I should acknowledge this factor).

So basically: backporting other PEPs (to increase security) will subtract
power from the clients.  This PEP specifies several facilities the
backporters can implement to give some of that power back to the clients.
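If I'm reading the PEP's recommendations right, the two knobs would look
roughly like this (the file location is up to the redistributor; I believe
the PEP suggests /etc/python/cert-verification.cfg):

  # Per application invocation, via the environment variable:
  $ PYTHONHTTPSVERIFY=0 python2 legacy-client.py

  # Site wide, via the configuration file:
  [https]
  verify = disable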

-Toshio



Re: [Python-Dev] Request for pronouncement on PEP 493 (HTTPS verification backport guidance)

2015-11-24 Thread Toshio Kuratomi
On Mon, Nov 23, 2015 at 5:59 PM, Barry Warsaw  wrote:

> I'm concerned about accepting PEP 493 making a strong recommendation to
> downstreams.  Yes, in an ideal world we all want security by default, but I
> think the backward compatibility concerns of the PEP are understated,
> especially as they relate to a maintenance release of a stable long term
> support version of the OS.  I don't want PEP 493 to be a cudgel that people
> beat us up with instead of having an honest discussion of the difficult
> trade-offs involved.
>
It sounds like the implementation sections of the PEP are acceptable
but that the PEP's general tone seems to assume that distributors are
champing at the bit to backport and that the recommendations here make
it so that backporting is a no-brainer -- which does not seem to
reflect the real world?

I think the tone could be changed to address that as it doesn't seem
like forcing distros to backport is a real goal of the PEP.  The main
purposes of the PEP seem to be:

* Enumerate several ways that distributors can backport these 2.7.9
features to older releases
* Allow programmers to detect the presence of the features from their code
* Give end-users the ability to choose between backwards compatibility
and enhanced security

Here's some ideas for changing the tone:

  Abstract
  

  PEP 476 updated Python's default handling of HTTPS certificates to be
  appropriate for communication over the public internet. The Python 2.7 long
  term maintenance series was judged to be in scope for this change, with the
  new behaviour introduced in the Python 2.7.9 maintenance release.

+ Change to "PEP 476 updated Python's default handling of HTTPS
certificates to validate that the certs belonged to the server".  This
way we're saying what the change is rather than making a value
judgement of whether people who don't choose to backport are
"appropriate" or not.  Appropriate-ness is probably best left as an
argument in the text of PEP 476.

  This PEP provides recommendations to downstream redistributors wishing to
  provide a smoother migration experience when helping their users to manage
  this change in Python's default behaviour.

+ Change to "downstream redistributors wishing to backport the
enhancements in a way that allows users to choose between backwards
compatible behaviour or more secure certificate handling."  As barry
noted, this PEP doesn't change the amount of work needed to migrate.
It does, however, give users some choice in when they are going to
perform that work.  Additionally, this isn't simply about distributors
who want to make the transition smoother... (there's no downstreams
that want to make it "more painful" are there? ;-)  It's really about
making backporting of the enhancements less painful for users.

  Rationale
  =

  PEP 476 changed Python's default behaviour to better match the needs and
  expectations of developers operating over the public internet, a category
  which appears to include most new Python developers. It is the position of
  the authors of this PEP that this was a correct decision.

  However, it is also the case that this change *does* cause problems for
  infrastructure administrators operating private intranets that rely on
  self-signed certificates, or otherwise encounter problems with the new default
  certificate verification settings.

+ Per Barry's message, it would be good to either devote a paragraph to
the backwards compatibility implications here or link to
https://www.python.org/dev/peps/pep-0476/#backwards-compatibility

  The long term answer for such environments is to update their internal
  certificate management to at least match the standards set by the public
  internet, but in the meantime, it is desirable to offer these administrators
  a way to continue receiving maintenance updates to the Python 2.7 series,
  without having to gate that on upgrades to their certificate management
  infrastructure.

+ The wording here seems to reflect a different scope than merely
backporting by distros.  Perhaps we should change it to: "[...]set by
the public internet.  Distributions may wish to help these sites
transition by backporting the PEP 476 changes to earlier versions of
python in a way that does not require the administrators to upgrade
their certificate management infrastructure immediately.  This would
allow sites to choose to use the distribution supplied python in a
backwards compatible fashion until their certificate management
infrastructure was updated and then toggle their site to utilize the
more secure features provided by PEP 476."

[...]

  These designs are being proposed as a recommendation for
redistributors, rather
  than as new upstream features, as they are needed purely to support legacy
  environments migrating from older versions of Python 2.7. Neither approach
  is being proposed as an upstream Python 2.7 feature, nor as a feature in any
  version of Python 3 (whether published directly by the 

Re: [Python-Dev] Request for pronouncement on PEP 493 (HTTPS verification backport guidance)

2015-11-24 Thread Toshio Kuratomi
On Tue, Nov 24, 2015 at 10:08 AM, Paul Moore  wrote:

> I'm not actually sure that it's the place of this PEP to even comment
> on what the long term answer for such environments should be (or
> indeed, any answer, long term or not). I've argued the position that
> in some organisations it feels like security don't actually understand
> the issues of carefully balancing secure operation against flexible
> development practices,

I agree with this.

> but conversely it's certainly true that in many
> organisations, they *have* weighed the various arguments and made an
> informed decision on how to set up their internal network. It's
> entirely possible that self-signed certificates are entirely the right
> decision for their circumstances. Why would a Python PEP be qualified
> to comment on that decision?

I don't quite agree with this, but it's probably moot in the face of
the previous point and certain cornercases.  Self-signed certs work just
fine with the backports to python-2.7.9+ but you have to add the CA to
the clients.  An organization that has weighed the arguments and made
an informed decision to use self-signed certs should either do this
(to prevent MITM) or they should switch to using http instead of https
(because MITM isn't a feasible attack here).  The cornercases come
into play because you don't always control all of the devices and
services on your network.  The site could evaluate and decide that
MITM isn't a threat to their switch's configuration interface but that
interface might be served over https using a certificate signed by
their network vendor who doesn't give out their CA certificate (simply
stated: your security team knows what they're doing but your vendor's
does not).
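For completeness, the "add the CA to the clients" route doesn't require
anything exotic once 2.7.9+ (or a backport) is in place; something along
these lines, with made-up file and host names:

  import ssl
  import urllib2

  # Trust the organisation's own CA (or the self-signed cert itself)
  # rather than turning verification off altogether.
  ctx = ssl.create_default_context(cafile="/etc/pki/local/internal-ca.pem")
  urllib2.urlopen("https://switch.example.internal/", context=ctx)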

> In my opinion, we should take all of the value judgements out of this
> paragraph, and just state the facts. How about:
>
> """
> In order to provide additional flexibility to allow infrastructure
> administrators to provide the appropriate solution for their
> environment, this PEP offers a way for administrators to upgrade to
> later versions of the Python 2.7 series without being forced to update
> their existing security certificate management infrastructure as a
> prerequisite.
> """

Two notes:

* python-2.7.9+ doesn't give you flexibility in this regard so
organizations do have to update their certificate management
infrastructure.  The cornercase described above becomes something that
has to be addressed at the code level.  Environments that are simply
misconfigured have to be fixed.  So in that regard, a value judgement
does seem appropriate here.  The judgement is: "Listen guys, this PEP
advises redistributors on how they might provide a migration path for
you but it *does not bandaid the problem indefinitely*.  So long term,
you have to change your practices or you'll be out in the cold when
your redistributor upgrades to python-2.7.9+"

* Your proposed text actually removes the fix that I was adding --
this version of the paragraph implies that if your environment is
compatible with the redistributors' python-2.7.8 (or less) then it
will also be compatible with the redistributors' python-2.7.9+.  That
is not true.  Whether or not we take out any value judgement as to the
user's present environment this paragraph needs to be fixed to make it
clear that this only affects redistributors' packages which have
backported PEP 476 to python-2.7.8 or older.  Once the redistributor
updates to a newer python, sites which relied on this crutch will
break.

-Toshio


Re: [Python-Dev] Request for pronouncement on PEP 493 (HTTPS verification backport guidance)

2015-11-24 Thread Toshio Kuratomi
On Tue, Nov 24, 2015 at 10:56 AM, Paul Moore  wrote:
> On 24 November 2015 at 18:37, Toshio Kuratomi  wrote:

>> The cornercases come
>> into play because you don't always control all of the devices and
>> services on your network.  The site could evaluate and decide that
>> MITM isn't a threat to their switch's configuration interface but that
>> interface might be served over https using a certificate signed by
>> their network vendor who doesn't give out their ca certificate (simply
>> stated: your security team knows what they're doing but your vendor's
>> does not).
>
> This sounds like a similar situation to what I described above. I'm
> not sure I'd see these as corner cases, though - they are pretty much
> day to day business in my experience :-(
>
It sounds like you're coming from a Windows background and I'm coming
from a Linux background which might be a small disconnect here -- we
do seem to be in agreement that what's "right to do" isn't always easy
or possible for the client to accomplish so I think we should probably
leave it at that.

>>> In my opinion, we should take all of the value judgements out of this
>>> paragraph, and just state the facts. How about:
>>>
>>> """
>>> In order to provide additional flexibility to allow infrastructure
>>> administrators to provide the appropriate solution for their
>>> environment, this PEP offers a way for administrators to upgrade to
>>> later versions of the Python 2.7 series without being forced to update
>>> their existing security certificate management infrastructure as a
>>> prerequisite.
>>> """
>>
>> Two notes:
>>
>> * python-2.7.9+ doesn't give you flexibility in this regard so
>> organizations do have to update their certificate management
>> infrastructure.  The cornercase described above becomes something that
>> has to be addressed at the code level.  Environments that are simply
>> misconfigured have to be fixed.  So in that regard, a value judgement
>> does seem appropriate here.  the judgement is "Listen guys, this PEP
>> advises redistributors on how they might provide a migration path for
>> you but it *does not bandaid the problem indefinitely*.  So long term,
>> you have to change your practices or you'll be out in the cold when
>> your redistributor upgrades to python-2.7.9+"
>
> Hmm, maybe I misread the PEP (I only skimmed it - as I say, Linux is
> of limited interest to me). I thought that the environment variable
> gave developers a "get out" clause. Maybe it's not what we want them
> to do (for some value of "we") but isn't that the point of the PEP?
>
> Admittedly if distributions don't *implement* that part of the PEP
> (and I understand Red Hat haven't) then people are still stuck. But
> "this PEP offers a way" is not incompatible with "your vendor didn't
> implement the PEP so you're still stuck, sorry"...
>

Yeah, I think you are correct in your understanding of what actual
changes to the python distribution are being proposed for
redistributors in the PEP.  Reading through the PEP again, I'm not
sure if I'm correct in thinking that this only applies to
backporting... it may be that the environment section of the PEP
applies to any python-2 while the config file section only applies to
backporting.  Nick, could you clarify?

The PEP is clear that it doesn't apply to python-3 or cross-distro.
So that means that sites still can't rely on this long-term (but their
long term would extend to the lifetime of their vendor supporting
python2 rather than when their vendor updated to 2.7.9+) and also that
developers can't depend on this if they're developing portable code.

>> * Your proposed text actually removes the fix that I was adding --
>> this version of the paragraph implies that if your environment is
>> compatible with the redistributors' python-2.7.8 (or less) then it
>> will also be compatible with the redistributors' python-2.7.9+.  That
>> is not true.  Whether or not we take out any value judgement as to the
>> user's present environment this paragraph needs to be fixed to make it
>> clear that this only affects redistributors' packages which have
>> backported PEP 476 to python-2.7.8 or older.  Once the redistributor
>> updates to a newer python, sites which relied on this crutch will
>> break.
>
> Sorry for that. Certainly getting the facts right is crucial, and it
> looks like my suggestion didn't do that. But hopefully someone can fix
> it up (if people think it's a good way to go).

Could be that I'm wrong -- will wait for Nick to clarify before I
think about what could be done to make this wording better.

-Toshio


Re: [Python-Dev] Request for pronouncement on PEP 493 (HTTPS verification backport guidance)

2015-11-26 Thread Toshio Kuratomi
On Nov 26, 2015 4:53 PM, "Nick Coghlan"  wrote:
>
> On 27 November 2015 at 03:15, Barry Warsaw  wrote:

>
> > Likewise in Ubuntu, we try to keep deviations from Debian at a minimum, and
> > document them when we must deviate.  Ubuntu is a community driven distro so
> > while Canonical itself has customers, it's much more likely that feedback
> > about the Python stack comes from ordinary users.  Again, my personal goal is
> > to make Python on Ubuntu a pleasant and comfortable environment, as close to
> > installing from source as possible, consistent with the principles and
> > policies of the project.
>
> I'd strongly agree with that description for Fedora and
> softwarecollections.org, but for the RHEL/CentOS system Python I think
> the situation is slightly different: there, the goal is to meet the
> long term support commitments involved in being a base RHEL package.
> As the nominal base version of the package (2.7.5 in the case of RHEL
> 7) doesn't change, there is naturally going to be increasing
> divergence from the nominal version.

I think the goal in RHEL/CentOS is similar, actually.  The maintenance
burden for non-upstream changes has been acknowledged as a problem to be
avoided by RHEL maintainers before.  The caveat for those distributions is
that they accumulate more *backports*.

However, backports are easier to maintain than non-upstream changes.  The
testing done by the upstream community helps to find and fix bugs in the
code; the downstream maintainer just needs to stay aware of whether fixes
are going into the code they've backported.

> I tried to go down the "upstream first" path with a properly supported
> "off switch" in PEP 476, and didn't succeed (hence the monkeypatch
> compromise). It sounds like several folks would like to see us revisit
> that decision, though.
>
That's the rub.  If there's now enough support to push this upstream I
think everyone downstream will be happier.  If it turns out there's still
enough resistance to keep it from upstream then I suppose you cross that
bridge if it becomes necessary.

-Toshio


Re: [Python-Dev] Python 3 to be default in Fedora 22

2013-10-25 Thread Toshio Kuratomi
On Fri, Oct 25, 2013 at 01:32:36PM +1000, Nick Coghlan wrote:
> 
> On 25 Oct 2013 09:02, "Terry Reedy"  wrote:
> 
> > http://lwn.net/Articles/571528/
> > https://fedoraproject.org/wiki/Changes/Python_3_as_Default
> 
> Note that unlike Arch, the Fedora devs currently plan to leave "/usr/bin/
> python" referring to Python 2 (see the "User Experience" part of the 
> proposal).
> 


The tangible changes for this are just that we're hoping to have only
python3, not python2, on our default LiveCD and cloud images.  This has been
a bit hard since many of our core packaging tools (and the large number of
release engineering, package-maintainer, distro installer, etc. scripts built
on top of them) were written in python2.  The F22 release is hoping to have
a set of C libraries for those tools with both python3 and python2 bindings.
That will hopefully allow us to port the user-visible tools (installer and
things present on the selected images) to python3 for F22 while leaving the
release-engineering and packager-oriented scripts until a later Fedora
release.

-Toshio




Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-07 Thread Toshio Kuratomi
On Tue, Jan 07, 2014 at 09:26:20PM +0900, Stephen J. Turnbull wrote:
> Is this really a good idea?  PEP 460 proposes rather different
> semantics for bytes.format and the bytes % operator from the str
> versions.  I think this is going to be both confusing and a continuous
> target for "further improvement" until the two implementations
> converge.
>

Reading about the proposed differences reminded me of how in older python2
versions unicode() took keyword arguments but str.decode() only took
positional arguments.  I squashed a lot of trivial bugs in people's code
where that difference wasn't anticipated.  In later python2 versions both of
those came to understand how to take their arguments as keywords, which saved
me from further unnecessary pain.
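From memory, the asymmetry looked roughly like this on the affected 2.x
releases (traceback trimmed):

  >>> unicode('caf\xc3\xa9', encoding='utf-8', errors='replace')
  u'caf\xe9'
  >>> 'caf\xc3\xa9'.decode(encoding='utf-8', errors='replace')
  TypeError: decode() takes no keyword arguments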

-Toshio




Re: [Python-Dev] Inclusion of lz4 bindings in stdlib?

2018-11-29 Thread Toshio Kuratomi
On Thu, Nov 29, 2018, 6:56 AM Benjamin Peterson  wrote:
>
> On Wed, Nov 28, 2018, at 15:27, Steven D'Aprano wrote:
> > On Wed, Nov 28, 2018 at 10:43:04AM -0800, Gregory P. Smith wrote:
> >
> > > PyPI makes getting more algorithms easy.
> >
> > Can we please stop over-generalising like this? PyPI makes getting
> > more algorithms easy for *SOME* people. (Sorry for shouting, but you
> > just pressed one of my buttons.)
> >
> > PyPI might as well not exist for those who cannot, for technical or
> > policy reasons, install addition software beyond the std lib on the
> > computers they use. (I hesitate to say "their computers".)
> >
> > In many school or corporate networks, installing unapproved software can
> > get you expelled or fired. And getting approval may be effectively
> > impossible, or take months of considerable effort navigating some
> > complex bureaucratic process.
> >
> > This is not an argument either for or against adding LZ4, I have no
> > opinion either way. But it is a reminder that "just get it from PyPI"
> > represents an extremely privileged position that not all Python users
> > are capable of taking, and we shouldn't be so blase about abandoning
> > those who can't to future std lib improvements.
>
> While I'm sympathetic to users in such situations, I'm not sure how much
> we can really help them. These are the sorts of users who are likely to
> still be stuck using Python 2.6. Any stdlib improvements we discuss and
> implement today are easily a decade away from benefiting users in
> restrictive environments. On that kind of timescale, it's very hard to know
> what to do, especially since, as Paul says, we don't hear much feedback
> from such users.
>

As a developer of software that has to run in such environments, having a
library be in the stdlib is helpful as it is easier to convince the rest of
the team to bundle a backport of something that's in a future stdlib than a
random package from PyPI.  Stdlib inclusion gives the library a known
future and a (perhaps illusory, perhaps real) blessing from the core devs
that helps to sell the library as the preferred solution.

-Toshio



Re: [Python-Dev] Compile-time resolution of packages [Was: Another update for PEP 394...]

2019-02-27 Thread Toshio Kuratomi
On Tue, Feb 26, 2019 at 2:07 PM Neil Schemenauer 
wrote:

> On 2019-02-26, Gregory P. Smith wrote:
> > On Tue, Feb 26, 2019 at 9:55 AM Barry Warsaw  wrote:
> > For an OS distro provided interpreter, being able to restrict its use to
> > only OS distro provided software would be ideal (so ideal that people who
> > haven't learned the hard distro maintenance lessons may hate me for it).
>
> This idea has some definite problems.  I think enforcing it via convention
is about as much as would be good to do.  Anything more and you prevent
people who really need to use the vendor provided interpreter from being
able to do so.

Why might someone need to use the distro provided interpreter?

* Vendor provides some python modules in their system packages which are
not installable from pip (possibly even a proprietary extension module, so
not even buildable from source or copyable from the system location) which
the end user needs to use to do something to their system.
* End user writes a python module which is a plugin to a system tool which
has to be installed into the system python from which that system tool
runs.  The user then wants to write a script which uses the system tool
with the plugin in order to do something to their system outside of the
system tool (perhaps the system tool is GUI-driven and the user wants to
automate a part of it via the python module).  They need their script to
use the system python so that they are using the same code as the system
tool itself would use.

There are probably other scenarios where the costs of locking the user out
of the system python outweigh the benefits, but these are the ones that I've
run across lately.

-Toshio


[Python-Dev] Re: What to do about invalid escape sequences

2019-08-07 Thread Toshio Kuratomi
On Mon, Aug 5, 2019 at 6:47 PM  wrote:
>
> I wish people with more product management experience would chime in; 
> otherwise, 3.8 is going to ship with an intentional hard-to-ignore annoyance 
> on the premise that we don't like the way people have been programming and 
> that they need to change their code even if it was working just fine.
>

I was resisting weighing in since I don't know the discussion around
deprecating this language feature in the first place (other than
what's given in this thread).  However, in the product I work on we
made a very similar change in our last release so I'll throw it out
there for people to take what they will from it.

We have a long standing feature which allows people to define groups
of hosts and give them a name.  In the past that name could include
dashes, dots, and other characters which are not legal as Python
identifiers.  When users use those group names in our "DSL" (not truly
a DSL but close enough), they can do it using either dictionary-lookup
syntax (groupvars['groupname']) or using dotted attribute notation
groupvars.groupname.  We also have a longstanding problem where users
will try to do something like groupvars.group-name using the
dotted attribute notation with group names that aren't proper python
identifiers.  This causes problems as the name then gets split on the
characters that aren't legal in identifiers and results in something
unexpected (undefined variable, an actual subtraction operation, etc).
In our last release we decided to deprecate and eventually make it
illegal to use non-python-identifiers for the group names.
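The failure mode is easy to reproduce in plain Python, which is roughly
what happens under the hood (the class here is invented just for
illustration):

  >>> class GroupVars(object):
  ...     def __getattr__(self, name):
  ...         return 'vars for %s' % name
  ...     __getitem__ = __getattr__
  ...
  >>> gv = GroupVars()
  >>> gv['web-servers']        # subscript form: the dash is fine
  'vars for web-servers'
  >>> gv.web-servers           # attribute form: parsed as (gv.web) - servers
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
  NameError: name 'servers' is not defined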

At first, product management *did* let us get away with this.  But
after some time and usage of the pre-releases, they came to realize
that this was a major problem.  User's had gotten used to being able
to use these characters in their group names.  They had defined their
group names and gotten used to typing their group names and built up a
whole body of playbooks that used these group names

Product management still let us get away with this... sort of.  The
scope of the change was definitely modified.  Users were now allowed
to select whether invalid group names were disallowed (so they could
port their installations), allowed with a warning (presumably so they
could do work but also see that they were affected), or allowed without a
warning (presumably because they knew not to use these group names
with dotted attribute notation).  This feature was also no longer
allowed to be deprecated... We could have a warning that said "Don't
do this" but not remove the feature in the future.

Now... I said this was a config option.  So what we do have in the
release is that the config option allows but warns by default and *the
config option* has a deprecation warning.  You see... we're planning
on changing from warn by default now to disallowing by default in the
future so the deprecation is flagging the change in config value.

And you know what?  Users absolutely hate this.  They don't like the
warning.  They don't like the implication that they're doing something
wrong by using a long-standing feature.  They don't like that we're
going to change the default so that their current group names will
break.  They dislike that it's being warned about because of
attribute-lookup-notation which they can just learn not to use with
their group names.  They dislike this so much that some of us have
talked about abandoning this idea... instead, having a public group
name that users use when they write in the "DSL" and an internal group
name that we use when evaluating the group names. Perhaps that works,
perhaps it doesn't, but I think that's where my story starts being
specific to our feature and no longer applicable to Python and escape
sequences.

Now like I said, I don't know the discussions that led to invalid
escape sequences being deprecated, so I don't know whether there are more
compelling reasons for doing it, but it seems to me that there's even
less to gain by doing this than what we did in Ansible.  The thing
Ansible is complaining about can do the wrong thing when used in
conjunction with certain other features of our "DSL".  The thing that
the python escape sequence warning complains about is never invalid (as
was pointed out, it's complaining when a sequence of two characters
will do what the user intended rather than complaining when a sequence
of two characters will do something that the user did not intend).
Like the Ansible feature, though, the problem is that over time we've
discovered that it is hard to educate users about the exact
characteristics of the feature (\k is just a backslash and a k, but \n == newline;
groupvars['group-name']  works but groupvars.group-name does not) so
we've both given up on continuing to educate the users in favor of
attempting to nanny the user into not using the feature.  That most
emphatically has not worked for us and has spent a bunch of goodwill
with our users but the python userbase is not nece

Re: [Python-Dev] Bilingual scripts

2013-05-27 Thread Toshio Kuratomi
On Sat, May 25, 2013 at 05:57:28PM +1000, Nick Coghlan wrote:
> On Sat, May 25, 2013 at 5:56 AM, Barry Warsaw  wrote:
> > Have any other *nix distros addressed this, and if so, how do you solve it?
> 
> I believe Fedora follows the lead set by our own makefile and just
> appends a "3" to the script name when there is also a Python 2
> equivalent (thus ``pydoc3`` and ``pyvenv``). (I don't have any other
> system provided Python 3 scripts on this machine, though)
> 

Fedora is a bit of a mess... we try to work with upstream's intent when
upstream has realized this problem exists and have a single standard when
upstream does not.  The full guidelines are here:

http://fedoraproject.org/wiki/Packaging:Python#Naming

Here's the summary:

* If the scripts don't care whether they're running on py2 or py3, just use
  the base name and choose python2 as the interpreter for now (since we
  can't currently get rid of python2 on an end user system, that is the
  choice that brings in less dependencies).  ex: /usr/bin/pygmentize

* If the script does two different things depending on python2 or python3
  being the interpreter (note: this includes both bilingual scripts and
  scripts which have been modified by 2to3/exist in two separate versions)
  then we have to look at what upstream is doing:

- If upstream already deals with it (ex: pydoc3, easy_install-3.1) then we
use upstream's name.  We don't love this from an inter-package
consistency standpoint as there are other packages which append a
version for their own usage (is /usr/bin/foo-3.4 for python-3.4 or the
3.4 version of the foo package?) (And we sometimes have to do this
locally if we need to have multiple versions of a package with the
multiple versions having scripts... )  We decided to use upstream's name
if they account for this issue because it will match with upstream's
documentation and nothing else seemed as important in this instance.

- If upstream doesn't deal with it, then we use a "python3-" prefix.  This
matches with our package naming so it seemed to make sense.  (But
Barry's point about locate and tab completion and such would be a reason
to revisit this... Perhaps standardizing on /usr/bin/foo2-python3
[pathological case of having both package version and interpreter
version in the name.])

  - (tangent from a different portion of this thread: we've found that this
is a larger problem than we would hope.  There are some obvious ones
like
- ipython (implements a python interpreter so python2 vs python3 is
  understandably important and different).
- nosetests (the python source being operated on is run through the
  python interpreter so the version has to match).
- easy_install (needs to install python modules to the correct
  interpreter's site-packages.  It decides the correct interpreter
  according to which interpreter invoked it.)

But recently we found a new class of problems:  frameworks which are
bilingual.  For instance, if you have a web framework which has a
/usr/bin/django-admin script that can be used to quickstart a
project, run a python shell and automatically load your code, load your
ORM db schema and operate on it to make modifications to the db then
that script has to know whether your code is compatible with python2 or
python3.


> > It would be nice if we could have some cross-platform recommendations so
> > things work the same wherever you go.  To that end, if we can reach some
> > consensus, I'd be willing to put together an informational PEP and some
> > scripts that might be of general use.
> 
> It seems to me the existing recommendation to use ``#!/usr/bin/env
> python`` instead of referencing a particular binary already covers the
> general case. The challenge for the distros is that we want a solution
> that *ignores* user level virtual environments.
> 
> I think the simplest thing to do is just append the "3" to the binary
> name (as we do ourselves for pydoc) and then abide by the
> recommendations in PEP 394 to reference the correct system executable.
> 
I'd rather not have a bare 3 for the issues noted above.  Something like py3
would be better.

There's still room for confusion when distributions have to push multiple
versions of a package with scripts that fall into this category.  Should the
format be:

/usr/bin/foo2-py3  (My preference as it places the version next to the
thing that it's a version of.)

or

/usr/bin/foo-py3-2  (Confusing as the 2 is bare.  Something like
/usr/bin/foo-py3-v2 is slightly better but still not as nice as the
previous IMHO)

-Toshio


pgpOcm8nDJ4cG.pgp
Description: PGP signature
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Bilingual scripts

2013-05-28 Thread Toshio Kuratomi
On Tue, May 28, 2013 at 01:22:01PM -0400, Barry Warsaw wrote:
> On May 27, 2013, at 11:38 AM, Toshio Kuratomi wrote:
> 
> >- If upstream doesn't deal with it, then we use a "python3-" prefix.  This
> >matches with our package naming so it seemed to make sense.  (But
> >Barry's point about locate and tab completion and such would be a reason
> >to revisit this... Perhaps standardizing on /usr/bin/foo2-python3
> >[pathological case of having both package version and interpreter
> >version in the name.]
> 
> Note that the Gentoo example also takes into account versions that might act
> differently based on the interpreter's implementation.  So a -python3 suffix
> may not be enough.  Maybe now we're getting into PEP 425 compatibility tag
> territory.
> 
  This is an interesting, unmapped area in Fedora at the moment... I
was hoping to talk to Nick and the Fedora python maintainer at our next
Fedora conference.

I've been looking at how Fedora's ruby guidelines are implemented wrt
alternate interpreters and wondering if we could do something similar for
python:

https://fedoraproject.org/wiki/Packaging:Ruby#Different_Interpreters_Compatibility

I'm not sure yet how much of that I (or Nick and the python maintainer
[bkabrda, the current python maintainer is the one who wrote the rubypick
script]) would want to use in python -- replacing /usr/bin/python with a
script that chooses between CPython and pypy based on user preference gave
me an instinctual feeling of dread the first time I looked at it but it
seems to be working well for the ruby folks.

My current feeling is that I wouldn't use this same system for interpreters
which are not mostly compatible (for instance, python2 vs python3).  but I
also haven't devoted much actual time to thinking about whether that might
have some advantages.

-Toshio


pgpKoSrX0710o.pgp
Description: PGP signature
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Python 3 as a Default in Linux Distros

2013-07-24 Thread Toshio Kuratomi
Note: I'm the opposite number to bkabrda in the discussion on the Fedora
Lists about how quickly we should be breaking end-user expectations of what
"python" means.

On Wed, Jul 24, 2013 at 09:34:11AM -0400, Brett Cannon wrote:
> 
> 
> 
> On Wed, Jul 24, 2013 at 5:12 AM, Bohuslav Kabrda  wrote:
> 
> Hi all,
> in recent days, there has been a discussion on fedora-devel (see thread
> [1]) about moving to Python 3 as a default.
> I'd really love to hear opinions on the matter from the upstream, mainly
> regarding these two points (that are not that clearly defined in my
> original proposal and have been formed during the discussion):
> 
Note that the proposal is for Fedora 22.  So the timeframe for making the
switch in development is approximately 8 months from now.  Timeframe for
that release to be public is October 2014.

> - Should we point /usr/bin/python to Python 3 when we make the move?
> I know that pep 394 [2] deals with this and it says that /usr/bin/python
> may refer to Python 3 on some bleeding edge distributions - supposedly,
> this was added to the pep because of what Arch Linux did, not the other 
> way
> round.
> As the pep says, the recommendation of pointing /usr/bin/python to Python 
> 2
> may be changed after the Python 3 ecosystem is sufficiently mature. I'm
> wondering if there are any more specific criteria - list of big projects
> migrated/ported or something like that - or will this be judged by what 
> I'd
> call "overall spirit" in Python community (I hope you know what I mean by
> this)?
> In Fedora, we have two concerns that clash in this decision - being 
> "First"
> (e.g. actively promote and use new technologies and also suggest them to
> our users) vs. not breaking user expectations. So we figured it'd be a 
> good
> idea to ask upstream to get more opinions on this.
> 
> - What should user get after using "yum install python"?
> There are basically few ways of coping with this:
> 1) Just keep doing what we do, eventually far in the future drop "python"
> package and never provide it again (= go on only with python3/python4/...
> while having "yum install python" do nothing).
> 2) Do what is in 1), but when "python" is dropped, use virtual provide (*)
> "python" for python3 package, so that "yum install python" installs
> python3.
> 3), 4) Rename python to python2 and {don't add, add} virtual provide
> "python" in the same way that is in 1), 2)
>
4) Is my preference: the python package becomes python2 and a virtual
Provide: python means "yum install python" still gets you that package.
That's what I'd promote for now.  Users still
expect python2 when they talk about "python".  At some point in the future,
people will come to pycon and talks will apply to python3 unless otherwise
specified.  People writing new blog posts will say "python" and the code
they use as samples won't run on the python2 interpreter.  Expecting for
that to be the case in 12 months seems premature.

> 5) Rename python to python2 and python3 to python at one point. This makes
> sense to me from the traditional "one version in distro + possibly compat
> package shipping the old" approach in Linux, but some say that Python 2 
> and
> Python 3 are just different languages [3] and this should never be done.
> All of the approaches have their pros and cons, but generally it is all
> about what user should get when he tries to install python - either 
> nothing
> or python2 for now and python3 in future - and how we as a distro cope 
> with
> that on the technical side (and when we should actually do the switch).
> Just as a sidenote, IMO the package that gets installed as "python" (if
> any) should point to /usr/bin/python, which makes consider these two 
> points
> very closely coupled.
> 
> 
> A similar discussion broke out when Arch Linux switched python to point to
> python3. This led to http://www.python.org/dev/peps/pep-0394/ which says have
> python2/python3, and have python point at whatever makes the most sense to you
> based on your users and version uptake (option 3/4).

  I think bkabrda is looking for some clarification on PEP-394.  My
reading and participation in the previous discussions lead me to believe
that while PEP-394 wants to be diplomatic, the message it wants to get
across is:

1) warn distributions that what Arch was doing was premature.
2) provide a means to get them to switch at roughly the same time (when the
   recommendation in the PEP is flipped to suggest linking /usr/bin/python
   to /usr/bin/python3)

This is especially my reading from the Recommendations section of the PEP.
Unfortunately, we're getting stuck in the Abstract section which has this
bullet point:

* python should refer to the same target as python2 but may refer to python3
on some bleeding edge distributions

Knowing the history, I read this in two parts:
* Recommendation to dist

Re: [Python-Dev] Python 3 as a Default in Linux Distros

2013-07-24 Thread Toshio Kuratomi
On Wed, Jul 24, 2013 at 12:42:09PM -0400, Barry Warsaw wrote:
> On Jul 25, 2013, at 01:41 AM, Nick Coghlan wrote:
> 
> >How's this for an updated wording in the abstract:
> >
> >  * for the time being, all distributions should ensure that python
> >refers to the same target as python2
> >  * however, users should be aware that python refers to python3 on at
> >least Arch Linux (that change is
> >what prompted the creation of this PEP), so "python" should be
> >used in the shebang line only for
> >scripts that are source compatible with both Python 2 and 3
> 
> +1
> 
+1 as well.  Much clearer.

-Toshio


pgpgh3MdAJ43H.pgp
Description: PGP signature
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Python 3 as a Default in Linux Distros

2013-07-25 Thread Toshio Kuratomi
On Jul 24, 2013 6:37 AM, "Brett Cannon"  wrote:
> The key, though, is adding python2 and getting your code to use that
binary  specifically so that shifting the default name is more of a
convenience than something which might break existing code not ready for
the switch.
>
Applicable to this, does anyone know whether distutils, setuptools,
distlib, or any of the other standard build+install tools are doing shebang
rewriting?  Are they doing the right thing wrt python vs python2?

-Toshio
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Python 3 as a Default in Linux Distros

2013-07-25 Thread Toshio Kuratomi
On Thu, Jul 25, 2013 at 10:25:26PM +1000, Nick Coghlan wrote:
> On 25 July 2013 20:38, Toshio Kuratomi  wrote:
> >
> > On Jul 24, 2013 6:37 AM, "Brett Cannon"  wrote:
> >> The key, though, is adding python2 and getting your code to use that
> >> binary  specifically so that shifting the default name is more of a
> >> convenience than something which might break existing code not ready for 
> >> the
> >> switch.
> >>
> > Applicable to this, does anyone know whether distutils, setuptools, distlib,
> > or any of the other standard build+install tools are doing shebang
> > rewriting?  Are they doing the right thing wrt python vs python2?
> 
> It occurs to me they're almost certainly using "sys.executable" to set
> the shebang line, so probably not :(
> 
> distutils-sig could probably offer a better answer though, most of the
> packaging folks don't hang out here.
> 
Thanks!

For other Linux distributors following along, here's my message to
distutils-sig:

http://mail.python.org/pipermail/distutils-sig/2013-July/022001.html
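
For reference, a rough sketch (not the actual distutils source, and the helper
name is made up) of the sys.executable-based shebang rewriting that
build_scripts-style tools perform, which is why installed scripts end up
pointing at whatever interpreter ran setup.py:

    import sys

    def rewrite_shebang(script_path):
        # roughly what distutils' build_scripts command does when it copies a script
        with open(script_path) as f:
            first, rest = f.readline(), f.read()
        if first.startswith("#!") and "python" in first:
            first = "#!%s\n" % sys.executable   # e.g. /usr/bin/python, a virtualenv, ...
        with open(script_path, "w") as f:
            f.write(first + rest)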

-Toshio


pgpRDSrmks3t3.pgp
Description: PGP signature
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Offtopic: OpenID Providers

2013-09-05 Thread Toshio Kuratomi
On Thu, Sep 05, 2013 at 02:53:43PM -0400, Barry Warsaw wrote:
> 
> This probably isn't the only application of these technologies, but I've
> always thought about OAuth as delegating authority to scripts and programs to
> act on your behalf.  For example, you can write a script to interact with
> Launchpad's REST API, but before you can use the script, you have to interact
> with the web ui once (since your browser is trusted, presumably) to receive a
> token which the script can then use to prove that it's acting on your behalf.
> If at some point you stop trusting that script, you can revoke the token to
> disable its access, without having to reset your password.
> 
> To me, OpenID is about logging into web sites using single-sign on.  For
> example, once I've logged into Launchpad, I can essentially go anywhere that
> accepts OpenID, type my OpenID and generally not have to log in again (things
> like two-factor auth and such may change that interaction pattern).
> 
> Or to summarize to a rough approximation: OpenID is for logins, OAuth is for
> scripts.
> 
> Persona seems to fit the OpenID use case.  You'd still want OAuth for
> scripting.
> 
  However, in some cases, Persona/OpenID can make more sense for
scripts.  For instance, if you have a script that is primarily interactive
in nature, it may be better to have the user login via that script than to
have an OAuth token laying around on the filesystem all the time
(Contrariwise, if the script is primarily run from cron or similar, it's
better to have a token with limited permissions laying around on the
filesystem than your OpenID password ;-)

It's probably also useful to point out that OAuth (because it was developed
to let third party websites have limited permission to act on your behalf)
is more paranoid than strictly required for many scripts where that
"third-party" is a script that you've written running on a box that you
control.  If that's the main use case for your service, OAuth may not be
a good fit for your authz needs.

-Toshio


pgpK1y9fAvp9j.pgp
Description: PGP signature
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Offtopic: OpenID Providers

2013-09-05 Thread Toshio Kuratomi
On Thu, Sep 05, 2013 at 10:25:22PM +0400, Oleg Broytman wrote:
> On Thu, Sep 05, 2013 at 02:16:29PM -0400, Donald Stufft  
> wrote:
> > 
> > On Sep 5, 2013, at 2:12 PM, Oleg Broytman  wrote:
> > >   I used to use myOpenID and became my own provider using poit[1].
> > > These days I seldom use OpenID -- there are too few sites that allow
> > > full-featured login with OpenID. The future lies in OAuth 2.0.
> > 
> > The Auth in OAuth stands for Authorization not Authentication.
> 
>There is no authorization without authentication, so OAuth certainly
> performs authentication: http://oauth.net/core/1.0a/#anchor9 ,
> http://tools.ietf.org/html/rfc5849#section-3
> 
Sort of.  The way OAuth looks to me, it's designed to prove that a given
client is authorized to perform an action.  It's not designed to prove that
the given client is a specific person.  In some cases, you really want to
know the latter and not merely the former.  So I think in these situations
Donald's separation of Authz and Authn makes sense.

-Toshio


pgppLjnxYjd1p.pgp
Description: PGP signature
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Offtopic: OpenID Providers

2013-09-09 Thread Toshio Kuratomi
On Thu, Sep 5, 2013 at 6:09 PM, Stephen J. Turnbull  wrote:
> Barry Warsaw writes:

>  > We're open source, and I think it benefits our mission to support open,
>  > decentralized, and free systems like OpenID and Persona.
>
> Thus speaks an employee of yet another Provider-That-Won't-Accept-My-
> Third-Party-Credentials.  Sorry, Barry, but you see the problem:
> Unfortunately, we can't do it alone.  What needs to happen is there
> needs to be a large network of sites that support login via O-D-F
> systems like OpenID and Persona.  Too many of the sites I use (news
> sources, GMail, etc) don't support them and my browser manages my
> logins to most of them, so why bother learning OpenID, and then
> setting it up site by site?
>
[snipped lots of observations that I generally agree with]

There's been a lot of negativity towards OpenID in this thread -- I'd
like to say that in Fedora Infrastructure we've found OpenID to be
very very good -- but not at addressing the problem that most people
are after here.  As you've observed, being an OpenID provider is a
relatively easy to swallow proposition; accepting OpenID from third
parties is another thing entirely.  As you've also observed, this has
to do with trust.  A site can trust their own account system and
practices and issue OpenID based on those.  It is much riskier for the
site to trust someone else's account system and practices when
deciding whether a user is actually the owner of the account that they
claim.

So OpenID fails as a truly generic SSO method across sites on the
internet... what have we found it good for then?  SSO within our site.
 More and more apps support OpenID out of the box.  Many web
frameworks have modules for the code you write to authenticate against
an OpenID server.  A site configures these apps and modules to only
trust the site's OpenID service and then deploys them with less custom
code.  Sites also get a choice about how much risk they consider
compromised accounts to a particular application.  If they run a web
forum and a build system for instance, they might constrain the build
system to only their OpenID service but allow the forum to allow
OpenID from other providers. And finally, having an openid service
lets their users sign into more trusting sites like python.org
properties (unlike say, LDAP) :-)

-Toshio
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] non-US zip archives support in zipfile.py

2013-10-17 Thread Toshio Kuratomi
On Tue, Oct 15, 2013 at 03:46:15PM +0200, "Martin v. Löwis" wrote:
> Am 15.10.13 14:49, schrieb Daniel Holth:
> > It is part of the ZIP specification. CP437 or UTF-8 are the two
> > official choices, but other encodings happen on Russian, Japanese
> > systems.
> 
> Indeed. Formally, the other encodings are not supported by the
> ZIP specification, and are thus formally misuse of the format.
> 
  But the tools in the wild misuse the format in this manner.
CP437 can encode any byte so zip and unzip on Linux, for instance, take the
bytes that represent the filename on the filesystem and use those in the zip
file without setting the utf-8 flag.  When the files are extracted, the same
byte sequences are used as the filenames for the new files.
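
As a sketch of what that looks like from the Python 3 side (the archive name
and the koi8-r guess are just placeholders), zipfile decodes unflagged names
as cp437, so the original bytes can be recovered by encoding them back:

    import zipfile

    with zipfile.ZipFile("archive.zip") as zf:          # hypothetical archive
        for info in zf.infolist():
            if not info.flag_bits & 0x800:              # UTF-8 flag not set
                raw = info.filename.encode("cp437")     # recover the original bytes
                print(raw.decode("koi8-r", "replace"))  # e.g. a Russian system encoding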

> I believe (without having proof) that early versions of the
> specification failed to discuss the file name encoding at all,
>
These might be helpful:

No mention of file name encodings in this version of the spec:
http://www.pkware.com/documents/APPNOTE/APPNOTE-6.2.2.TXT

Appendix D, Language Encoding, shows up here:
http://www.pkware.com/documents/APPNOTE/APPNOTE-6.3.0.TXT

(Most recent version is 6.3.2)

> making people believe that it is unspecified and always the
> system encoding (which is useless, of course, as you create
> zip files to move them across systems).
>
Not always.  Backups are another use.  Also it's not useless.  If the files
are being moved within an organization (or in some cases geographical
regions have standardized on an encoding in practice), the same system
encoding could very well be in use on the machines where the files end up.

-Toshio


pgp9rjopytsng.pgp
Description: PGP signature
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Toshio Kuratomi
On Jun 14, 2016 8:32 AM, "Joao S. O. Bueno"  wrote:
>
> On 14 June 2016 at 12:19, Steven D'Aprano  wrote:
> > Is there
> > a good reason for returning bytes?
>
> What about: it returns 0-255 numeric values for each position in  a
stream, with
> no clue whatsoever to how those values map to text characters beyond
> the 32-128 range?
>
> Maybe base64.decode could take a "encoding" optional parameter - or
> there could  be
> a separate 'decote_to_text" method that would explicitly take a text
codec name.
> Otherwise, no, you simply can't take a bunch of bytes and say they
> represent text.
>
Although it's not explicit, the question seems to be about the output of
encoding (and for symmetry, the input of decoding).  In both of those
cases, valid output will consist only of ascii characters.

The input to encoding would have to remain bytes (that's the main purpose
of base64... to turn bytes into an ascii string).
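
A small illustration of that asymmetry, using only the stdlib (the payload is
arbitrary):

    import base64

    encoded = base64.b64encode(b"\x00\xff arbitrary bytes")  # bytes in ...
    print(encoded)                    # ... bytes out, but guaranteed pure ASCII
    print(encoded.decode("ascii"))    # so decoding to str never fails here
    original = base64.b64decode(encoded)                     # and back to the raw bytes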

-Toshio
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 538: Coercing the legacy C locale to a UTF-8 based locale

2017-05-04 Thread Toshio Kuratomi
On Sat, Mar 4, 2017 at 11:50 PM, Nick Coghlan  wrote:
>
> Providing implicit locale coercion only when running standalone
> ---
>
> Over the course of Python 3.x development, multiple attempts have been made
> to improve the handling of incorrect locale settings at the point where the
> Python interpreter is initialised. The problem that emerged is that this is
> ultimately *too late* in the interpreter startup process - data such as
> command
> line arguments and the contents of environment variables may have already
> been
> retrieved from the operating system and processed under the incorrect ASCII
> text encoding assumption well before ``Py_Initialize`` is called.
>
> The problems created by those inconsistencies were then even harder to
> diagnose
> and debug than those created by believing the operating system's claim that
> ASCII was a suitable encoding to use for operating system interfaces. This
> was
> the case even for the default CPython binary, let alone larger C/C++
> applications that embed CPython as a scripting engine.
>
> The approach proposed in this PEP handles that problem by moving the locale
> coercion as early as possible in the interpreter startup sequence when
> running
> standalone: it takes place directly in the C-level ``main()`` function, even
> before calling in to the `Py_Main()`` library function that implements the
> features of the CPython interpreter CLI.
>
> The ``Py_Initialize`` API then only gains an explicit warning (emitted on
> ``stderr``) when it detects use of the ``C`` locale, and relies on the
> embedding application to specify something more reasonable.
>

It feels like having a short section on the caveats of this approach
would help to introduce this section.  Something that says that this
PEP can cause a split in how Python behaves in non-standalone
applications (mod_wsgi, IDEs where libpython is compiled in, etc) vs
standalone (unless the embedders take similar steps as standalone
python is doing).  Then go on to state that this approach was still
chosen as coercing in Py_Initialize is too late, causing the
inconsistencies and problems listed here.

-Toshio
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Re: pre-PEP: Unicode Security Considerations for Python

2021-11-01 Thread Toshio Kuratomi
This is an excellent enumeration of some of the concerns!

One minor comment about the introductory material:

On Mon, Nov 1, 2021 at 5:21 AM Petr Viktorin  wrote:

> >
> > Introduction
> > 
> >
> > Python code is written in `Unicode`_ – a system for encoding and
> > handling all kinds of written language.

Unicode specifies the mapping of characters to code points.  Then a second
mapping from code points to sequences of bytes is what is actually
recorded by the computer.  The second mapping is what programmers
using Python will commonly think of as the encoding while the majority
of what you're writing about has more to do with the first mapping.
I'd try to word this in a way that doesn't lead a reader to conflate
those two mappings.
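
To make that distinction concrete, a minimal illustration of the two separate
mappings (the character choice is arbitrary):

    ch = "\N{LATIN SMALL LETTER E WITH ACUTE}"  # character ...
    print(hex(ord(ch)))                         # ... to code point: 0xe9 (U+00E9)
    print(ch.encode("utf-8"))                   # code point to bytes: b'\xc3\xa9'
    print(ch.encode("latin-1"))                 # same code point, different bytes: b'\xe9'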

Maybe something like this?

  `Unicode`_ is a system for handling all kinds of written language.
It aims to allow any character from any human natural language (as
well as a few characters which are not from natural languages) to be
used. Python code may consist of almost all valid Unicode characters.

> > While this allows programmers from all around the world to express 
> > themselves,
> > it also allows writing code that is potentially confusing to readers.
> >

-Toshio
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/Q2T3GKC6R6UH5O7RZJJNREG3XQDDZ6N4/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Are "Batteries Included" still a Good Thing? [was: It's now time to deprecate the stdlib urllib module]

2022-03-28 Thread Toshio Kuratomi
On Sun, Mar 27, 2022, 11:07 AM Paul Moore  wrote:

> On Sun, 27 Mar 2022 at 17:11, Christopher Barker 
> wrote:
> > Back to the topic at hand, rather than remove urllib, maybe it could be
> made better -- an as-easy-to-use-as-requests package in the stdlib would be
> really great.
>
> I think that's where the mistake happens, though. Someone who needs
> "best of breed" is motivated (and likely knowledgeable enough) to make
> informed decisions about what's on PyPI. But someone who just wants to
> get the job done probably doesn't - and that's the audience for the
> stdlib. A stdlib module needs to be a good, reliable set of basic
> functionality that non-experts can use successfully. There can be
> better libraries on PyPI, but that doesn't mean the stdlib module is
> unnecessary, nor does it mean that the stdlib has to match the PyPI
> library feature for feature.
>
> So here, specifically, I'd rather see urlllib be the best urlllib it
> can be, and not demand that it turn into requests. Requests is there
> if people need/want it (as is httpx, and urllib3, and aiohttp). But
> urllib is for people who want to get a file from the web, and *not*
> have to deal with dependencies, 3rd party libraries, etc.
>


One thing about talking about "make urllib more like requests" that is
different than any of the other libs, though, is that requests aims to be
easier to use than anything else (which I note Chris Barker called out as
why he wanted urllib to be more like it).  I think that's important to
think about because I think ease of use is also the number one thing that
the audience you talk about is looking for.

Of course, figuring out whether an API like requests' is actually easier to
use than urllib or merely more featureful is open to debate.
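
For context, this is the sort of difference people usually point to (the URL
is a placeholder, and the requests call is commented out since it's a
third-party import):

    import json, urllib.request

    # stdlib urllib:
    with urllib.request.urlopen("https://example.com/api") as resp:
        data = json.load(resp)

    # third-party requests:
    # import requests
    # data = requests.get("https://example.com/api").json()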

-toshio
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/4ZQC4H7HD3UXFT3CONU64YPOQBSPUTVY/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Are "Batteries Included" still a Good Thing? [was: It's now time to deprecate the stdlib urllib module]

2022-03-30 Thread Toshio Kuratomi
On Tue, Mar 29, 2022, 10:55 AM Brett Cannon  wrote:

>
>
> On Tue, Mar 29, 2022 at 8:58 AM Ronald Oussoren 
> wrote:
>
>>
>>
>> On 29 Mar 2022, at 00:34, Brett Cannon  wrote:
>>
>>
>>
>> On Mon, Mar 28, 2022 at 11:52 AM Christopher Barker 
>> wrote:
>>
>>> On Mon, Mar 28, 2022 at 11:29 AM Paul Moore  wrote:
>>>


>> Having such a policy is a good thing and helps in evolving the stdlib,
>> but I wonder if the lack of such a document is the real problem.   IMHO the
>> main problem is that the CPython team is very small and therefore has
>> little bandwidth for maintaining, let alone evolving, large parts of the
>> stdlib.  In that it doesn’t help that some parts of the stdlib have APIs
>> that make it hard to make modifications (such as distutils where
>> effectively everything is part of the public API).  Shrinking the stdlib
>> helps in the maintenance burden, but feels as a partial solution.
>>
>
> You're right that is the fundamental problem. But for me this somewhat
> stems from the fact that we don't have a shared understanding of what the
> stdlib *is*,  and so the stdlib is a bit unbounded in its size and scope.
> That leads to a stdlib which is hard to maintain. It's just like dealing
> with any scarce resource: you try to cut back on your overall use as best
> as you can and then become more efficient with what you must still consume;
> I personally think we don't have an answer to the "must consume" part of
> that sentence that leads us to "cut back" to a size we can actually keep
> maintained so we don't have 1.6K open PRs
> .
>

One of the things that's often missed in discussions is that a *good*
policy document can also help grow the number of maintainers.

As just one example, I found two interesting items in the discussion
started by Skip about determining what modules don't have maintainers, just
downstream of this. (1) There's a file which matches maintainers to modules
in the stdlib (this is documented but I only found out about it a few years
ago and Skip, who's been around even longer than me, didn't know about it
either... So this means something about how our policy docs are currently
structured could be improved).  (2) Terry brought up that you don't have to
be a core maintainer in order to take up ownership of something in the
stdlib. That's awesome!  But this is definitely something I didn't know.
I've been "focusing"[1] on becoming a core maintainer because I thought
that was a prerequisite to getting anything done in the stdlib. Knowing
that getting involved with stdlib maintenance is different could be vastly
helpful.

[1] focusing is the wrong word... It expresses the feeling of "directed
action" correctly but doesn't convey the lack of activity that sprinkles my
attempts.  Nor does it account for discouragement, helplessness, and
imposter-y feelings which are the reasons for that lack.

-toshio
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/GBNDUQXWTBGCP5243L4HUU5UVLKQ7UWB/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: [python-committers] Resignation from Stefan Krah

2020-10-09 Thread Toshio Kuratomi
On Fri, Oct 9, 2020, 5:30 AM Christian Heimes  wrote:

> On 09/10/2020 04.04, Ivan Pozdeev via Python-Dev wrote:
> > I don't see the point of requiring to "write an apology", especially
> > *before a 12-month ban*. If they understand that their behavior is
> > wrong, there's no need for a ban, at least not such a long one; if they
> > don't, they clearly aren't going to write it, at least not now (they
> > might later, after a few weeks or months, having cooled down and thought
> > it over). So all it would achieve is public shaming AFAICS. Same issue
> > with the threat of "zero tolerance policy" -- it's completely
> > unnecessary and only serves to humiliate and alienate the recipient.
>
>
> I have been the victim of Stefan's CoC violations on more than one
> occasion. He added me to nosy list of a ticket just to offend and
> humiliate me. For this reason I personally asked the SC to make a
> sincere apology a mandatory requirement for Stefan's reinstatement as a
> core dev.
>
> I would have been fine with a private apology. However Stefan has also
> verbally attacked non-core contributors. In one case another core dev
> and I contacted the contribute in private to apologize and ensure that
> the contributor was not alienated by Stefan's attitude. Therefore it
> makes sense that the SC has requested a public, general apology.
>
> Why are you more concerned with the reputation of a repeated offender
> and not with the feelings of multiple victims of harassment? As a victim
> of Stefan's behavior I feel that an apology is the first step to
> reconcile and rebuild trust.
>

At the risk of putting my nose in where it doesn't belong... I think that
Ivan has some good general points.  And I think that they could be
distilled as this: if you are looking to correct bad behavior but allow a
contributor to learn about proper behavior and then return to the
community, the steps taken here seem counter-productive (1).  I would add a
second piece to that: If, on the other hand, the goal is to remove a toxic
person from the community who needs to go through major personality-shifting
changes before they can be allowed back, then this may be appropriate (2).

For (1), what I'm getting from Ivan's post is that these measures are at a
level that few (if any) people would be willing to fulfill and then
come back to be a non-bitter contributor. When the requirements are too
costly for the violator to pay, they won't be able to learn and then pay
those costs until they can disavow their former selves.  "I'm sorry I acted
like that; I was a *different person* back then. I'm sorry that *past me*
felt the need to hurt you."

I would think that in general, not necessarily this specific case, the
steering committee would want to try taking steps to get people to learn
proper behavior first and only resort to something which amounts to a de
facto permanent ban when it becomes apparent that the person has to go
through some major personality changes before their behavior will change.

For (2), the steering committee is charged with protecting the community at
large. A toxic person can cause great havoc by themselves and set the tone
of a community such that other people feel that engaging in bad behavior is
the proper thing to do in this community.  With that in mind, at some
point, this kind of action has to be on the table.  It is great that pep-13
lists banning as a possibility so that people know where their actions can
lead.

One thing I would suggest, though, is documenting and, in general,
following a sequence of progressively more strict interventions by the
steering committee.  I think that just as it is harmful to the community to
let bad behavior slide, it is also harmful to the community to not know
that the steering committee's enforcement is in measured steps which will
telegraph the committee's intentions and the member's responsibilities well
in advance.

This specific case may already have been out of hand by the time it came to
the committee (the steering committee is relatively new and problems could
have festered before they formed and started governing), but a new member of
the community should know that if they step out of line, the committee will
make it apparent to them what the expectations are and whether their
ongoing behavior is putting them onto a disciplinary track well before that
discipline gets to the point of a one year ban and a public apology.

Thanks for reading,
-Toshio

>
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/IDFQDRHRA2JJ6OJAK2265UHCBEI45PIM/
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-Dev] Ext4 data loss

2009-03-12 Thread Toshio Kuratomi
Antoine Pitrou wrote:
> Steven D'Aprano  pearwood.info> writes:
>> It depends on what you mean by "temporary".
>>
>> Applications like OpenOffice can sometimes recover from an application 
>> crash or even a systems crash and give you the opportunity to restore 
>> the temporary files that were left lying around.
> 
> For such files, you want deterministic naming in order to find them again, so
> you won't use the tempfile module...
> 
Something that doesn't require deterministically named tempfiles was Ted
Ts'o's explanation linked to earlier.

read data from important file
modify data
create tempfile
write data to tempfile
*sync tempfile to disk*
mv tempfile to filename of important file

The sync is necessary to ensure that the data is written to the disk
before the tempfile is renamed over the old file.
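
In code, the sequence looks roughly like this (a sketch only; the function
name is made up and error handling is minimal):

    import os, tempfile

    def atomic_write(path, data):
        # create the tempfile in the same directory so the final rename stays on
        # one filesystem and is atomic
        fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
        try:
            with os.fdopen(fd, "wb") as f:
                f.write(data)
                f.flush()
                os.fsync(f.fileno())      # *sync tempfile to disk*
            os.rename(tmp, path)          # mv tempfile over the important file
        except BaseException:
            os.unlink(tmp)
            raise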

-Toshio



signature.asc
Description: OpenPGP digital signature
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Ext4 data loss

2009-03-12 Thread Toshio Kuratomi
Martin v. Löwis wrote:
>> Something that doesn't require deterministicly named tempfiles was Ted
>> T'so's explanation linked to earlier.
>>
>> read data from important file
>> modify data
>> create tempfile
>> write data to tempfile
>> *sync tempfile to disk*
>> mv tempfile to filename of important file
>>
>> The sync is necessary to ensure that the data is written to the disk
>> before the old file overwrites the new filename.
> 
> You still wouldn't use the tempfile module in that case. Instead, you
> would create a regular file, with the name base on the name of the
> important file.
> 
Uhm... why?  The requirements are:

1) lifetime of the temporary file is in control of the app
2) filename is available to the app so it can move it after data is written
3) temporary file can be created on the same filesystem as the important
file.

All of those are doable using the tempfile module.
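
For instance (a sketch; the path is hypothetical):

    import os, tempfile

    important = "/path/to/important_file"
    tmp = tempfile.NamedTemporaryFile(
        dir=os.path.dirname(important),  # 3) same filesystem as the important file
        delete=False)                    # 1) lifetime stays under the app's control
    print(tmp.name)                      # 2) the name is available for the later mv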

-Toshio



signature.asc
Description: OpenPGP digital signature
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Ext4 data loss

2009-03-12 Thread Toshio Kuratomi
Martin v. Löwis wrote:
 The sync is necessary to ensure that the data is written to the disk
 before the old file overwrites the new filename.
>>> You still wouldn't use the tempfile module in that case. Instead, you
>>> would create a regular file, with the name base on the name of the
>>> important file.
>>>
>> Uhm... why?
> 
> Because it's much easier not to use the tempfile module, than to use it,
> and because the main purpose of the tempfile module is irrelevant to
> the specific application; the main purpose being the ability to
> auto-delete the file when it gets closed.
> 
auto-delete is one of the nice features of tempfile.  Another feature
which is entirely appropriate to this usage, though, is creation
of a non-conflicting filename.

-Toshio



signature.asc
Description: OpenPGP digital signature
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Ext4 data loss

2009-03-12 Thread Toshio Kuratomi
Martin v. Löwis wrote:
>> auto-delete is one of the nice features of tempfile.  Another feature
>> which is entirely appropriate to this usage, though, though, is creation
>> of a non-conflicting filename.
> 
> Ok. In that use case, however, it is completely irrelevant whether the
> tempfile module calls fsync. After it has generated the non-conflicting
> filename, it's done.
>
If you're saying that it shouldn't call fsync automatically I'll agree
to that.  The message thread I was replying to seemed to say that
tempfiles didn't need to support fsync because they will be useless
after a system crash.  I'm just refuting that by showing that it is
useful to call fsync on tempfiles as one of the steps in preserving the
data in another file.

-Toshio



signature.asc
Description: OpenPGP digital signature
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Integrate BeautifulSoup into stdlib?

2009-03-24 Thread Toshio Kuratomi
Stephen J. Turnbull wrote:
> Chris Withers writes:
>
>  > - debian has an outdated and/or broken version of your package.
>
> True, but just as for the package system you are advocating, it's
> quite easy to set up your apt to use third-party repositories of
> Debian-style packages.  The question is whether those repositories
> exist.  Introducing yet another, domain-specific package manager will
> make it less likely that they do, and it will cause more work for
> downstream distributors like Debian and RH.
>
I haven't seen this mentioned so --

For many sites (including Fedora, the one I work on), the site maintains
a local yum/apt repository of packages that are necessary for getting
certain applications to run.  This way we are able to install a system
with a distribution that is maintained by other people and have local
additions that add more recent versions only where necessary.  This has
the following advantages:

1) We're able to track our changes to the base OS.
2) If the OS vendor releases an update that includes our fixes, we're
able to consume it without figuring out on which boxes we have to delete
what type of locally installed file (egg, jar, gem,
/usr/local/bin/program, etc).
3) We're using the OS vendor package management system for everything so
junior system admins can bootstrap a new machine with only familiarity
with that OS.  We don't have to teach them about rpm + eggs + gems +
where to find our custom repositories of each.
4) If we choose to, we can separate out different repositories for
different sets of machines.  Currently we have the main local repo and
one repo that only the builders pull from.

-Toshio



signature.asc
Description: OpenPGP digital signature
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Integrate BeautifulSoup into stdlib?

2009-03-24 Thread Toshio Kuratomi
Steve Holden wrote:

> Seems to me that while all this is fine for developers and Python users
> it's completely unsatisfactory for people who just want to use Python
> applications. For them it's much easier if each application comes with
> all dependencies including the interpreter.
> 
> This may seem wasteful, but it removes many of the version compatibility
> issues that otherwise bog things down.
> 
The upfront cost of bundling is lower but the maintenance cost is
higher.  For instance, OS vendors have developed many ways of being
notified of and dealing with security issues.  If there's a security
issue with gtkmozdev and the python bindings to it have to be
recompiled, OS vendors will be alerted to it and have the opportunity to
release updates on zero day, the day that the security announcement goes
out.

Bundled applications suffer in three ways here:
1) the developers of the applications are unlikely to be on vendor-sec
and so the opportunity for zero day fixes is lower.

2) the developer becomes responsible for fixing problems with the
libraries, something that they often do not do.  This is especially true
when developers start depending, not only on newer features of some
libraries, but also on older versions of others (for API changes).  It's not
clear to many developers that requiring a newer version of a library is
at least supported by upstream whereas requiring an older version leaves
them as the sole responsible party.

3) Over time, bundled libraries tend to become forked versions.  And
worse, privately forked versions.  If three python apps all use slightly
different older versions of libfoo-python and have backported fixes,
added new features, etc it is a nightmare for a system administrator or
packager to get them running with a single version from the system
library or forward port them.  And because they're private forks the
developers lose out on collaborating on security, bugfixes, etc because
they are doing their work in isolation from the other forks.

-Toshio



signature.asc
Description: OpenPGP digital signature
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Integrate BeautifulSoup into stdlib?

2009-03-24 Thread Toshio Kuratomi
David Cournapeau wrote:
> 2009/3/24 Toshio Kuratomi :
>> Steve Holden wrote:
>>
>>> Seems to me that while all this is fine for developers and Python users
>>> it's completely unsatisfactory for people who just want to use Python
>>> applications. For them it's much easier if each application comes with
>>> all dependencies including the interpreter.
>>>
>>> This may seem wasteful, but it removes many of the version compatibility
>>> issues that otherwise bog things down.
>>>
>> The upfront cost of bundling is lower but the maintenance cost is
>> higher.  For instance, OS vendors have developed many ways of being
>> notified of and dealing with security issues.  If there's a security
>> issue with gtkmozdev and the python bindings to it have to be
>> recompiled, OS vendors will be alerted to it and have the opportunity to
>> release updates on zero day, the day that the security announcement goes
>> out.
> 
> I don't think bundling should be compared to depending on the system
> libraries, but as a lesser evil compared to requiring multiple,
> system-wide installed libraries.
> 
Well.. I'm not so sure it's even a win there.  If the libraries are
installed system-wide, at least the consumer of the application knows:

1) Where to find all the libraries to audit the versions when a security
issue is announced.
2) That the library is unforked from upstream.
3) That all the consumers of the library version have a central location
to collaborate on announcing fixes to the library.

With my distribution packager hat on, I can say I dislike both multiple
versions and bundling but I definitely dislike bundling more.

>> 3) Over time, bundled libraries tend to become forked versions.  And
>> worse, privately forked versions.  If three python apps all use slightly
>> different older versions of libfoo-python and have backported fixes,
>> added new features, etc it is a nightmare for a system administrator or
>> packager to get them running with a single version from the system
>> library or forward port them.  And because they're private forks the
>> developers lose out on collaborating on security, bugfixes, etc because
>> they are doing their work in isolation from the other forks.
> 
> This is a purely technical problem, and can be handled by good source
> control systems, no ?
> 
No.  This is a social problem.  Good source control only helps if I am
tracking upstream's trunk so I'm aware of the direction that their
changes are headed.  But there's a wide range of reasons that
application developers that bundle libraries don't do that:

1) not enough time in a day.  I'm working full-time on making my
application better.  Plus I have to update all these bundled libraries
from time to time, testing that the updates don't break anything.  I
don't have time to track trunk for all these libraries -- I barely have
time to track releases.

2) My release schedule doesn't mesh with all of the upstream libraries
I'm bundling.  When I want to release Foo-1.0, I want to have some
assurance that the libraries I'm bundling with will do the right thing.
   Since releases see more testing than trunk, tracking trunk for twenty
bundled libraries is a lot less attractive than tracking release branches.

3) This doesn't help with the fact that my bundled version of the
library and your bundled version of the library are being developed in
isolation from each other.  This needs central coordination, which people
who believe in bundling libraries are very unlikely to pursue.

-Toshio



signature.asc
Description: OpenPGP digital signature
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Integrate BeautifulSoup into stdlib?

2009-03-24 Thread Toshio Kuratomi
Tres Seaver wrote:
> David Cournapeau wrote:
>>> I am afraid that distutils, and
>>> setuptools, are not really the answer to the problem, since while they
>>> may (as intended) guarantee that Python applications can be installed
>>> uniformly across different platforms they also more or less guarantee
>>> that Python applications are installed differently from all other
>>> applications on the platform.
>> I think they should be part of the solution, in the sense that they
>> should allow easier packaging for the different platforms (linux,
>> windows, mac os x and so on). For now, they make things much harder
>> than they should (difficult to follow the FHS, etc...).
> 
> FHS is something which packagers / distributors care about:  I strongly
> doubt that the "end users" will ever notice, particularly for silliness
> like 'bin' vs. 'sbin', or architecture-specific vs. 'noarch' rules.
> 
That's because you're thinking of a different class of end-user than FHS
 is targeting.  Someone who wants to install a web application on a
limited number of machines (one in the home-desktop scenario) or someone
who makes their living helping people to install the software they've
written has a whole different view on things than someone who's trying
to install and maintain the software on fifteen computer labs in a
campus or the person who is trying to write software that is portable to
tens of different platforms in their spare time and every bit of
answering end user's questions, tracking other upstreams for security
bugs, etc, is time taken away from coding.

Following FHS means that the software will work for both "end-users" who
don't care about the nitty-gritty of the FHS and system administrators
of large sites.  Disregarding the FHS because it is "silliness" means
that system administrators are going to have to special-case your
application, decide not to install it at all, or pay someone else to
support it.

Note that those things do make sense sometimes.  For instance, when an
application is not intended to be distributed to a large number of
outside entities (facebook, flikr, etc) or when your revenue stream is
making money from installing and administering a piece of software for
other companies.

-Toshio



signature.asc
Description: OpenPGP digital signature
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Integrate BeautifulSoup into stdlib?

2009-03-24 Thread Toshio Kuratomi
David Cournapeau wrote:
> On Wed, Mar 25, 2009 at 1:45 AM, Toshio Kuratomi 
> wrote:
>> David Cournapeau wrote:
>>> 2009/3/24 Toshio Kuratomi :
>>>> Steve Holden wrote:
>>>>
>>>>> Seems to me that while all this is fine for developers and Python
>>>>> users
>>>>> it's completely unsatisfactory for people who just want to use Python
>>>>> applications. For them it's much easier if each application comes with
>>>>> all dependencies including the interpreter.
>>>>>
>>>>> This may seem wasteful, but it removes many of the version
>>>>> compatibility
>>>>> issues that otherwise bog things down.
>>>>>
>>>> The upfront cost of bundling is lower but the maintenance cost is
>>>> higher.  For instance, OS vendors have developed many ways of being
>>>> notified of and dealing with security issues.  If there's a security
>>>> issue with gtkmozdev and the python bindings to it have to be
>>>> recompiled, OS vendors will be alerted to it and have the
>>>> opportunity to
>>>> release updates on zero day, the day that the security announcement
>>>> goes
>>>> out.
>>> I don't think bundling should be compared to depending on the system
>>> libraries, but as a lesser evil compared to requiring multiple,
>>> system-wide installed libraries.
>>>
>> Well.. I'm not so sure it's even a win there.  If the libraries are
>> installed system-wide, at least the consumer of the application knows:
>>
>> 1) Where to find all the libraries to audit the versions when a security
>> issue is announced.
>> 2) That the library is unforked from upstream.
>> 3) That all the consumers of the library version have a central location
>> to collaborate on announcing fixes to the library.
>
> Yes, those are problems, but installing multi libraries have a lot of
> problems too:
>  - quickly, by enabling multiple version installed, people become very
> sloppy to handle versions of the dependencies, and this increases a
> lot the number of libraries installed - so the advantages above for
> system-wide installation  becomes intractable quite quickly

This is somewhat true.  Sloppiness and increased libraries are bad.  But
there are checks on this sloppiness.  Distributions, for instance, are
quite active about porting software to use only a subset of versions.
So in the open source world, there's a large number of players
interested in keeping the number of versions down.  Using multiple
library versions will point people at where work needs to be done whereas
bundling hides it behind the monolithic bundle.

>  - bundling also supports a real user-case which cannot be solved by
> rpm/deb AFAIK: installation without administration privileges.

This is only sortof true.  You can install rpms into a local directory
without root privileges with a commandline switch.  But rpm/deb are
optimized for system administrators so the documentation on doing this
is not well done.  There can also be code issues with doing things this
way but those issues can affect bundled apps as well. And finally, since
rpm's primary use is installing systems, the toolset around it builds
systems.  So it's a lot easier to build a private root filesystem than
it is to cherrypick a single package.  It should be possible to create a
tool that merges a system rpmdb and a user's local rpmdb using the
existing API but I'm not aware of any applications built to do that yet.

>  - multi-version installation give very fragile systems. That's
> actually my number one complain in python: setuptools has caused me
> numerous headache, and I got many bug reports because you often do not
> know why one version was loaded instead of another one.
>
I won't argue for setuptools' implementation of multi-version.  It
sucks.  But multi-version can be done well.  Sonames in C libraries are
a simple system that does this better.

> So I am not so convinced multiple-version is better than bundling - I
> can see how it sometimes can be, but I am not sure those are that
> important in practice.
>
Bundling is always harmful.  Whether multiple versioning is any better
is certainly debatable :-)

>> No.  This is a social problem.  Good source control only helps if I am
>> tracking upstream's trunk so I'm aware of the direction that their
>> changes are headed.  But there's a wide range of reasons that
>> application developers that bundle libraries don't do that:
>>
>> 1) not enough time in a day.  I'm working full-time on making my
>> application better.  Plus I have to update all these bundled libraries
from time to time.

Re: [Python-Dev] Integrate BeautifulSoup into stdlib?

2009-03-25 Thread Toshio Kuratomi
Barry Warsaw wrote:

> Tools like setuptools, zc.buildout, etc. seem great for developers but
> not very good for distributions.  At last year's Pycon I think there was
> agreement from the Linux distributors that distutils, etc. just wasn't
> very useful for them.
> 
It's decent for modules but has limitations that we run up against
somewhat frequently.  It's a horror for applications.

-Toshio





Re: [Python-Dev] Integrate BeautifulSoup into stdlib?

2009-03-26 Thread Toshio Kuratomi
David Cournapeau wrote:
>> I won't argue for setuptools' implementation of multi-version.  It
>> sucks.  But multi-version can be done well.  Sonames in C libraries are
>> a simple system that does this better.
> 
> I would say simplistic instead of simple :) what works for C won't
> necessarily work for python - and even in C, library versioning is not
> used that often except for a few core libraries. Library versioning
> works in C because C model is very simple. It already breaks for C++.

I'm not sure what you're talking about here.  Library versioning is used
for practically every library on a Linux system.  My limited exposure to
the BSDs and Solaris was the same.  (If you're only talking Windows,
well; does windows even have Sonames?) I can name only one library that
isn't versioned in Fedora right now and may have heard of five total.
Perhaps you are thinking of library symbols?  If so, there are only a
few libraries that are using that.  But specifying backwards
compatibility via soname is well known and ubiquitous.

> More high-level languages like C# already have a more complicated
> scheme (GAC) - and my impression is that it did not work that well.
> The SxS for dll on recent windows to handle multiple version is a
> nightmare too in my (limited) experience.
> 
Looking at C#/Mono/.net for examples is perfectly horrid.  They've taken
inferior library versioning and bad development practices and added
technology (the GAC) as the solution.  If you want an idea of what
python should avoid at all costs, look to that arena for your answer.

* Note that setuptools' multi-version implementation shares some things
in common with the GAC.  For instance, using directories to separate
versions instead of filenames.  setuptools' implementation could be made
better by studying the GAC and taking things like caching of lookups
from it but I don't encourage this... I think the design itself is flawed.

-Toshio





Re: [Python-Dev] "setuptools has divided the Python community"

2009-03-26 Thread Toshio Kuratomi
Guido van Rossum wrote:
> On Wed, Mar 25, 2009 at 9:40 PM, Tarek Ziadé  wrote:
>> I think Distutils (and therefore Setuptools) should provide some APIs
>> to play with special files (like resources) and to mark them as being 
>> special,
>> no matter where they end up in the target system.
>>
>> So the code inside the package can use these files seamessly no matter
>> what the system is
>> and no matter where the files have been placed by the packager.
>>
>> This has been discussed already but not clearly defined.
> 
> Yes, this should be done. PEP 302 has some hooks but they are optional
> and not available for the default case. A simple wrapper to access a
> resource file relative to a given module or package would be easy to
> add. It should probably support four APIs:
> 
> - Open as a binary stream
> - Open as a text stream
> - Get contents as a binary string
> - Get contents as a text string
> 
Depending on the definition of a "resource" there's additional
information that could be needed.  For instance, if resource includes
message catalogs, then being able to get the base directory that the
catalogs reside in is needed for passing to gettext.
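
For example, gettext's API wants a real directory on the filesystem to search
for the catalogs (a quick sketch; the 'foo' domain and the locale directory
are just placeholders)::

    import gettext

    # gettext looks for <localedir>/<lang>/LC_MESSAGES/<domain>.mo on disk;
    # a file object or a data string is not enough here.
    t = gettext.translation('foo', localedir='/usr/share/locale', fallback=True)
    _ = t.gettext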

I'd be very happy if "resource" didn't encompass that type of thing,
though... then we could have a separate interface that addressed the
issues with them.  I'll be at PyCon (flying in late tonight, though, and
leaving Sunday) if Tarek and others want to get ahold of me to discuss
possible ways to address what's a resource, what's not, and what we
would need to handle the different cases.

-Toshio





Re: [Python-Dev] "setuptools has divided the Python community"

2009-03-27 Thread Toshio Kuratomi
Guido van Rossum wrote:
> 2009/3/26 Toshio Kuratomi :
>> Guido van Rossum wrote:
>>> On Wed, Mar 25, 2009 at 9:40 PM, Tarek Ziadé  wrote:
>>>> I think Distutils (and therefore Setuptools) should provide some APIs
>>>> to play with special files (like resources) and to mark them as being 
>>>> special,
>>>> no matter where they end up in the target system.
>>>>
>>>> So the code inside the package can use these files seamessly no matter
>>>> what the system is
>>>> and no matter where the files have been placed by the packager.
>>>>
>>>> This has been discussed already but not clearly defined.
>>> Yes, this should be done. PEP 302 has some hooks but they are optional
>>> and not available for the default case. A simple wrapper to access a
>>> resource file relative to a given module or package would be easy to
>>> add. It should probably support four APIs:
>>>
>>> - Open as a binary stream
>>> - Open as a text stream
>>> - Get contents as a binary string
>>> - Get contents as a text string
>>>
>> Depending on the definition of a "resource" there's additional
>> information that could be needed.  For instance, if resource includes
>> message catalogs, then being able to get the base directory that the
>> catalogs reside in is needed for passing to gettext.
> 
> Well the whole point is that for certain loaders (e.g. zip files)
> there *is* no base directory. If you do need directories you won't be
> able to use PEP-302 loaders, and you can just use
> os.path.dirname(.__file__).
> 
Yep.  Having no base directory isn't sufficient in all cases.

So one way to fix this is to define resources so that these cases fall
outside of that.

Current setuptools works around this by having API in pkg_resources that
unzips when it's necessary to use a filename rather than just retrieving
the data from the file.  So a second option is to have other API methods
that allow this.
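
To make that concrete, this is roughly what the existing pkg_resources calls
look like (a sketch; the 'foo' package and resource names are made up)::

    import pkg_resources

    # Reading data works even when 'foo' is installed inside a zipped egg...
    data = pkg_resources.resource_string('foo', 'templates/base.html')

    # ...but APIs that insist on a real path need resource_filename(), which
    # extracts the resource to a cache directory when the package is zipped.
    path = pkg_resources.resource_filename('foo', 'locale')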

-Toshio





Re: [Python-Dev] Rethinking intern() and its data structure

2009-04-10 Thread Toshio Kuratomi
Robert Collins wrote:

> Certainly, import time is part of it:
> robe...@lifeless-64:~$ python -m timeit -s 'import sys;  import
> bzrlib.errors' "del sys.modules['bzrlib.errors']; import bzrlib.errors"
> 10 loops, best of 3: 18.7 msec per loop
> 
> (errors.py is 3027 lines long with 347 exception classes).
> 
> We've also looked lower - python does a lot of stat operations search
> for imports and determining if the pyc is up to date; these appear to
> only really matter on cold-cache imports (but they matter a lot then);
> in hot-cache situations they are insignificant.
> 
Tarek, Georg, and I talked about a way to do both multi-version and
speedup of this exact problem with import in the future at pycon.  I had
to leave before the hackfest got started, though, so I don't know where
the idea went from there.  Tarek, did this idea progress any?

-Toshio





Re: [Python-Dev] #!/usr/bin/env python --> python3 where applicable

2009-04-20 Thread Toshio Kuratomi
Greg Ewing wrote:
> Steven Bethard wrote:
> 
>> That's an unfortunate decision. When the 2.X line stops being
>> maintained (after 2.7 maybe?) we're going to be stuck with the "3"
>> suffix forever for the "real" Python.
> 
> I don't see why we have to be stuck with it forever.
> When 2.x has faded into the sunset, we can start
> aliasing 'python' to 'python3' if we want, can't we?
> 
You could, but it's not my favorite idea.  Gets people used to the idea
of python == python2 and python3 == python3 as something they can count
on.  Then says, "Oops, that was just an implementation detail, we're
changing that now".  Much better to either make a clean break and call
the new language dialect python3 from now and forever or force people to
come up with solutions to whether /usr/bin/python == python2 or python3
right now while it's fresh and relevant in their minds.

-Toshio





Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-24 Thread Toshio Kuratomi
Glenn Linderman wrote:
> On approximately 4/24/2009 11:40 AM, came the following characters from
> And so my encoding (1) doesn't alter the data stream for any valid
> Windows file name, and where the naivest of users reside (2) doesn't
> alter the data stream for any Posix file name that was encoded as UTF-8
> sequences and doesn't contain ? characters in the file name [I perceive
> the use of ? in file names to be rare on Posix, because of experience,
> and because of the other problems caused by such use] (3) doesn't
> introduce data puns within applications that are correctly coded to know
> the encoding occurs.  The encoding technique in the PEP not only can
> produce data puns, thus not being reversible, it provides no reliable
> mechanism to know that this has occurred.
> 
Uhm... Not arguing with your goals, but '?' is unfortunately reasonably
easy to get into a filename.  For instance, I've had to download a lot
of scratch-built packages from our buildsystem recently.  Scratch builds
have URLs with query strings in them, so::

wget
'http://koji.fedoraproject.org/koji/getfile?taskID=1318059&name=monodevelop-debugger-gdb-2.0-1.1.i586.rpm'

Which results in the filename:
  getfile?taskID=1318059&name=monodevelop-debugger-gdb-2.0-1.1.i586.rpm

-Toshio





Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-24 Thread Toshio Kuratomi
Terry Reedy wrote:

> Is NUL \0 allowed in POSIX file names?  If not, could that be used as an
> escape char.  If it is not legal, then custom translated strings that
> escape in the wild would raise a red flag as soon as something else
> tried to use them.
> 
AFAIK NUL should be okay but I haven't read a specification to reach
that conclusion.  Is that a proposal?  Should I go find someone who has
read the relevant standards to find out?

-Toshio





Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Toshio Kuratomi
Zooko O'Whielacronx wrote:
> On Apr 28, 2009, at 6:46 AM, Hrvoje Niksic wrote:
>> If you switch to iso8859-15 only in the presence of undecodable UTF-8,
>> then you have the same round-trip problem as the PEP: both b'\xff' and
>> b'\xc3\xbf' will be converted to u'\u00ff' without a way to
>> unambiguously recover the original file name.
> 
> Why do you say that?  It seems to work as I expected here:
> 
> >>> '\xff'.decode('iso-8859-15')
> u'\xff'
> >>> '\xc3\xbf'.decode('iso-8859-15')
> u'\xc3\xbf'
>
> >>> '\xff'.decode('cp1252')
> u'\xff'
> >>> '\xc3\xbf'.decode('cp1252')
> u'\xc3\xbf'
> 

You're not showing that this is a fallback path.  What won't work is
first trying a local encoding (in the following example, utf-8) and then
if that doesn't work, trying a one-byte encoding like iso8859-15:

try:
file1 = '\xff'.decode('utf-8')
except UnicodeDecodeError:
file1 = '\xff'.decode('iso-8859-15')
print repr(file1)

try:
file2 = '\xc3\xbf'.decode('utf-8')
except UnicodeDecodeError:
file2 = '\xc3\xbf'.decode('iso-8859-15')
print repr(file2)


That prints:
  u'\xff'
  u'\xff'

The two encodings can map different bytes to the same unicode code point,
so you can't do this type of thing without recording what encoding was
used in the translation.

-Toshio





Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread Toshio Kuratomi
Martin v. Löwis wrote:
>> Since the serialization of the Unicode string is likely to use UTF-8,
>> and the string for  such a file will include half surrogates, the
>> application may raise an exception when encoding the names for a
>> configuration file. These encoding exceptions will be as rare as the
>> unusual names (which the careful I18N aware developer has probably
>> eradicated from his system), and thus will appear late.
> 
> There are trade-offs to any solution; if there was a solution without
> trade-offs, it would be implemented already.
> 
> The Python UTF-8 codec will happily encode half-surrogates; people argue
> that it is a bug that it does so, however, it would help in this
> specific case.

Can we use this encoding scheme for writing into files as well?  We've
turned the filename with undecodable bytes into a string with half
surrogates.  Putting that string into a file has to turn them into bytes
at some level.  Can we use the python-escape error handler to achieve
that somehow?
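
For reference, the round-trip in question looks like this with the error
handler that eventually shipped as 'surrogateescape' (called 'python-escape'
in the PEP draft); a minimal sketch, and the byte value is just an example::

    # The undecodable byte 0xff becomes the lone surrogate U+DCFF on decode...
    name = b'caf\xff'.decode('utf-8', 'surrogateescape')
    assert name == 'caf\udcff'

    # ...and encoding with the same handler restores the original bytes, so
    # the string can be written back out (e.g. to a file opened with
    # errors='surrogateescape') without losing information.
    assert name.encode('utf-8', 'surrogateescape') == b'caf\xff'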

-Toshio





Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-30 Thread Toshio Kuratomi
Thomas Breuel wrote:
> Not for me (I am using Python 2.6.2).
> 
> >>> f = open(chr(255), 'w')
> Traceback (most recent call last):
>  File "<stdin>", line 1, in <module>
> IOError: [Errno 22] invalid mode ('w') or filename: '\xff'
> >>>
> 
> 
> You can get the same error on Linux:
> 
> $ python
> Python 2.6.2 (release26-maint, Apr 19 2009, 01:56:41)
> [GCC 4.3.3] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> f=open(chr(255),'w')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> IOError: [Errno 22] invalid mode ('w') or filename: '\xff'

> 
> (Some file system drivers do not enforce valid utf8 yet, but I suspect
> they will in the future.)
> 
Do you suspect that from discussing the issue with kernel developers or
reading a thread on lkml?  If not, then your suspicion seems to be
pretty groundless.

The fact that VFAT enforces an encoding does not lend itself to your
argument for two reasons:

1) VFAT is not a Unix filesystem.  It's a filesystem that's compatible
with Windows/DOS.  If Windows and DOS have filesystem encodings, then it
makes sense for that driver to enforce that as well.  Filesystems
intended to be used natively on Linux/Unix do not necessarily make this
design decision.

2) The encoding is specified when mounting the filesystem.  This means
that you can still mix encodings in a number of ways.  If you mount with
an encoding that has full byte coverage, for instance, each user can put
filenames from different encodings on there.  If you mount with utf8 on
a system which uses euc-jp as the default encoding, you can have full
paths that contain a mix of utf-8 and euc-jp.  Etc.

-Toshio





Re: [Python-Dev] Distutils ML wrap-up: setup.cfg new format

2009-09-23 Thread Toshio Kuratomi
On 09/23/2009 10:00 AM, Tarek Ziadé wrote:

> But you are right about the need of making sure every package management
> project is involved. We should make sure that Enthought,
> which has its own package management system, is part of that consensus.
> 
> Also, I am more concerned of not having enough end users involved in
> that process.
> End users would be: any python developer that needs
> to package his code, or any os packager that needs to package a python
> distribution
> for his system. But those are hard to get involved.
> 
As one of the people who deals with packaging python modules for
distributions, I'm sorry for not having spent more time looking into
this.  I simply haven't had the time lately.  One helpful resource for
engaging linux distributions is:

##distros on irc.freenode.net and their mailing list:
http://lists.freedesktop.org/mailman/listinfo/distributions

Both are low traffic but people from the various linux distributions are
watching and responding on it even if they don't make much noise on
their own.  Just bear in mind that sometimes they will need to connect
you with the right person within their distro rather than being able to
give feedback directly.

-Toshio


Re: [Python-Dev] PEP 389: argparse - new command line parsing module

2009-10-02 Thread Toshio Kuratomi
On 09/29/2009 04:38 PM, Steven Bethard wrote:
> On Tue, Sep 29, 2009 at 3:04 PM, Glenn Linderman  
> wrote:
>> On approximately 9/29/2009 1:57 PM, came the following characters from the
>> keyboard of Steven Bethard:
>>> If you're not using argparse to write command line applications, then
>>> I don't feel bad if you have to do a tiny bit of extra work to take
>>> care of that use case. In this particular situation, all you have to
>>> do is subclass ArgumentParser and override exit() to do whatever you
>>> think it should do.
> [snip]
>>> There is only a single method in argparse that prints things,
>>> _print_message(). So if you want it to do something else, you can
>>> simply override it in a subclass. I can make that method public if
>>> this is a common use case.
>>
>> Documenting both of these options would forestall people from thinking it is
>> only useful for console applications.
> 
> I'm totally fine with people thinking it is only useful for console
> applications. That's what it's intended for. That said, if there are
> people out there who want to use it for other applications, I'm happy
> to make things easier for them if I know concretely what they want.
> 
Note: on Unix systems, --help should still print to the terminal, not
pop up a GUI text box with the help information.  So being able to
override the behaviour might be good, but it is more than a simple GUI
vs. console distinction.  Are we talking about anything other than --help
output (for the printing question)?

About exit(), I agree with others about wanting to catch the exception
myself and then choosing to exit from the code.  I'm not sure that it's
actually useful in practice, though...it might just feel cleaner but not
actually be that helpful.
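
Concretely, the kind of override being discussed would look roughly like this
(a sketch only; the exception and parser names here are made up)::

    import argparse

    class ParserExit(Exception):
        def __init__(self, status, message):
            Exception.__init__(self, message)
            self.status = status

    class NonExitingParser(argparse.ArgumentParser):
        # exit() is the one place argparse terminates the process; raising
        # here lets the embedding application catch it and decide what to do.
        def exit(self, status=0, message=None):
            raise ParserExit(status, message)

    parser = NonExitingParser(prog='frob')
    parser.add_argument('--level', type=int, default=0)
    try:
        opts = parser.parse_args(['--level', 'three'])
    except ParserExit as e:
        print('parser wanted to exit with status %s' % e.status)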

-Toshio





Re: [Python-Dev] Distutils and Distribute roadmap (and some words on Virtualenv, Pip)

2009-10-08 Thread Toshio Kuratomi
On Thu, Oct 08, 2009 at 01:27:57PM +0200, M.-A. Lemburg wrote:
> > Tarek Ziadé a écrit :
> >> But if PEP 376 and PEP 386 support are added in Python, we're not far
> >> from being able to provide multiple version support with
> >> the help of importlib.
> 
> Before putting much work into this: do you really think that having
> multiple versions of the same package in the same Python installation
> is a good idea ?
> 
I think it is a good idea.

> Examples:
> What if you have an application that uses two modules which each
> require two different versions of the same package ? Would that
> load the same package twice, causing globals used by these package
> to no work (e.g. isinstance(x, class_name) or global caches) ?
> 
That's not how it should work.  Look at other systems that allow for
installing multiple versions of a library -- for instance, loading dynamic
shared objects in C:
* You can install multiple versions of a library in parallel
* The dynamic loader will pick the version of the library that is
  appropriate from the list of available options (the program specifies the
  SONAME it needs -- library name plus API version.  The loader then
  chooses the most recent revision that matches that API version.)
* When one binary needs multiple API versions of the library, the
  application cannot start.

The last point addresses your concern -- depending on multiple, incompatible
versions of a library is prohibited.  The programmer of the application
needs to make the code run against a single version of the library.
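
Today the closest Python analogue to that loader behaviour is setuptools'
multi-version support, where the program states the API range it needs before
importing (a rough sketch; the 'bar' project name is made up)::

    import pkg_resources

    # Activate the newest installed 'bar' that satisfies this API range; if
    # that conflicts with a requirement that is already active, this fails up
    # front -- much like the loader refusing to start the binary.
    pkg_resources.require('bar>=1.0,<2.0')
    import bar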

> This sounds a lot like DLL- or RPM-hell to me.
> 
RPM-hell (I'm not sure if DLL hell is the same; I have the vague impression
that it stems from a lack of version specification rather than too much, but
I don't know for sure) is similar, but it takes place on the end-user's
system.  Resolving the conflict should take place on the programmer's system
instead.
End-users are not in a position to fix things like RPM-hell.  Programmers
are.

Example RPM-hell:
Application Foo requires libbar-1.x
Application Baz requires libbar-2.x

The user may either have Foo or Baz installed on their system with the
appropriate libbar but not both.  They depend on the packagers and
developers of Foo and Bar to do one of the following to resolve the
situation:

* Port Foo and Baz to use the same version of libbar.
* Package libbar in such a way that libbar-1 and libbar-2 are parallel
  installable on the system.  Then they can install two separate packages,
  libbar1-1.0 and libbar2-2.0.

Example of similar Distutils multiple version problem:
The programmer creates an application Foo that depends on python-baz-1.x.  He
has recently started work on a file that imports python-bar-1.0.  python-bar
depends on python-baz-2.x.  The first time he tries to run his new code,
python gives him an error message that it is impossible to satisfy the
version requirements for python-baz.  Depending on how the versions are
specified, the error message could be very specific and helpful:

  Impossible version requirements:
    python-bar Requires: python-baz >= 2.0, < 3.0
    foo.py Requires: python-baz >= 1.0, < 2.0

The programmer can then discard their new code, port foo.py to
python-baz-2.x, or port python-bar to python-baz-1.x and submit a patch to
the upstream of that module.  Note two things about this scenario:

1) The programmer is the person who is responsible for creating the conflict
and for resolving it.  They are the proper authority for deciding whether to
port to python-baz-2.x or to stop using python-bar.  The end-user, who is not
responsible, is not impacted by this at all.
2) The programmer would have had to deal with this issue whether we allow
multiple versions to be installed or not.  With multiple version support we
may be able to get them better error messages (depending on how the
dependency information is formatted and how completely it was specified in
the app and modules).

> I think it's much better to keep things simple and under user
> control, e.g. by encouraging use of virtualenv-like setups
> and perhaps adding better native support for these to Python.
> 
> If the user finds that there's a version conflict this can
> then be resolved during environment setup instead of hoping
> for the best and waiting for application failures at run-time.
> 
For the class of user that is actually a developer, it might be somewhat
true that version conflicts should be resolved by them.  But for the class
of user that is an end-user, version conflicts are a totally foreign
concept.  They should be dealt with by the person who is coding the
application for them.

Also note, the ability to have multiple versions makes things easier for
system packagers and provides an easy alternative to a virtualenv for
end-users.

* System packagers: virtualenv does not provide a method suitable for system
  packagers.  The nearest adaptation would be for the system packager to
  install python packages into their own hierarchy not in the PYTHONPATH.
  Then they w

Re: [Python-Dev] Distutils and Distribute roadmap (and some words on Virtualenv, Pip)

2009-10-08 Thread Toshio Kuratomi
On Thu, Oct 08, 2009 at 02:52:24PM +0200, Simon Cross wrote:
>  On Thu, Oct 8, 2009 at 10:31 AM, Tarek Ziadé  wrote:
> > = Virtualenv and the multiple version support in Distribute =
> ...
> > My opinion is that this tool exists only because Python doesn't
> > support the installation of multiple versions for the same
> > distributions.
> 
Let's actually look at these reasons:

> This is not at all how I use virtualenv. For me virtualenv is a
> sandbox so that I don't have to become root whenever I need to install
> a Python package for testing purposes

This is needing to install multiple versions and use the newly installed
version for testing.

> and to allow me to hop between
> sets of installed Python packages while developing on multiple Python
> projects.

This is the ability to install multiple versions and specify different
versions for different projects you're working on.

> I also use it as a sandbox for build bots so that multiple
> bots on the same machine can each build their own projects with just
> the known dependencies installed.
> 
This is the only use in the list that is virtualenv specific.  The rest are
cases of needing to install multiple versions on the system.

-Toshio




Re: [Python-Dev] Distutils and Distribute roadmap (and some words on Virtualenv, Pip)

2009-10-08 Thread Toshio Kuratomi
On Thu, Oct 08, 2009 at 06:56:19PM +0200, kiorky wrote:
> 
> 
> Toshio Kuratomi a écrit :
> > 
> > Also note, the ability to have multiple versions makes things easier for
> > system packagers and provides an easy alternative to a virtualenv for
> > end-users.
> > 
> > * System packagers: virtualenv does not provide a method suitable for system
> 
> Yes, there is no doubt virtualenv is useless for system packagers but:
> 
> * System and applications deployment have not to be tied.
> It s up to the user to install things system wide or to use locally isolation
> technics. Virtualenv is one of those.
> As a conclusion, there are not very much problem for system packagers as if
> their users have specific needs, they will do something Outside the system.
> If not, they use their global python with packages installed in that global 
> one.
> 
This misses the point.  If there are two pieces of software to be deployed
via system packages and they use two different versions of a module, it's
currently not very easy to do this.  I do it with setuptools currently, even
with all its warts.  Having a way to do this from within the stdlib that
tied directly into the import mechanism would make for a much cleaner
situation.

In other words, the suggestion that there is no need for a method to install
multiple versions of a module because virtualenv is a better solution is
bogus.  virtualenv is a better solution for creating isolated environments.
It is not a solution for all of the cases that being able to install
multiple versions of a library would be.

-Toshio




Re: [Python-Dev] Distutils and Distribute roadmap (and some words on Virtualenv, Pip)

2009-10-08 Thread Toshio Kuratomi
On Thu, Oct 08, 2009 at 04:28:52PM +, Antoine Pitrou wrote:
> Toshio Kuratomi  gmail.com> writes:
> > 
> > This is needing to install multiple versions and use the newly installed
> > version for testing.
> [...]
> 
> What you're missing is that having separate environments has a virtue of
> cleanliness, understandability and robustness that a multiple-versioned 
> solution
> doesn't have. While the technical merits are debatable I'm sure some people
> definitely prefer to manage a virtualenv-based version.
> 
I'm not missing it.  I'm only saying that the precise requirement that is
being stated is not sandboxing (that was listed later).  It's being able to
use a newly installed library for testing.  The essential element of that is
being able to install a new version of the library and use that instead of
the system-installed version.  Sandboxing may be how someone wants to do this,
but it isn't essential to be able to do this.

-Toshio




Re: [Python-Dev] Distutils and Distribute roadmap (and some words on Virtualenv, Pip)

2009-10-08 Thread Toshio Kuratomi
On Thu, Oct 08, 2009 at 05:30:00PM +0100, Michael Foord wrote:
> Toshio Kuratomi wrote:
>> On Thu, Oct 08, 2009 at 02:52:24PM +0200, Simon Cross wrote:
>>   
>>>  On Thu, Oct 8, 2009 at 10:31 AM, Tarek Ziadé  wrote:
>>> 
>>>> = Virtualenv and the multiple version support in Distribute =
>>>>   
>>> ...
>>> 
>>>> My opinion is that this tool exists only because Python doesn't
>>>> support the installation of multiple versions for the same
>>>> distributions.
>>>>   
>> Let's actually look at these reasons:
>>
>>   
>>> This is not at all how I use virtualenv. For me virtualenv is a
>>> sandbox so that I don't have to become root whenever I need to install
>>> a Python package for testing purposes
>>> 
>>
>> This is needing to install multiple versions and use the newly installed
>> version for testing.
>>
>>   
>
> Not really - it is wanting to install a single version of a library that 
> you don't want to install into your 'main' (whether that be user or  
> system) Python install. It is sandboxing and orthogonal to multiple  
> versions.
>
What I'm replying to is specifically the need to "become root whenever I
need to install a Python package for testing purposes".  That doesn't require
sandboxing at all.  Can you use sandboxing to do this?  Yes.  But that is
separate from being able to install a non-system version of a library and be
able to test it.

My quoting of that phrase could have been better -- I missed the reference
to sandboxing since it is in a separate clause of the sentence from what I
was responding to.

>>>
>>> and to allow me to hop between
>>> sets of installed Python packages while developing on multiple Python
>>> projects.
>>> 
>>
>> This is the ability to install multiple versions and specify different
>> versions for different projects you're working on.
>>   
>
> No, it is working on multiple projects that have *different*  
> dependencies and being able to work in an environment that *only* has  
> direct dependencies installed - again sandboxed from your main Python  
> environment.
>
No.  What is written here is: "allow me to hop between sets of installed Python
packages while developing on multiple Python projects."

There's nothing about *only* having direct dependencies installed.  That's a
separate requirement from what was written.

> The fact that virtualenv allows you to have multiple different versions  
> of the same library installed in the different environments you create  
> is completely separate from the motivation that causes many people to  
> use it.
>
Precisely!  We see 100% eye-to-eye here.  My reply is just trying to say
that the ideas of
* testing a locally installed, conflicting version of a library
* running multiple projects with different, conflicting version requirements

are completely satisfiable without sandboxing.  Virtualenv is a sandboxing
tool.  It can be used to perform these tasks.  But it isn't necessary.
Having sandboxing is an additional feature on top of the base requirements
to perform the task.

> What virtualenv *doesn't* do (I believe) is allow you to have multiple  
> versions of libraries installed within a single environment and switch  
> between them (at least it doesn't offer anything beyond what setuptools  
> or pip provides).
>
Yep.  Which makes virtualenv unsuitable for certain other problem spaces
where sandboxing is inappropriate.

-Toshio




Re: [Python-Dev] supporting multiple versions of one package in one environment is insane

2009-10-09 Thread Toshio Kuratomi
On Fri, Oct 09, 2009 at 04:51:00PM +0100, Chris Withers wrote:
> Tarek Ziadé wrote:
>> Virtualenv allows you to create an isolated environment to install
>> some distribution without polluting the
>> main site-packages, a bit like a user site-packages.
>
> ...as does buildout, and these are the right type of solution to this  
> problem.
>
where type of problem == sandboxed environment, sure.  How do you solve the
problem for system packagers?

-Toshio




Re: [Python-Dev] supporting multiple versions of one package in one environment is insane

2009-10-09 Thread Toshio Kuratomi
On Fri, Oct 09, 2009 at 05:29:28PM +0100, Chris Withers wrote:
> Toshio Kuratomi wrote:
>> On Fri, Oct 09, 2009 at 04:51:00PM +0100, Chris Withers wrote:
>>> Tarek Ziadé wrote:
>>>> Virtualenv allows you to create an isolated environment to install
>>>> some distribution without polluting the
>>>> main site-packages, a bit like a user site-packages.
>>> ...as does buildout, and these are the right type of solution to 
>>> this  problem.
>>>
>> where type of problem == sandboxed environment, sure.  How do you solve the
>> problem for system packagers?
>
> What's to stop a system packager either just running the buildout on  
> install, or running the buildout at package build time and then just  
> dropping the resulting environment wherever they want to install  
> applications? Such a package would only be dependent on the right python 
> version at runtime...
>
If buildout creates sandboxed environments like virtualenv then everything
here applies:

https://fedoraproject.org/wiki/Packaging:No_Bundled_Libraries

You can also listen/watch the talk I gave at PyCon which is linked from that
page:
http://pycon.blip.tv/file/2072580/

If it doesn't create sandboxed environments, then you'll need to give me
about a paragraph explaining what it does do.

-Toshio




Re: [Python-Dev] buildtime vs runtime in Distutils

2009-11-16 Thread Toshio Kuratomi
On Sun, Nov 15, 2009 at 02:31:45PM +0100, Georg Brandl wrote:
> Antoine Pitrou schrieb:
> > Tarek Ziadé  gmail.com> writes:
> >> 
> >> This cannot work on all platforms, when our Makefile is not shipped
> >> with python but python-devel. (like Fedora)
> > 
> > This practice is stupid anyway, because it means you have to install
> > python-devel even to install pure Python packages with 
> > setuptools/distribute.
> > Just ask Fedora, Mandriva and friends to change their packaging practice
> > (Mandriva already has a bug open for that by the way).
> 
> +1.  They are the ones splitting what "make install" installs into several
> packages, so they are the ones who have to fix the resulting dependency
> problems.
> 
I agree with this, however, my point on the bug was more akin to this:

Tres Seaver wrote:
> Parsing the Makefile at runtime seems like an insane choice anyway to
> me:  +1 for your new module having constants generated at ./configure
> time.

Makefiles and C header files are not intended as general purpose data
formats.  Using them as such has a variety of disadvantages:

* If someone else wants to get at the data, they have to go through the API
  in distutils.  Any data that's not exposed by the API is unavailable.
* Since distutils doesn't implement a full parser for the make and C syntax
  it is possible to break distutils when making legitimate changes to those
  build files.

These are the reasons I opened the bug to get that information into a real
data file rather than parsing the Makefile and header files.

I'll also mention two further things:

The reason that python-devel was split off was to make it more useful for
livecds, olpc, embedded systems, and other places where disk space is at a
premium.  Being able to combine an operating system that is used by people
beyond your immediate community is great for finding and fixing bugs before
your users run into them.  Being able to program in a high level language on
these platforms has benefits that I'm sure everyone here can appreciate.

I've brought the issue of Makefile and pyconfig.h being needed for distutils
to the attention of every new Fedora python maintainer since the package
split was made.  The current maintainer, David Malcolm, agrees that
distutils.sysconfig needs to be able to use this data and he has moved the
Makefile and header files into the main python package.  This doesn't change
the problems with using a Makefile and C header files as a data format for
python.

-Toshio




Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-21 Thread Toshio Kuratomi
On Thu, Jan 21, 2010 at 12:25:59PM +, Antoine Pitrou wrote:
> > We seek guidance from the community on
> > an acceptable level of increased memory usage.
> 
> I think a 10-20% increase would be acceptable.
> 
I'm just a user of the core interpreter but the bottleneck in using python
in my environment has almost always been memory usage and almost never speed
of execution.  Note, though, that we run multiple python apps at a time so
anything that's shared between interpreters is less costly than anything that
must be unique.

Still, any growth in memory usage is painful since that's already the
limiting resource.

> > 32-bit; gcc 4.0.3
> > 
> > +-------------+---------------+---------------+----------------------+
> > | Binary size | CPython 2.6.4 | CPython 3.1.1 | Unladen Swallow r988 |
> > +=============+===============+===============+======================+
> > | Release     | 3.8M          | 4.0M          | 74M                  |
> > +-------------+---------------+---------------+----------------------+
> 
> This is positively humongous. Is there any way to shrink these numbers 
> dramatically (I'm talking about the release builds)? Large executables or 
> libraries may make people anxious about the interpreter's memory 
> efficiency; and they will be a nuisance in many situations (think making 
> standalone app bundles using py2exe or py2app).
> 
Binary size has an impact on linux distributions (and people who make
embedded systems from those distributions).  This kind of growth would push
the interpreter out of the livecds that we make and prevent shipping other
useful software on our DVDs and other media.

Somebody suggested building two interpreters in another part of this thread
(jit-python and "normal" python).  That has both pros and cons in situations
like this:

1) For programs where we want to use the jit-python binary if available,
we'd need to implement some method of detecting its presence and using it if
present (a rough sketch of that follows after this list).
2) We'd need to specify that both jit-python and python are able to run this
program successfully.
3) If the compiled modules (byte code or C compiled) are incompatible
between the two interpreters that will make us cry.  (we're already shipping
two versions of things for python2.x and python3.x, also shipping a version
for jit-python would be... suboptimal.)
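
The detection in point 1 could be as simple as the following sketch (both
interpreter paths here are hypothetical)::

    import os
    import sys

    JIT_PYTHON = '/usr/bin/jit-python'

    # Re-exec the current script under the JIT-enabled interpreter when it is
    # installed; otherwise keep running under the regular interpreter.
    if os.path.exists(JIT_PYTHON) and os.path.realpath(sys.executable) != JIT_PYTHON:
        os.execv(JIT_PYTHON, [JIT_PYTHON] + sys.argv)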

-Toshio




Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-21 Thread Toshio Kuratomi
On Thu, Jan 21, 2010 at 09:32:23AM -0800, Collin Winter wrote:
> Hi Dirkjan,
> 
> On Wed, Jan 20, 2010 at 10:55 PM, Dirkjan Ochtman  wrote:
> > For some apps (like Mercurial, which I happen to sometimes hack on),
> > increased startup time really sucks. We already have our demandimport
> > code (I believe bzr has something similar) to try and delay imports,
> > to prevent us spending time on imports we don't need. Maybe it would
> > be possible to do something like that in u-s? It could possibly also
> > keep track of the thorny issues, like imports where there's an except
> > ImportError that can do fallbacks.
> 
> I added startup benchmarks for Mercurial and Bazaar yesterday
> (http://code.google.com/p/unladen-swallow/source/detail?r=1019) so we
> can use them as more macro-ish benchmarks, rather than merely starting
> the CPython binary over and over again. If you have ideas for better
> Mercurial/Bazaar startup scenarios, I'd love to hear them. The new
> hg_startup and bzr_startup benchmarks should give us some more data
> points for measuring improvements in startup time.
> 
This is great!

> One idea we had for improving startup time for apps like Mercurial was
> to allow the creation of hermetic Python "binaries", with all
> necessary modules preloaded. This would be something like Smalltalk
> images. We haven't yet really fleshed out this idea, though.
> 
Coming from a background building packages for a Linux distribution I'd like
to know what you're designing here.  There are many things that you could mean,
most of them having problems.  An image with all of the modules contained in
it is like a statically linked binary in the C world.  This can bloat our
livemedia installs, make memory usage go up if the modules were
sharable before and no longer are, cause pain in hunting down affected
packages and getting all of the rebuilt packages to users when a security
issue is found in a library, and, for some strange reason, encourages
application authors to bundle specific versions of libraries with their
apps.

So if you're going to look into this, please be careful and try to minimize
the tradeoffs that can occur.

Thanks,
-Toshio




Re: [Python-Dev] Add UTC to 2.7 (PyCon sprint idea)

2010-02-16 Thread Toshio Kuratomi
On Wed, Feb 17, 2010 at 09:15:25AM +0700, Stuart Bishop wrote:
> 
> The Debian, Ubuntu and I think Redhat packages all use the system
> zoneinfo database - there are hooks in there to support package
> maintainers that want to do this.

Where RedHat == Fedora && EPEL packages for RHEL/Centos 5, yes :-)

-Toshio




Re: [Python-Dev] what to do if you don't want your module in Debian

2010-04-26 Thread Toshio Kuratomi
On Mon, Apr 26, 2010 at 05:46:55PM -0400, Barry Warsaw wrote:
> On Apr 26, 2010, at 09:39 PM, Tarek Ziadé wrote:
> 
> >You should be permissive on that one. Until we know how to describe resource
> >files properly, __file__ is what developer use when they need their projects
> >to be portable..
> 
> Until then, isn't pkg_resources the best practice for this?  (I'm pretty sure
> we've talked about this before.)
> 
I would have to say no to this.  Best practice from the Linux packager POV
would be something like this:

foo/
foo/__init__.py
foo/paths.py::

  # Global paths where resources are installed
  HELPDIR='/usr/share/foo/help'
  TEMPLATEDIR='/usr/share/foo/templates'
  CACHEDIR='/var/cache/foo'
  DBDIR='/var/lib/foo/db'
  PRIVATEMODDIR='/usr/share/foo/foolib'
  PLUGINDIR='/usr/lib/foo/plugins'
  LOCALEDIR='/usr/share/locale'

foo/do_things.py::
  import foo.paths
  import os.path
  # connect to the db
  db = foo_connect(os.path.join(foo.paths.DBDIR, 'foodb.sqlite'))

Using this strategy, you, the developer, can set the default paths to
whatever makes the most sense for your target but the packager can go
through and patch new locations in a single file that are used throughout
your program.

Note that you can use pkg_resources.resource_filename() in the above
scenario for static resources (but not stateful ones like database files,
caches, and the like).  Unfortunately, pkg_resources.resource_stream() can't
be adapted to the same purpose.

A better version of resource_stream(), based on the work that Tarek's been
doing in the backend, would be possible: one capable of handling settable
paths, since it would pull the pathname determination down to the level
behind the resource_stream() call.

-Toshio




Re: [Python-Dev] what to do if you don't want your module in Debian

2010-04-27 Thread Toshio Kuratomi
On Tue, Apr 27, 2010 at 09:41:02AM +0200, Tarek Ziadé wrote:
> On Tue, Apr 27, 2010 at 1:24 AM, Toshio Kuratomi  wrote:
> > On Mon, Apr 26, 2010 at 05:46:55PM -0400, Barry Warsaw wrote:
> >> On Apr 26, 2010, at 09:39 PM, Tarek Ziadé wrote:
> >>
> >> >You should be permissive on that one. Until we know how to describe 
> >> >resource
> >> >files properly, __file__ is what developer use when they need their 
> >> >projects
> >> >to be portable..
> >>
> >> Until then, isn't pkg_resources the best practice for this?  (I'm pretty 
> >> sure
> >> we've talked about this before.)
> >>
> > I would have to say no to this.  Best practice from the Linux packager POV
> > would be something like this
> >
> > foo/
> > foo/__init__.py
> > foo/paths.py::
> >
> >  # Global paths where resources are installed
> >  HELPDIR='/usr/share/foo/help'
> >  TEMPLATEDIR='/usr/share/foo/templates'
> >  CACHEDIR='/var/cache/foo'
> >  DBDIR='/var/lib/foo/db'
> >  PRIVATEMODDIR='/usr/share/foo/foolib'
> >  PLUGINDIR='/usr/lib/foo/plugins'
> >  LOCALEDIR='/usr/share/locale'
> >
> > foo/do_things.py::
> >  import foo.paths
> >  import os.path
> >  # connect to the db
> >  db = foo_connect(os.path.join(foo.paths.DBDIR, 'foodb.sqlite'))
> >
> > Using this strategy, you, the developer, can set the default paths to
> > whatever makes the most sense for your target but the packager can go
> > through and patch new locations in a single file that are used throughout
> > your program.
> >
> 
> You are making the assumption that the developers know what are the
> global paths on each platform.
>
No, I'm not.  The developer needs to establish sane categories, but doesn't
need to worry about the exact paths.  For instance, this would be perfectly
fine:

foo/paths.py::
  import os
  import pkg_resources

  HELPDIR=os.path.join(os.path.dirname(__file__), 'help')
  TEMPLATEDIR=pkg_resources.resource_filename('foo', 'templates')
  CACHEDIR=os.path.join(os.environ.get('HOME', '/tmp'), 'foocache')

Then the individual platform packagers can patch the single file, paths.py,
according to the needs of their platform.

> I don't think people would do that unless we
> provide these paths already, and that's basically the goal of the next PEP.
> 
s/paths/categories/ and I'll agree.   As I said, the PEP does a lot of
things right in this area.  We're able to push decisions about filesystem
paths out to a higher level using the PEP whereas the current state of the
art has us needing to figure it all out as individual developers :-(

[...]
> 
> Until then, the only approach a developer has to make sure he can access to 
> his
> resource files, is to have them alongside the code.
> 
I don't think so -- but this scheme certainly allows that.  I think that
many developers who are targeting Linux will find it more natural to specify
FHS compliant paths for their files.  Someone who is developing an app to be
portable will likely find that placing files alongside code is more natural
(although even this isn't truly portable -- CACHEDIR and other stateful
files will break under the assumption that you can write to a file/directory
alongside the module).

And like I say, this is just about the best workaround available now.
Implementation of the PEP makes this area much better.

-Toshio




Re: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)

2010-06-21 Thread Toshio Kuratomi
On Mon, Jun 21, 2010 at 09:57:30AM -0400, Barry Warsaw wrote:
> On Jun 21, 2010, at 09:37 AM, Arc Riley wrote:
> 
> >Also, under where it mentions that most OS's do not include Python 3, it
> >should be noted which have good support for it.  Gentoo (for example) has
> >excellent support for Python 3, automatically installing Python packages
> >which have Py3 support for both Py2 and Py3, and the python-based Portage
> >package system runs cleanly on Py2.6, Py3.1 and Py3.2.
> 
> We're trying to get there for Ubuntu (driven also by Debian).  We have Python
> 3.1.2 in main for Lucid, though we will probably not get 3.2 into Maverick
> (the October 2010 release).  We're currently concentrating on Python 2.7 as a
> supported version because it'll be released by then, while 3.2 will still be
> in beta.
> 
> If you want to help, or have complaints, kudos, suggestions, etc. for Python
> support on Ubuntu, you can contact me off-list.
> 
Fedora 14 is about the same.  A nice-to-have thing that goes along
with these would be a table that lists packages ported to python3 and which
distributions have the python3 version of the package.

Once most of the important third party packages are ported to python3 and in
the distributions, this table will likely become out-dated and probably
should be reaped but right now it's a very useful thing to see.

-Toshio




Re: [Python-Dev] email package status in 3.X

2010-06-21 Thread Toshio Kuratomi
On Mon, Jun 21, 2010 at 11:43:07AM -0400, Barry Warsaw wrote:
> On Jun 21, 2010, at 10:20 PM, Nick Coghlan wrote:
> 
> >Something that may make sense to ease the porting process is for some
> >of these "on the boundary" I/O related string manipulation functions
> >(such as os.path.join) to grow "encoding" keyword-only arguments. The
> >recommended approach would be to provide all strings, but bytes could
> >also be accepted if an encoding was specified. (If you want to mix
> >encodings - tough, do the decoding yourself).
> 
> This is probably a stupid idea, and if so I'll plead Monday morning mindfuzz
> for it.
> 
> Would it make sense to have "encoding-carrying" bytes and str types?
> Basically, I'm thinking of types (maybe even the current ones) that carry
> around a .encoding attribute so that they can be automatically encoded and
> decoded where necessary.  This at least would simplify APIs that need to do
> the conversion.
> 
> By default, the .encoding attribute would be some marker to indicated "I have
> no idea, do it explicitly" and if you combine ebytes or estrs that have
> incompatible encodings, you'd either throw an exception or reset the .encoding
> to IAmConfuzzled.  But say you had an email header like:
> 
> =?euc-jp?b?pc+l7aG8pe+hvKXrpcmhqg==?=
> 
> And code like the following (made less crappy):
> 
> -snip snip-
> class ebytes(bytes):
> encoding = 'ascii'
> 
> def __str__(self):
> s = estr(self.decode(self.encoding))
> s.encoding = self.encoding
> return s
> 
> 
> class estr(str):
> encoding = 'ascii'
> 
> 
> s = str(b'\xa5\xcf\xa5\xed\xa1\xbc\xa5\xef\xa1\xbc\xa5\xeb\xa5\xc9\xa1\xaa', 
> 'euc-jp')
> b = bytes(s, 'euc-jp')
> 
> eb = ebytes(b)
> eb.encoding = 'euc-jp'
> es = str(eb)
> print(repr(eb), es, es.encoding)
> -snip snip-
> 
> Running this you get:
> 
> b'\xa5\xcf\xa5\xed\xa1\xbc\xa5\xef\xa1\xbc\xa5\xeb\xa5\xc9\xa1\xaa' ハローワールド! 
> euc-jp
> 
> Would it be feasible?  Dunno.  Would it help ease the bytes/str confusion?
> Dunno.  But I think it would help make APIs easier to design and use because
> it would cut down on the encoding-keyword function signature infection.
> 
I like the idea of having encoding information carried with the data.
I don't think that an ebytes type that can *optionally* have an encoding
attribute makes the situation less confusing, though.  To me the biggest
problem with python-2.x's unicode/bytes handling was not that it threw
exceptions but that it didn't always throw exceptions.  You might test this
in python2::

    t = u'cafe'
    function(t)

And say, ah my code works.  Then a user gives it this::

    t = u'café'
    function(t)

And get a unicode error because the function only works with unicode in the
ascii range.

ebytes seems to have the same pitfall where the code path exercised by your
tests could work with::

    eb = ebytes(b)
    eb.encoding = 'euc-jp'
    function(eb)

but the user exercises a code path that does this and fails::

    eb = ebytes(b)
    function(eb)

What do you think of making the encoding attribute a mandatory part of
creating an ebyte object?  (ex: ``eb = ebytes(b, 'euc-jp')``).
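
For what it's worth, here's a minimal sketch (python3 syntax) of what
a mandatory-encoding ebytes might look like; the names and the
validate-at-construction strategy are only my illustration, not a settled
design::

    class ebytes(bytes):
        """Bytes that carry the encoding they are known to be valid in."""

        def __new__(cls, data, encoding):
            # Fail here, at construction time, if the bytes and the claimed
            # encoding don't actually match.
            bytes(data).decode(encoding)
            self = super().__new__(cls, data)
            self.encoding = encoding
            return self

        def __str__(self):
            # Can't fail later, because we already validated above.
            return self.decode(self.encoding)

    eb = ebytes(b'\xa5\xcf\xa5\xed\xa1\xbc', 'euc-jp')
    print(str(eb), eb.encoding)   # ハロー euc-jp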

-Toshio




Re: [Python-Dev] bytes / unicode

2010-06-21 Thread Toshio Kuratomi
On Tue, Jun 22, 2010 at 01:08:53AM +0900, Stephen J. Turnbull wrote:
> Lennart Regebro writes:
> 
>  > 2010/6/21 Stephen J. Turnbull :
>  > > IMO, the UI is right.  "Something" like the above "ought" to work.
>  > 
>  > Right. That said, many times when you want to do urlparse etc they
>  > might be binary, and you might want binary. So maybe the methods
>  > should work with both?
> 
> First, a caveat: I'm a Unicode/encodings person, not an experienced
> web programmer.  My opinions on whether this would work well in
> practice should be taken with a grain of salt.
> 
> Speaking for myself, I live in a country where the natives have
> saddled themselves with no less than 4 encodings in common use, and I
> would never want "binary" since none of them would display as anything
> useful in a traceback.  Wherever possible, I decode "blobs" into
> structured objects, I do it as soon as possible, and if for efficiency
> reasons I want to do this lazily, I store the blob in a separate
> .raw_object attribute.  If they're textual, I decode them to text.  I
> can't see an efficiency argument for decoding URIs lazily in most
> applications.
> 
> In the case of structured text like URIs, I would create a separate
> class for handling them with string-like operations.  Internally, all
> text would be raw Unicode (ie, not url-encoded); repr(uri) would use
> some kind of readable quoting convention (not url-encoding) to
> disambiguate random reserved characters from separators, while
> str(uri) would produce an url-encoded string.  Converting to and from
> wire format is just .encode and .decode, then, and in this country you
> need to be flexible about which encoding you use.
> 
> Agreed, this stuff is really annoying.  But I think that just comes
> with the territory.  PJE reports that folks don't like doing encoding
> and decoding all over the place.  I understand that, but if they're
> doing a lot of that, I have to wonder why.  Why not define the one
> line function and get on with life?
> 
> The thing is, where I live, it's not going to be a one line function.
> I'm going to be dealing with URLs that are url-encoded representations
> of UTF-8, Shift-JIS, EUC-JP, and occasionally RFC 2047!  So I need an
> API that explicitly encodes and decodes.  And I need an API that
> presents Japanese as Japanese rather than as line noise.
> 
> Eg, PJE writes
> 
> Ugh.  I meant: 
> 
> newurl = urljoin(str(base, 'latin-1'), 'subdir').encode('latin-1')
> 
> Which just goes to the point of how ridiculous it is to have to  
> convert things to strings and back again to use APIs that ought to  
> just handle bytes properly in the first place. 
> 
> But if you need that "everywhere", what's so hard about
> 
> def urljoin_wrapper (base, subdir):
> return urljoin(str(base, 'latin-1'), subdir).encode('latin-1')
> 
> Now, note how that pattern fails as soon as you want to use
> non-ISO-8859-1 languages for subdir names.  In Python 3, the code
> above is just plain buggy, IMHO.  The original author probably will
> never need the generalization.  But her name will be cursed unto the
> nth generation by people who use her code on a different continent.
> 
> The net result is that bytes are *not* a programmer- or user-friendly
> way to do this, except for the minority of the world for whom Latin-1
> is a good approximation to their daily-use unibyte encoding (eg, it's
> probably usable for debugging in Dansk, but you won't win any
> popularity contests in Tel Aviv or Shanghai).
> 
One comment here -- you can also have URIs that aren't decodable into their
true textual meaning using a single encoding.

Apache will happily serve out URIs that have utf-8, shift-jis, and euc-jp
components inside their paths, but the textual representation that was
intended will be garbled (or be represented by escaped byte sequences).  For
that matter, Apache will serve requests that have no true textual
representation at all, since it works at the byte level rather than the
character level.

So a complete solution really should allow the programmer to pass in URIs as
bytes when they know that they need to.

-Toshio




Re: [Python-Dev] email package status in 3.X

2010-06-21 Thread Toshio Kuratomi
On Mon, Jun 21, 2010 at 01:24:10PM -0400, P.J. Eby wrote:
> At 12:34 PM 6/21/2010 -0400, Toshio Kuratomi wrote:
> >What do you think of making the encoding attribute a mandatory part of
> >creating an ebyte object?  (ex: ``eb = ebytes(b, 'euc-jp')``).
> 
> As long as the coercion rules force str+ebytes (or str % ebytes,
> ebytes % str, etc.) to result in another ebytes (and fail if the str
> can't be encoded in the ebytes' encoding), I'm personally fine with
> it, although I really like the idea of tacking the encoding to bytes
> objects in the first place.
> 
I wouldn't like this.  It brings us back to the python2 problem where
sometimes you pass an ebyte into a function and it works and other times you
pass an ebyte into the function and it issues a traceback.  The coercion
must end up with a str and no traceback (this assumes that we've checked
that the ebyte and the encoding "match" when we create the ebyte).

If you want bytes out the other end, you should either have a different
function or explicitly transform the output from str to bytes.

So, what's the advantage of using ebytes instead of bytes?

* It keeps together the text and encoding information when you're taking
  bytes in and want to give bytes back under the same encoding.
* It takes some of the boilerplate that people are supposed to do (checking
  that bytes are legal in a specific encoding) and writes it into the
  initialization of the object.  That forces you to think about the issue
  at two points in the code:  when converting into ebytes and when
  converting out to bytes.  For data that's going to be used with both
  str and bytes, this is the accepted best practice.  (For the exceptional
  cases, the plain bytes type remains and you can convert to and from it when
  you want to.)

-Toshio




Re: [Python-Dev] email package status in 3.X

2010-06-21 Thread Toshio Kuratomi
On Mon, Jun 21, 2010 at 02:46:57PM -0400, P.J. Eby wrote:
> At 02:58 AM 6/22/2010 +0900, Stephen J. Turnbull wrote:
> >Nick alluded to the The One Obvious Way as a change in architecture.
> >
> >Specifically: Decode all bytes to typed objects (str, images, audio,
> >structured objects) at input.  Do no manipulations on bytes ever
> >except decode and encode (both to text, and to special-purpose objects
> >such as images) in a program that does I/O.
> 
> This ignores the existence of use cases where what you have is text
> that can't be properly encoded in unicode.  I know, it's a hard thing
> to wrap one's head around, since on the surface it sounds like
> unicode is the programmer's savior.  Unfortunately, real-world text
> data exists which cannot be safely roundtripped to unicode, and must
> be handled in "bytes with encoding" form for certain operations.
> 
> I personally do not have to deal with this *particular* use case any
> more -- I haven't been at NTT/Verio for six years now.  But I do know
> it exists for e.g. Asian language email handling, which is where I
> first encountered it.  At the time (this *may* have changed), many
> popular email clients did not actually support unicode, so you
> couldn't necessarily just send off an email in UTF-8.  It drove us
> nuts on the project where this was involved (an i18n of an existing
> Python app), and I think we had to compromise a bit in some fashion
> (because we couldn't really avoid unicode roundtripping due to
> database issues), but the use case does actually exist.
> 
> My current needs are simpler, thank goodness.  ;-)  However, they
> *do* involve situations where I'm dealing with *other*
> encoding-restricted legacy systems, such as software for interfacing
> with the US Postal Service that only works with a restricted subset
> of latin1, while receiving mangled ASCII from an ecommerce provider,
> and storing things in what's effectively a latin-1 database.  Being
> able to easily assert what kind of bytes I've got would actually let
> me catch errors sooner, *if* those assertions were being checked when
> different kinds of strings or bytes were being combined.  i.e., at
> coercion time).
> 
While it's certainly possible that you have a grapheme that has no
corresponding unicode codepoint, it doesn't sound like this is the case
you're dealing with here.  You talk about "restricted subset of latin1"
but all of latin1's graphemes have unicode codepoints.  You also talk about
not being able to "send off an email in UTF-8" but UTF-8 is an encoding of
unicode, not unicode itself.  Similarly, the statement that some email
clients don't support unicode isn't very clear as to the actual problem.  The
email client supports displaying graphemes using glyphs present on the
computer.  As long as the graphemes needed have a unicode codepoint, using
unicode inside of your application and then encoding to bytes on the way out
works fine.

Even in cases where there's no unicode codepoint for the grapheme that
you're receiving, unicode gives you a way out.  It provides a private use
area where you can map the graphemes to unused codepoints.  Your
application keeps a mapping from each such codepoint to the particular byte
sequence that you want.  Then write a codec that converts from unicode with
these private codepoints into your particular encoding (and from those bytes
back into unicode).
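
A bare-bones sketch of the mapping half of that idea (the byte sequence, the
private-use codepoint, and shift_jis as the base encoding are placeholders;
a real solution would register this as a proper codec via the codecs
module)::

    # One legacy byte sequence with no unicode codepoint, mapped to
    # a private use area codepoint of our choosing.
    RAW_SEQ = b'\xfd\xfe'
    PUA_CHAR = '\ue000'

    def decode_legacy(raw, base_encoding='shift_jis'):
        # Splice the private-use character in wherever the unmapped
        # sequence occurs; decode everything else normally.
        return PUA_CHAR.join(part.decode(base_encoding)
                             for part in raw.split(RAW_SEQ))

    def encode_legacy(text, base_encoding='shift_jis'):
        return RAW_SEQ.join(part.encode(base_encoding)
                            for part in text.split(PUA_CHAR))

    data = b'abc' + RAW_SEQ + b'def'
    assert encode_legacy(decode_legacy(data)) == data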

-Toshio




Re: [Python-Dev] email package status in 3.X

2010-06-21 Thread Toshio Kuratomi
On Mon, Jun 21, 2010 at 04:09:52PM -0400, P.J. Eby wrote:
> At 03:29 PM 6/21/2010 -0400, Toshio Kuratomi wrote:
> >On Mon, Jun 21, 2010 at 01:24:10PM -0400, P.J. Eby wrote:
> >> At 12:34 PM 6/21/2010 -0400, Toshio Kuratomi wrote:
> >> >What do you think of making the encoding attribute a mandatory part of
> >> >creating an ebyte object?  (ex: ``eb = ebytes(b, 'euc-jp')``).
> >>
> >> As long as the coercion rules force str+ebytes (or str % ebytes,
> >> ebytes % str, etc.) to result in another ebytes (and fail if the str
> >> can't be encoded in the ebytes' encoding), I'm personally fine with
> >> it, although I really like the idea of tacking the encoding to bytes
> >> objects in the first place.
> >>
> >I wouldn't like this.  It brings us back to the python2 problem where
> >sometimes you pass an ebyte into a function and it works and other times you
> >pass an ebyte into the function and it issues a traceback.
> 
> For stdlib functions, this isn't going to happen unless your ebytes'
> encoding is not compatible with the ascii subset of unicode, or the
> stdlib function is working with dynamic data...  in which case you
> really *do* want to fail early!
> 
The ebytes encoding will often be incompatible with the ascii subset.
It's the reason that people were so often tempted to change the
defaultencoding on python2 to utf8.

> I don't see this as a repeat of the 2.x situation; rather, it allows
> you to cause errors to happen much *earlier* than they would
> otherwise show up if you were using unicode for your encoded-bytes
> data.
> 
> For example, if your program's intent is to end up with latin-1
> output, then it would be better for an error to show up at the very
> *first* point where non-latin1 characters are mixed with your data,
> rather than only showing up at the output boundary!
> 
That highly depends on your usage.  If you're formatting a comment on a web
page, checking at the output boundary and replacing with '?' is better than a
traceback.  If you're entering key values into a database, then you likely
want to know where the non-latin1 data is entering your program, not where it
gets mixed with your data or where it reaches the output boundary.
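
A tiny sketch of the two policies, purely for illustration::

    comment = 'naïve ∞'
    # web-comment case: degrade at the output boundary instead of blowing up
    comment.encode('latin-1', errors='replace')   # b'na\xefve ?'
    # database-key case: fail loudly so the bad data is caught at its source
    comment.encode('latin-1', errors='strict')    # raises UnicodeEncodeError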

> However, if you promoted mixed-type operation results to unicode
> instead of ebytes, then you:
> 
> 1) can't preserve data that doesn't have a 1:1 mapping to unicode, and
> 
ebytes should be immutable like bytes and str.  So you shouldn't lose the
data if you keep a reference to it.

> 2) can't detect an error until your data reaches the output point in
> your application -- forcing you to defensively insert ebytes calls
> everywhere (vs. simply wrapping them around a handful of designated
> inputs), or else have to go right back to tracing down where the
> unusable data showed up in the first place.
> 
Usually, you don't want to know where you are combining two incompatible
strings.  Instead, you want to know where the incompatible strings are being
set in the first place.  If function(a, b) tracebacks with certain
combinations of a and b I need to know where a and b are being set, not
where function(a, b) is in the source code.  So you need to be making input
values ebytes() (or str in current python3) no matter what.

> One thing that seems like a bit of a blind spot for some folks is
> that having unicode is *not* everybody's goal.  Not because we don't
> believe unicode is generally a good thing or anything like that, but
> because we have to work with systems that flat out don't *do*
> unicode, thereby making the presence of (fully-general) unicode an
> error condition that has to be stamped out!
> 
I think that sometimes as well.  However, here I think you're in a bit of
a blind spot yourself.  I'm saying that making ebytes + str coerce to ebytes
will only yield a traceback some of the time, which is the python2
behaviour.  Having ebytes + str coerce to str will never throw a traceback
as long as our implementation checks that the bytes and encoding work
together from the start.

Throwing an error only on some inputs is one of the main reasons
that debugging unicode vs byte issues sucks on python2.  On my box, with my
dataset, everything works.  Toss it up on pypi and suddenly I have a user in
Japan who reports that he gets a traceback with his dataset that he can't
give to me because it's proprietary, overly large, or transient.



> IOW, if you're producing output that has to go into another system
> that doesn't take unicode, it doesn't matter how
> theoretically-correct it would be for your app to process the data in
> unicode form.  In that case, unicode is not a feature: i

Re: [Python-Dev] email package status in 3.X

2010-06-21 Thread Toshio Kuratomi
On Mon, Jun 21, 2010 at 04:52:08PM -0500, John Arbash Meinel wrote:
> 
> ...
> >> IOW, if you're producing output that has to go into another system
> >> that doesn't take unicode, it doesn't matter how
> >> theoretically-correct it would be for your app to process the data in
> >> unicode form.  In that case, unicode is not a feature: it's a bug.
> >>
> > This is not always true.  If you read a webpage, chop it up so you get
> > a list of words, create a histogram of word length, and then write the 
> > output as
> > utf8 to a database.  Should you do all your intermediate string operations
> > on utf8 encoded byte strings?  No, you should do them on unicode strings as
> > otherwise you need to know about the details of how utf8 encodes characters.
> > 
> 
> You'd still have problems in Unicode given stuff like å =~ å even though
> u'\xe5' vs u'a\u030a' (those will look the same depending on your
> Unicode system. IDLE shows them pretty much the same, T-Bird on Windows
> with my current font shows the second as 2 characters.)
> 
> I realize this was a toy example, but it does point out that Unicode
> complicates the idea of 'equality' as well as the idea of 'what is a
> character'. And just saying "decode it to Unicode" isn't really sufficient.
> 
Ah -- but if you're dealing with unicode objects you can use the
unicodedata.normalize() function on them to come out with the right values.
If you're using bytes, it's yet another case where you, the programmer, have
to know what byte sequences represent combining characters in the particular
encoding that you're dealing with.
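
For example, with the two spellings of å from above (python3 literals)::

    import unicodedata

    composed = '\xe5'        # å as a single codepoint
    decomposed = 'a\u030a'   # a + combining ring above
    assert composed != decomposed
    assert (unicodedata.normalize('NFC', composed)
            == unicodedata.normalize('NFC', decomposed))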

-Toshio




Re: [Python-Dev] bytes / unicode

2010-06-22 Thread Toshio Kuratomi
On Tue, Jun 22, 2010 at 11:58:57AM +0900, Stephen J. Turnbull wrote:
> Toshio Kuratomi writes:
> 
>  > One comment here -- you can also have uri's that aren't decodable into 
> their
>  > true textual meaning using a single encoding.
>  > 
>  > Apache will happily serve out uris that have utf-8, shift-jis, and
>  > euc-jp components inside of their path but the textual
>  > representation that was intended will be garbled (or be represented
>  > by escaped byte sequences).  For that matter, apache will serve
>  > requests that have no true textual representation as it is working
>  > on the byte level rather than the character level.
> 
> Sure.  I've never seen that combination, but I have seen Shift JIS and
> KOI8-R in the same path.
> 
> But in that case, just using 'latin-1' as the encoding allows you to
> use the (unicode) string operations internally, and then spew your
> mess out into the world for someone else to clean up, just as using
> bytes would.
> 
This is true.  I'm giving this as a real-world counter example to the
assertion that URIs are "text".  In fact, I think you're confusing things
a little by asserting that the RFC says that URIs are text.  I'll address
that two sections down.

>  > So a complete solution really should allow the programmer to pass
>  > in uris as bytes when the programmer knows that they need it.
> 
> Other than passing bytes into a constructor, I would argue if a
> complete solution requires, eg, an interface that allows
> urljoin(base,subdir) where the types of base and subdir are not
> required to match, then it doesn't belong in the stdlib.  For stdlib
> usage, that's premature optimization IMO.
> 
I'll definitely buy that.  Would urljoin(b_base, b_subdir) => bytes and
urljoin(u_base, u_subdir) => unicode be acceptable though?  (Given other
options, I'd rather see two separate functions.  It seems more discoverable,
and less prone to accepting bad input some of the time, to have two functions
that each clearly take only one type of data.)
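
Roughly what I have in mind for the two-function flavour -- a sketch only;
the names and the caller-supplied encoding are just for illustration::

    from urllib.parse import urljoin

    def urljoin_str(base, subdir):
        # str in, str out -- the common, textual case
        return urljoin(base, subdir)

    def urljoin_bytes(base, subdir, encoding='utf-8'):
        # bytes in, bytes out -- the caller says which encoding the
        # byte-oriented components are in
        return urljoin(base.decode(encoding),
                       subdir.decode(encoding)).encode(encoding)

    assert urljoin_str('http://host/a/', 'b') == 'http://host/a/b'
    assert urljoin_bytes(b'http://host/a/', b'b') == b'http://host/a/b'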

> The RFC says that URIs are text, and therefore they can (and IMO
> should) be operated on as text in the stdlib.

If I'm reading the RFC correctly, you're actually operating on two different
levels here.  Here's the section 2 that you quoted earlier, now in its
entirety::
2.  Characters

   The URI syntax provides a method of encoding data, presumably for the
   sake of identifying a resource, as a sequence of characters.  The URI
   characters are, in turn, frequently encoded as octets for transport or
   presentation.  This specification does not mandate any particular
   character encoding for mapping between URI characters and the octets used
   to store or transmit those characters.  When a URI appears in a protocol
   element, the character encoding is defined by that protocol; without such
   a definition, a URI is assumed to be in the same character encoding as
   the surrounding text.

   The ABNF notation defines its terminal values to be non-negative integers
   (codepoints) based on the US-ASCII coded character set [ASCII].  Because
   a URI is a sequence of characters, we must invert that relation in order
   to understand the URI syntax.  Therefore, the integer values used by the
   ABNF must be mapped back to their corresponding characters via US-ASCII
   in order to complete the syntax rules.

   A URI is composed from a limited set of characters consisting of digits,
   letters, and a few graphic symbols.  A reserved subset of those
   characters may be used to delimit syntax components within a URI while
   the remaining characters, including both the unreserved set and those
   reserved characters not acting as delimiters, define each component's
   identifying data.

So here's some data that matches those terms up to actual steps in the
process::

  # We start off with some arbitrary data that defines a resource.  This is
  # not necessarily text.  It's the data from the first sentence:
  data = b"\xff\xf0\xef\xe0"

  # We encode that into text and combine it with the scheme and host to form
  # a complete uri.  This is the "URI characters" mentioned in section #2.
  # It's also the "sequence of characters" mentioned in 1.1, as it is not
  # until this point that we actually have a URI.
  uri = b"http://host/" + percentencoded(data)
  # 
  # Note1: percentencoded() needs to take any bytes or characters outside of
  # the characters listed in section 2.3 (ALPHA / DIGIT / "-" / "." / "_"
  # / "~") and percent encode them.  The URI can only consist of characters
  # from this set and the reserved character set (2.2).
  #
  # Note2: in this simplistic example, we're onl

Re: [Python-Dev] bytes / unicode

2010-06-22 Thread Toshio Kuratomi
On Tue, Jun 22, 2010 at 08:31:13PM +0900, Stephen J. Turnbull wrote:
> Toshio Kuratomi writes:
>  > unicode handling redesign.  I'm stating my reading of the RFC not to defend
>  > the use case Philip has, but because I think that the outlook that non-text
>  > uris (before being percentencoded) are violations of the RFC
> 
> That's not what I'm saying.  What I'm trying to point out is that
> manipulating a bytes object as an URI sort of presumes a lot about its
> encoding as text.

I think we're more or less in agreement now but here I'm not sure.  What
manipulations are you thinking about?  Which stage of URI construction are
you considering?

I've just taken a quick look at python3.1's urllib module and I see that
there is a bit of confusion there.  But it's not about unicode vs bytes but
about whether a URI should be operated on at the real URI level or the
data-that-makes-a-uri level.

* all functions I looked at take python3 str rather than bytes so there's no
  confusing stuff here
* urllib.request.urlopen takes a strict uri.  That means that you must have
  a percent encoded uri at this point
* urllib.parse.urljoin takes regular string values
* urllib.parse.urlparse and urllib.parse.urlunparse take regular string values

> Since many of the URIs we deal with are more or
> less textual, why not take advantage of that?
>
Cool, so to summarize what I think we agree on:

* Percent encoded URIs are text according to the RFC.
* The data that is used to construct the URI is not defined as text by the
  RFC.
* However, it is very often text in an unspecified encoding
* It is extremely convenient for programmers to be able to treat the data
  that is used to form a URI as text in nearly all common cases.

-Toshio




Re: [Python-Dev] email package status in 3.X

2010-06-22 Thread Toshio Kuratomi
On Tue, Jun 22, 2010 at 08:24:28AM -0500, Michael Urman wrote:
> On Tue, Jun 22, 2010 at 00:28, Stephen J. Turnbull  wrote:
> > Michael Urman writes:
> >
> >  > It is somewhat troublesome that there doesn't appear to be an obvious
> >  > built-in idempotent-when-possible function that gives back the
> >  > provided bytes/str,
> >
> > If you want something idempotent, it's already the case that
> > bytes(b'abc') => b'abc'.  What might be desirable is to make
> > bytes('abc') work and return b'abc', but only if 'abc' is pure ASCII
> > (or maybe ISO 8859/1).
> 
> By idempotent-when-possible, I mean to_bytes(str_or_bytes, encoding,
> errors) that would pass an instance of bytes through, or encode an
> instance of str. And of course a to_str that performs similarly,
> passing str through and decoding bytes. While bytes(b'abc') will give
> me b'abc', neither bytes('abc') nor bytes(b'abc', 'latin-1') get me
> the b'abc' I want to see.
> 
A month or so ago, I finally broke down and wrote a python2 library that had
these functions in it (along with a bunch of other trivial boilerplate
functions that I found myself writing over and over in different projects)

  
https://fedorahosted.org/releases/k/i/kitchen/docs/api-text-converters.html#unicode-and-byte-str-conversion

I suppose I could port this to python3 and we could see if it gains adoption
as a third-party addon.  I have been hesitating to do that since I don't
use python3 for everyday work and I have a vague feeling that 2to3 won't
understand what that code needs to do.
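
For the curious, a rough python3 sketch of those two converters (the actual
kitchen API differs in names and options, so treat this as illustration
only)::

    def to_str(obj, encoding='utf-8', errors='replace'):
        # pass str through untouched; decode bytes
        if isinstance(obj, str):
            return obj
        if isinstance(obj, (bytes, bytearray)):
            return obj.decode(encoding, errors)
        return str(obj)

    def to_bytes(obj, encoding='utf-8', errors='replace'):
        # pass bytes through untouched; encode str
        if isinstance(obj, (bytes, bytearray)):
            return bytes(obj)
        if isinstance(obj, str):
            return obj.encode(encoding, errors)
        return str(obj).encode(encoding, errors)

    assert to_str(b'abc') == 'abc' == to_str('abc')
    assert to_bytes('abc') == b'abc' == to_bytes(b'abc')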

-Toshio




Re: [Python-Dev] bytes / unicode

2010-06-23 Thread Toshio Kuratomi
On Wed, Jun 23, 2010 at 09:36:45PM +0200, Antoine Pitrou wrote:
> On Wed, 23 Jun 2010 14:23:33 -0400
> Tres Seaver  wrote:
> > - - the slow adoption / porting rate of major web frameworks and libraries
> >   to Python 3.
> 
> Some of the major web frameworks and libraries have a ton of
> dependencies, which would explain why they really haven't bothered yet.
> 
> I don't think you can't claim, though, that Python 3 makes things
> significantly harder for these frameworks. The proof is that many of
> them already give the user unicode strings in Python 2.x. They must
> have somehow got the decoding right.
> 
Note that this assumption seems optimistic to me.  I started talking to Graham
Dumpleton, the author of mod_wsgi, a couple of years back because mod_wsgi and
paste do decoding of bytes to unicode at different layers, which caused
problems for application-level code that should otherwise run fine when being
served by either mod_wsgi or paste httpserver.  That was the beginning of
Graham starting to talk about what the wsgi spec really should look like under
python3 instead of the broken way described in the appendix to the current
wsgi spec.

-Toshio




Re: [Python-Dev] bytes / unicode

2010-06-23 Thread Toshio Kuratomi
On Wed, Jun 23, 2010 at 11:35:12PM +0200, Antoine Pitrou wrote:
> On Wed, 23 Jun 2010 17:30:22 -0400
> Toshio Kuratomi  wrote:
> > Note that this assumption seems optimistic to me.  I started talking to 
> > Graham
> > Dumpleton, author of mod_wsgi a couple years back because mod_wsgi and paste
> > do decoding of bytes to unicode at different layers which caused problems
> > for application level code that should otherwise run fine when being served
> > by mod_wsgi or paste httpserver.  That was the beginning of Graham starting
> > to talk about what the wsgi spec really should look like under python3
> > instead of the broken way that the appendix to the current wsgi spec states.
> 
> Ok, but the reason would be that the WSGI spec is broken. Not Python 3
> itself.
> 
Agreed.  Neither python2 nor python3 is broken.  It's the wsgi spec and the
implementation of that spec where things fall down.  From your first post,
I thought you were claiming that python3 was broken since web frameworks got
decoding right on python2 and I just wanted to defend python3 by showing
that python2 wasn't all sunshine and roses.

-Toshio




Re: [Python-Dev] Licensing

2010-07-06 Thread Toshio Kuratomi
On Tue, Jul 06, 2010 at 10:10:09AM +0300, Nir Aides wrote:
> I take "...running off with the good stuff and selling it for profit" to mean
> "creating derivative work and commercializing it as proprietary code" which 
> you
> can not do with GPL licensed code. Also, while the GPL does not prevent 
> selling
> copies for profit it does not make it very practical either.
> 
Uhmmm http://finance.yahoo.com/q/is?s=RHT&annual

It is very possible to make money with the GPL.  The GPL does, as you say,
prevent you from creating derivative works that are proprietary code.  It
does *not* prevent you from creating derivative works and commercializing
them.

-Toshio




Re: [Python-Dev] Fixing #7175: a standard location for Python config files

2010-08-12 Thread Toshio Kuratomi
On Fri, Aug 13, 2010 at 07:48:22AM +1000, Nick Coghlan wrote:
> 2010/8/12 Éric Araujo :
> >> Choosing an arbitrary location we think is good on every system is fine
> >> and non risky I think, as long as Python let the various distribution
> >> change those paths though configuration.
> >
> > Don’t you have a bootstrapping problem? How do you know where to look at
> > the sysconfig file that tells where to look at config files?

I'd hardcode a list of locations.
  [os.path.join(os.path.dirname(__file__), 'sysconfig.cfg'),
   os.path.join('/etc', 'sysconfig.cfg')]

The distributor has a limited choice of options on where to look.

A good alternative would be to make the config file overridable.  That way
you can have sysconfig.cfg next to sysconfig.py or in a known config
directory relative to the python stdlib install but also let the
distributions and individual sites override the defaults by making changes
to /etc/python3/sysconfig.cfg for instance.
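
A rough sketch of that cascade (the paths are purely illustrative)::

    import os
    from configparser import ConfigParser

    cfg = ConfigParser()
    # Values in later files override values in earlier ones, so the copy
    # shipped next to the stdlib provides defaults and /etc overrides them.
    cfg.read([
        os.path.join(os.path.dirname(__file__), 'sysconfig.cfg'),
        '/etc/python3/sysconfig.cfg',
    ])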

> 
> Personally, I'm not clear on what a separate syconfig.cfg file offers
> over clearly separating the directory configuration settings and
> continuing to have distributions patch sysconfig.py directly. The
> bootstrapping problem (which would encourage classifying sysconfig.cfg
> as source code and placing it alongside sysconfig.py) is a major part
> of that point of view.
> 
Here are some advantages, though some of them are of dubious worth:

* Allows users/site-administrators to change paths and not have packaging
  systems overwrite the changes.
* Makes it conceptually cleaner to make this overridable via user-defined
  config files since it's now a matter of parsing several config files
  instead of having a hardcoded value in the file and overridable values
  outside of it.
* Allows sites to add additional paths to the config file.
* Makes it clear to distributions that the values in the config file are
  available for making changes to, rather than having to look for them in
  code and not know the difference between that and, say, the encoding
  parameter in python2.
* Documents the format to use for overriding the paths if individual sites
  can override the defaults that are shipped in the system version of
  python.

-Toshio




Re: [Python-Dev] Fixing #7175: a standard location for Python config files

2010-08-12 Thread Toshio Kuratomi
On Fri, Aug 13, 2010 at 03:15:28AM +0200, Éric Araujo wrote:
> > A good alternative would be to make the config file overridable.  That way
> > you can have sysconfig.cfg next to sysconfig.py or in a known config
> > directory relative to the python stdlib install but also let the
> > distributions and individual sites override the defaults by making changes
> > to /etc/python3/sysconfig.cfg for instance.
> 
> Putting configuration files next to source code does not agree with the
> FHS.
>
That depends -- sometimes a config file is actually a data file.  In
following the FHS, a linux distribution and system administrators want to
make sure a sysadmin never needs to modify a file outside of /etc for
configuration.  However, if you have a config file that is installed with
python next to the code and is not meant to be modified, and the sysadmin is
able to customize python's behaviour by copying lines from that file to
a file in /etc/ that they do modify, that's perfectly fine.  Where
developers often run afoul of the FHS is creating a configuration option
that does not live in a file in /etc/ (either because they think it's data
but system administrators routinely want to modify it, or because they don't
care about the FHS and just want to put everything in a single directory)
and the system admin has no means to set that option permanently other than
to modify the files outside of /etc/.

>
> Your message gave me an idea though: Some function could be called
> in sitecustomize (symlinked from /etc on Debian e.g.) to set the path to
> sysconfig.cfg.
> 
I hope that you mean that the actual file lives in /etc/ and then a symlink
in the stdlib points to it.  That sounds acceptable from an FHS view but
it's a bad idea.  Most of the config files in /etc/ are declarative.  A few
of the files in /etc/ are imperative (for instance, init scripts) and
distributions frequently debate whether these are actually config files or
files misplaced in /etc/ for historical reasons.  Fedora, for instance, does
not mark initscripts as config -- all configuration of initscripts is done
by setting shell variables in a different file which is marked config.

(Files marked config in a package manager generally are treated differently
on upgrades.  If the files have been modified since install, they are either
not replaced or, when replaced, a backup is made so that the system
administrator can merge their local changes.)

The problem with having imperative constructs in /etc/ is that they're much
harder (perhaps impossible) to fix up when changes have to be made at the
system packaging level.  Let's say that you hard code this into the
sitecustomize.py when you ship it for the first time on Fedora:

  siteconfig_cfg = '/etc/siteconfig.cfg'
 
The system administrator installs this package and then does some unrelated
customization for his site in the file:

  if sys.version_info[:2] == (3, 0):
      sys.path.append('/opt/python3.0')
      siteconfig_cfg = '/etc/siteconfig30.cfg'
  elif sys.version_info[:2] == (3, 1):
      sys.path.append(os.environ['USER'] + '/opt/python3.1')
      siteconfig_cfg = '/etc/siteconfig31.cfg'
  elif sys.version_info[:2] == (3, 2):
      sys.path.append(os.environ['USER'] + '/opt/python3.2')
      siteconfig_cfg = '/etc/siteconfig.cfg'
  # [...] Do other strange stuff because if it's possible, someone will.

Now, when we update to python3.3 we decide to move siteconfig.cfg into
/etc/python3.3/.  But, because the system admin has modified sitecustomize.py
so heavily, we don't update that file on upgrade.  This means the updated
python is broken out of the box.

If you're enamoured of symlinks, you can do this directly with the config
file instead of using sitecustomize::

  /etc/python3.2/sysconfig.cfg
  /usr/lib64/python3.2/sysconfig.cfg => /etc/python3.2/sysconfig.cfg

as a third alternative.

-Toshio




Re: [Python-Dev] (Not) delaying the 3.2 release

2010-09-16 Thread Toshio Kuratomi
On Thu, Sep 16, 2010 at 09:52:48AM -0400, Barry Warsaw wrote:
> On Sep 16, 2010, at 11:28 PM, Nick Coghlan wrote:
> 
> >There are some APIs that should be able to handle bytes *or* strings,
> >but the current use of string literals in their implementation means
> >that bytes don't work. This turns out to be a PITA for some networking
> >related code which really wants to be working with raw bytes (e.g.
> >URLs coming off the wire).
> 
> Note that email has exactly the same problem.  A general solution -- even if
> embodied in *well documented* best-practices and convention -- would really
> help make the stdlib work consistently, and I bet third party libraries too.
> 
I too await a solution with bated breath :-) I've been working on
documenting best practices for APIs and Unicode and for this type of
function (take bytes or unicode and output the same type); knowing the
encoding seems like a requirement in most cases:

http://packages.python.org/kitchen/designing-unicode-apis.html#take-either-bytes-or-unicode-output-the-same-type

I'd love to add another strategy there that shows how you can robustly
operate on bytes without knowing the encoding but from writing that, I think
that anytime you simplify your API you have to accept limitations on the
data you can take in.  (For instance, some simplifications can handle
anything except ASCII-incompatible encodings).
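
As a stripped-down sketch of that strategy (the function name and the
assumption of utf-8 for the bytes side are mine)::

    def normalize_spaces(msg, encoding='utf-8'):
        # Accept bytes or str; always hand back the same type we were given.
        came_in_as_bytes = isinstance(msg, (bytes, bytearray))
        text = msg.decode(encoding) if came_in_as_bytes else msg
        text = ' '.join(text.split())
        return text.encode(encoding) if came_in_as_bytes else text

    assert normalize_spaces('a  b') == 'a b'
    assert normalize_spaces(b'a  b') == b'a b'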

-Toshio




Re: [Python-Dev] (Not) delaying the 3.2 release

2010-09-16 Thread Toshio Kuratomi
On Thu, Sep 16, 2010 at 10:56:56AM -0700, Guido van Rossum wrote:
> On Thu, Sep 16, 2010 at 10:46 AM, Martin (gzlist)  
> wrote:
> > On 16/09/2010, Guido van Rossum  wrote:
> >>
> >> In all cases I can imagine where such polymorphic functions make
> >> sense, the necessary and sufficient assumption should be that the
> >> encoding is a superset of 7-bit(*) ASCII. This includes UTF-8, all
> >> Latin-N variant, and AFAIK also the popular CJK encodings other than
> >> UTF-16. This is the same assumption made by Python's byte type when
> >> you use "character-based" methods like lower().
> >
> > Well, depends on what exactly you're doing, it's pretty easy to go wrong:
> >
> > Python 3.2a2+ (py3k, Sep 16 2010, 18:43:45) [MSC v.1500 32 bit (Intel)] on 
> > win32
> > Type "help", "copyright", "credits" or "license" for more information.
> > >>> import os, sys
> > >>> os.path.split("C:\\十")
> > ('C:\\', '十')
> > >>> os.path.split("C:\\十".encode(sys.getfilesystemencoding()))
> > (b'C:\\\x8f', b'')
> >
> > Similar things can catch out web developers once they step outside the
> > percent encoding.
> 
> Well, that character is not 7-bit ASCII. Of course things will go
> wrong there. That's the whole point of what I said, isn't it?
> 
You were talking about encodings that were supersets of 7-bit ASCII.
I think Martin was demonstrating a byte string in such an encoding being
fed to a stdlib function and still going wrong.

-Toshio




Re: [Python-Dev] We should be using a tool for code reviews

2010-09-30 Thread Toshio Kuratomi
On Wed, Sep 29, 2010 at 01:23:24PM -0700, Guido van Rossum wrote:
> On Wed, Sep 29, 2010 at 1:12 PM, Brett Cannon  wrote:
> > On Wed, Sep 29, 2010 at 12:03, Guido van Rossum  wrote:
> >> A problem with that is that we regularly make matching improvements to
> >> upload.py and the server-side code it talks to. While we tend to be
> >> conservative in these changes (because we don't control what version
> >> of upload.py people use) it would be a pain to maintain backwards
> >> compatibility with a version that was distributed in Misc/ two years
> >> ago -- that's kind of outside our horizon.
> >
> > Well, I would assume people are working from a checkout. Patches from
> > an outdated checkout simply would fail and that's fine by me.
> 
> Ok, but that's an extra barrier for contributions. Lots of people when
> asked for a patch just modify their distro in place and you can count
> yourself lucky if they send you a diff from a clean copy.
> 
> But maybe with Hg it's less of a burden to ask people to use a checkout.
> 
> > How often do we even get patches generated from a downloaded copy of
> > Python? Is it enough to need to worry about this?
> 
> I used to get these frequently. I don't know what the experience of
> the current crop of core developers is though, so maybe my gut
> feelings here are outdated.
> 
When helping out on a Linux distribution, dealing with patches against the
latest tarball is a fairly frequent occurrence.  The question would be
whether these patches get filtered through the maintainer of the package
before landing in roundup/rietveld and whether the distro maintainer is
sufficiently in tune with python development that they're maintaining both
patches against the last tarball and a checkout of trunk with the patches
applied intelligently there.

A few other random thoughts:

* hg could be more of a burden in that it may be unfamiliar to the casual
  python user who happens to have found a fix for a bug and wants to submit
  it.  cvs and svn are similar enough that people comfortable with one are
  usually comfortable with the other but hg has different semantics.
* The barrier to entry seems to be higher the less well integrated the tools
  are.  I occasionally try to contribute patches to bzr in launchpad and
  the integration there is horrid.  You end up with two separate streams of
  comments and you don't automatically get subscribed to both.  There are
  several UI elements for associating a branch with a bug but some of them
  are buggy (or else are very strict on what input they're expecting) while
  other ones are hard to find.  Since I only contribute a patch two or three
  times a year, I have to re-figure out the process each time I try to
  contribute.
* I like the idea of patch complexity being a measure of whether the patch
  needs to go into a code review tool in that it keeps simple things simple
  and gives more advanced tools to more advanced cases.  I dislike it in
  that for someone who's just contributing a patch to fix a problem that
  they're encountering which happens to be somewhat complex, they end up
  having to learn a lot about tools that they may never use again.
* It seems like code review will be a great aid to people who submit changes
  or review changes frequently.  The trick will be making it
  non-intimidating for someone who's just going to contribute changes
  infrequently.

-Toshio




Re: [Python-Dev] Distutils2 scripts

2010-10-08 Thread Toshio Kuratomi
On Fri, Oct 08, 2010 at 10:26:36AM -0400, Barry Warsaw wrote:
> On Oct 08, 2010, at 03:22 PM, Tarek Ziadé wrote:
> 
> >Yes that what I was thinking about -- I am not too worried about this,
> >since every Linux  deals with the 'more than one python installed'
> >case.
> 
> Kind of.   but anyway...
> 
> >> I'm in favor of add a top-level setup module that can be invoked using
> >> "python -m setup ...".  There will be three cases:
> >
> >Nice idea ! I wouldn't call it setup though, since it does many other
> >things. I can't think of a good name yet, but I'd like such a script
> >to express the idea that it can be used to:
> 
> I like 'python -m setup' too.  It's a small step from the familiar thing
> (python setup.py) to the new and shiny thing, without being confusing.  And
> you won't have to worry about things like version numbers because the Python
> executable will already have that baked in.
> 
> >- query pypi
> >- browse what's installed
> >- install/remove projects
> >- create releases and upload them
> >
> >pkg_manager ?
> 
> No underscores, please. :)
> 
> Actually, a decent wrapper script could just be called 'setup'.  My
> command-not-found on Ubuntu doesn't find a collision, or even close
> similarities.
> 
Simple English names like this are almost never a good idea for commands.
A quick google for "/usr/bin/setup" finds that Fedora-derived distros have
a /usr/bin/setup as a wrapper for all the text-mode configuration tools.
And there's a derivative of opensolaris that has a /usr/bin/setup for
configuring the system the first time.

> I still like 'egg' as a command too.  There are no collisions that I can see.
> I know this has been thrown around for years, and it's always been rejected
> because I think setuptools wanted to claim it, but since it still doesn't
> exist afaict, distutils2 could easily use it.
> 
There's a 2D graphics library that provides a /usr/bin/egg command:
  http://www.ir.isas.jaxa.jp/~cyamauch/eggx_procall/
Latest Stable Version 0.93r3 (released 2010/4/14)

In the larger universe of programs, though, using a prefix (either py or
python) might make the command more intuitive to remember.

python-setup  is a lot like python setup.py
pysetup is shorter
pyegg is even shorter :-)

-Toshio




Re: [Python-Dev] Distutils2 scripts

2010-10-08 Thread Toshio Kuratomi
On Fri, Oct 08, 2010 at 05:12:44PM +0200, Antoine Pitrou wrote:
> On Fri, 8 Oct 2010 11:04:35 -0400
> Toshio Kuratomi  wrote:
> > 
> > In the larger universe of programs, it might make for more intuitive
> > remembering of the command to use a prefix (either py or python) though.
> > 
> > python-setup  is a lot like python setup.py
> > pysetup is shorter
> > pyegg is even shorter :-)
> 
> Wouldn't "quiche" be a better alternative for "pyegg"?
> 
I won't bikeshed as long as we stay away from conflicting names.

-Toshio




Re: [Python-Dev] My work on Python3 and non-ascii paths is done

2010-10-21 Thread Toshio Kuratomi
On Thu, Oct 21, 2010 at 12:00:40PM -0400, Barry Warsaw wrote:
> On Oct 20, 2010, at 02:11 AM, Victor Stinner wrote:
> 
> >I plan to fix Python documentation: specify the encoding used to decode all 
> >byte string arguments of the C API. I already wrote a draft patch: issue 
> >#9738. This lack of documentation was a big problem for me, because I had to 
> >follow the function calls to get the encoding.
> 
This will be truly excellent!

> That's exactly what I was looking for!  Thanks.  I think you've learned a huge
> amount of good information that's difficult to find, so writing it up in a
> more permanent and easy to find location will really help future Python
> developers!
> 
One further thing I'd be interested in is if you could document any best
practices from this experience.  Things like: "surrogateescape is a good/bad
default in these cases", or "when are parallel functions for bytes and str
better than a single polymorphic function?"  That way, when other modules are
added to the stdlib, things can be more consistent.

-Toshio




Re: [Python-Dev] Continuing 2.x

2010-10-29 Thread Toshio Kuratomi
On Fri, Oct 29, 2010 at 11:12:28AM -0700, geremy condra wrote:
> On Thu, Oct 28, 2010 at 11:55 PM, Glyph Lefkowitz
> > Let's take PyPI numbers as a proxy.  There are ~8000 packages with a
> > "Programming Language::Python" classifier.  There are ~250 with "Programming
> > Langauge::Python::3".  Roughly speaking, we can say that is 3% of Python
> > code which has been ported so far.  Python 3.0 was released at the end of
> > 2008, so people have had roughly 2 years to port, which comes up with 1.5%
> > per year.
> Just my two cents:
> 
Just one further informational note about using pypi in this way for
statistics... In the porting work we've done within Fedora, I've noticed
that a lot of packages are python3 ready or even officially support python3
but the language classifier on pypi does not reflect this.  Here's just
a few since I looked them up when working on the python porting wiki pages:

http://pypi.python.org/pypi/Beaker/
http://pypi.python.org/pypi/pycairo
http://pypi.python.org/pypi/docutils

-Toshio




Re: [Python-Dev] Breaking undocumented API

2010-11-08 Thread Toshio Kuratomi
On Tue, Nov 09, 2010 at 11:46:59AM +1100, Ben Finney wrote:
> Ron Adam  writes:
> 
> > def _publicly_documented_private_api():
> > """  Not sure why you would want to do this
> >  instead of using comments.
> > """
> > ...
> 
> Because the docstring is available at the interpreter via ‘help()’, and
> because it's automatically available to ‘doctest’, and most of the other
> good reasons for docstrings.
> 
> > The _publicly_documented_private_api() is a problem because people
> > *will* use it even though it has a leading underscore. Especially
> > those who are new to python.
> 
> That isn't an argument against docstrings, since the problem you
> describe isn't dependent on the presence or absence of docstrings.
> 
Just wanted to expand a bit here:  as a general practice, you may be
involved in a project where _private_api() is not intended to be used by
people outside of the project but is intended to be used in multiple places
within the project.  If you have different people working on those different
areas, it can be very useful for them to be able to call help(_private_api)
on each other's functions from within the interpreter shell.
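
A tiny illustration (the module and function names are invented for the
example)::

    # projectutils.py -- shared, project-internal helpers
    def _norm_path(path):
        """Normalize *path* the way the rest of the project expects.

        Private to the project: other modules within the project call it,
        but it is not part of the public API.
        """
        return path.rstrip('/') or '/'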

-Toshio




Re: [Python-Dev] Breaking undocumented API

2010-11-09 Thread Toshio Kuratomi
On Tue, Nov 09, 2010 at 01:49:01PM -0500, Tres Seaver wrote:
> 
> On 11/08/2010 06:26 PM, Bobby Impollonia wrote:
> 
> > This does hurt because anyone who was relying on "import *" to get a
> > name which is now omitted from __all__ is going to upgrade and find
> > their program failing with NameErrors. This is a backwards compatible
> > change and shouldn't happen without a deprecation warning first.
> 
> Outside an interactive prompt, anyone using "from foo import *" has set
> themselves and their users up to lose anyway.
> 
> That syntax is the single worst misfeature in all of Python.  It impairs
> readability and discoverability for *no* benefit beyond one-time typing
> convenience.  Module writers who compound the error by expecting to be
> imported this way, thereby bogarting the global namespace for their own
> purposes, should be fish-slapped. ;)
> 
I think there's a valid case for bogarting the namespace in this instance,
but let me know if there's a better way to do it::

# Method to use system libraries if available, otherwise use a bundled copy,
# aka: make both system packagers and developers happy.


Relevant directories and files for this module::

    foo/
        __init__.py
        compat/
            __init__.py
            bar/
                __init__.py
                _bar.py

foo/compat/bar/_bar.py is a bundled module.

foo/compat/bar/__init__.py has::

    try:
        from bar import *
        from bar import __all__
    except ImportError:
        from foo.compat.bar._bar import *
        from foo.compat.bar._bar import __all__

-Toshio




Re: [Python-Dev] Porting Ideas

2010-12-01 Thread Toshio Kuratomi
On Wed, Dec 01, 2010 at 10:06:24PM -0500, Alexander Belopolsky wrote:
> On Wed, Dec 1, 2010 at 9:53 PM, Terry Reedy  wrote:
> ..
> > Does Sphinx run on PY3 yet?
> 
> It does, but see issue10224 for details.
> 
>  http://bugs.python.org/issue10224
>
Also, docutils has an unported module.

/me needs to write a bug report for that as he really doesn't have the time
he thought he did to perform the port.

-Toshio




Re: [Python-Dev] PEP 384 accepted

2010-12-04 Thread Toshio Kuratomi
On Fri, Dec 03, 2010 at 11:52:41PM +0100, "Martin v. Löwis" wrote:
> Am 03.12.2010 23:48, schrieb Éric Araujo:
> >> But I'm not interested at all in having it in distutils2. I want the
> >> Python build itself to use it, and alas, I can't because of the freeze.
> > You can’t in 3.2, true.  Neither can you in 3.1, or any previous
> > version.  If you implement it in distutils2, you have very good chances
> > to get it for 3.3.  Isn’t that a win?
> 
> It is, unfortunately, a very weak promise. Until distutils2 is
> integrated in Python, I probably won't spend any time on it.
> 
At the language summit it was proposed and seemed generally accepted (maybe
I took silence as consent... it's been almost a year now) that bold new
modules (and bold rewrites of existing modules since it fell out of the
distutils/2 discussion) should get implemented in a module on pypi before
being merged into the python stdlib.  If you wouldn't want to work on any of
those modules until they were actually integrated into Python, it sounds
like you disagree with that as a general practice?

-Toshio



