from:"James Y Knight"

Re: [Python-Dev] Integrate BeautifulSoup into stdlib?

2009-03-04 Thread James Y Knight



On Mar 4, 2009, at 9:56 AM, Chris Withers wrote:


Vaibhav Mallya wrote:
We do have HTMLParser, but that doesn't handle malformed pages  
well, and just isn't as nice as BeautifulSoup.


Interesting, given that BeautifulSoup is built on HTMLParser ;-)


I think html5lib would be a better candidate for an improved HTML
parser in the stdlib than BeautifulSoup.


James
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Formatting mini-language suggestion

2009-03-11 Thread James Y Knight



On Mar 11, 2009, at 9:06 PM, Nick Coghlan wrote:


Raymond Hettinger wrote:

The current formatting mini-language provisions left/right/center
alignment, prefixes for 0b 0x 0o, and rules on when to show the
plus-sign.  I think it would be far more useful to provision a simple
way of specifying a thousands separator.

Financial users in particular find the locale approach to be
frustrating and non-obvious.  Putting in a thousands separator is a
common task for output destined to be read by non-programmers.


+1 for the general idea.

A specific syntax proposal:

 [[fill]align][sign][#][0][minimumwidth][,sep][.precision][type]

'sep' is the new field that defines the thousands separator. It
appears immediately before the precision specifier and starts with a
leading comma.


I believe this syntax is unambiguous and backwards compatible because
the only other place a comma might appear (the fill field) is required
to be followed by an alignment character.


You might be interested to know that in India, the commas don't come
every 3 digits. In India, they come every two digits, after the first
three. Thus one billion = 1,00,00,00,000. How are you gonna represent
*that* in a formatting mini-language? :)


See also http://en.wikipedia.org/wiki/Indian_numbering_system
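To make the grouping rule concrete, here is a minimal sketch of a digit grouper whose rightmost group and remaining groups can have different sizes. The function name and its parameters are made up for illustration; this is not part of any proposal in the thread.

```python
def group_digits(n, first=3, rest=3, sep=","):
    """Group the decimal digits of a non-negative int, working from the right.

    `first` is the size of the rightmost group and `rest` is the size of
    every group to its left. Both names are hypothetical.
    """
    digits = str(n)
    groups = [digits[-first:]]      # rightmost group
    digits = digits[:-first]
    while digits:
        groups.append(digits[-rest:])
        digits = digits[:-rest]
    return sep.join(reversed(groups))


western = group_digits(1000000000)                   # groups of three
indian = group_digits(1000000000, first=3, rest=2)   # three, then twos
```

With first=3 and rest=2 this gives the Indian-style grouping described above, which a single "comma every 3 digits" rule cannot express.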

James

Re: [Python-Dev] Formatting mini-language suggestion

2009-03-11 Thread James Y Knight


On Mar 11, 2009, at 11:40 PM, Nick Coghlan wrote:

Raymond Hettinger wrote:

It is not the goal to replace locale or to accommodate every
possible convention.  The goal is to make a common task easier
for many users.  The current, default use of the period as a decimal
point has not proven to be a problem even though that convention is
not universal.   For a thousands separator, a comma is a decent
choice that makes it easy to follow on with s.replace(',', '_') or
somesuch.


In that case, I would simplify my suggestion to:

 [[fill]align][sign][#][0][minimumwidth][,][.precision][type]

Addition to mini language documentation:
 The ',' option indicates that commas should be included in the
output as a thousands separator. As with locales which do not use a
period as the decimal point, locales which use a different convention
for digit separation will need to use the locale module to obtain
appropriate formatting.



This proposal has the advantage that you're not overly specifying the  
behavior in the format string itself.


That is: the "," option is really just indicating "please insert  
separators". With the current locale-ignorant implementation, that'd  
just mean "a comma every 3 digits". But it leaves the door open for a  
locale-sensitive variant of the format to be added in the future  
without conflicting with the instructions in the format string. (as  
the ability to specify an arbitrary character, or the ability to  
specify a comma instead of a period for the decimal point would).
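For reference, a bare "," option along these lines was later standardized as PEP 378 and is available in current Python. The following is a small sketch of the locale-ignorant behaviour, assuming an interpreter that includes that change; the variable names are illustrative only.

```python
amount = 1234567.891

# The ',' option only says "insert separators"; precision and type
# are still given by the rest of the format spec.
with_commas = format(amount, ",.2f")

# Swapping the separator afterwards, as Raymond suggests.
with_underscores = with_commas.replace(",", "_")

# Works for integers too.
one_billion = format(10**9, ",")
```

The format string never names a locale. That leaves room for a separate locale-aware path, such as the existing "n" presentation type, which consults the current locale.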


I'm not against Raymond's proposal, just against doing a *bad* job of  
making it work in multiple locales. Locale conventions can be complex,  
and are going to be best represented outside the format string.


(BTW: single quote is used by printf for the grouping flag rather than  
comma)


James

Re: [Python-Dev] Possible py3k io wierdness

2009-04-05 Thread James Y Knight



On Apr 5, 2009, at 6:29 AM, Antoine Pitrou wrote:


Brian Quinlan writes:


I don't see why this is helpful. Could you explain why
_RawIOBase.close() calling self.flush() is useful?


I could not explain it for sure since I didn't write the Python
version. I suppose it's so that people who only override flush()
automatically get the flush-on-close behaviour.


It seems that a separate method "_internal_close" should've been  
defined to do the actual closing of the file, and the close() method  
should've been defined on the base class as "self.flush();  
self._internal_close()" and never overridden.


James

Re: [Python-Dev] Dropping bytes "support" in json

2009-04-10 Thread James Y Knight


On Apr 9, 2009, at 10:38 PM, Barry Warsaw wrote:
So, what I'm really asking is this.  Let's say you agree that there  
are use cases for accessing a header value as either the raw encoded  
bytes or the decoded unicode.


As I said in the thread having nearly the same exact discussion on
web-sig, except about WSGI headers...



What should this return:

>>> message['Subject']

The raw bytes or the decoded unicode?


Until you write a parser for every header, you simply cannot decode to  
unicode. The only sane choices are:

1) raw bytes
2) parsed structured data

There's no "decoded to unicode but not parsed" option: that's doing  
things in the wrong order. If you RFC2047-decode the header before  
doing tokenization and parsing, you will just have a *broken*  
implementation.


Here's an example where it matters. If you decode the RFC2047 part
before parsing, you'd decide that there's two recipients to the
message. There aren't. "<broken@example.com>, " is the display-name of
"act...@example.com", not a second recipient.


  To: =?UTF-8?B?PGJyb2tlbkBleGFtcGxlLmNvbT4sIA==?= <act...@example.com>

Here's a quote from RFC2047:
NOTE: Decoding and display of encoded-words occurs *after* a  
structured field body is parsed into tokens. It is therefore  
possible to hide 'special' characters in encoded-words which, when  
displayed, will be indistinguishable from 'special' characters in  
the surrounding text. For this and other reasons, it is NOT
generally possible to translate a message header containing
'encoded-word's to an unencoded form which can be parsed by an RFC 822
mail reader.

And another quote for good measure:
(2) Any header field not defined as '*text' should be parsed  
according to the syntax rules for that header field. However, any  
'word' that appears within a 'phrase' should be treated as an  
'encoded-word' if it meets the syntax rules in section 2. Otherwise  
it should be treated as an ordinary 'word'.
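The ordering problem in the To: example above can be reproduced with today's standard library. This is only a sketch: the encoded-word is the one from this message, but the mailbox real@example.com is a placeholder chosen for the illustration, and decode_encoded_words is a helper written here, not an email package API.

```python
from email.header import decode_header
from email.utils import getaddresses, parseaddr

# Encoded-word taken from the message; the mailbox is a placeholder.
RAW_TO = "=?UTF-8?B?PGJyb2tlbkBleGFtcGxlLmNvbT4sIA==?= <real@example.com>"


def decode_encoded_words(value):
    """RFC 2047-decode `value` into one str (hypothetical helper)."""
    pieces = []
    for chunk, charset in decode_header(value):
        if isinstance(chunk, bytes):
            chunk = chunk.decode(charset or "ascii")
        pieces.append(chunk)
    return "".join(pieces)


# Wrong order: decode first, then parse the result as an address list.
decoded_first = decode_encoded_words(RAW_TO)
wrong = [addr for _name, addr in getaddresses([decoded_first])]

# Right order: parse the raw header, then decode only the display name.
raw_name, right_addr = parseaddr(RAW_TO)
right_name = decode_encoded_words(raw_name)
```

Decoding first should make the parser see two mailboxes. Parsing first should leave a single mailbox whose display name merely looks like an address followed by a comma.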



Now, I suppose there's also a third possibility:
3) US-ASCII-only strings, unmolested except for doing  
a .decode('ascii'). That'll give you a string all right, but it's  
really just cheating. It's not actually a text string in any  
meaningful sense.


(in all this I'm assuming your question is not about the "Subject"  
header in particular; that is of course just unstructured text so the  
parse step doesn't actually do anything...).


James


Re: [Python-Dev] Dropping bytes "support" in json

2009-04-13 Thread James Y Knight


On Apr 13, 2009, at 10:11 AM, Barry Warsaw wrote:
The email package does not need a parser for every header, but it  
should provide a framework that applications (or third party  
libraries) can use to extend the built-in header parsers.  A bare  
minimum for functionality requires a Content-Type parser.  I think  
the email package should also include an address header (Originator,  
Destination) parser, and a Message-ID header parser.  Possibly others.


Sure, that's fine...

The default would probably be some unstructured parser for headers  
like Subject.



But for unknown headers, it's not a useful choice to return a "str"  
object. "str" is just one possible structured data representation for  
a header: there's no correct useful decoding of all headers into str.  
Of course for the "Subject" header, str is the correct result type,  
but that's not a default, that's explicit support for "Subject". You  
can't correctly decode "To" into a str, so what makes you think you  
can decode "X-Gabazaborph" into str?


The only useful and correct representation for unknown (or  
unimplemented) headers is the raw bytes.


James


Re: [Python-Dev] PEP 382: Namespace Packages

2009-04-15 Thread James Y Knight



On Apr 15, 2009, at 12:15 PM, M.-A. Lemburg wrote:

The much more common use case is that of wanting to have a base
package installation with optional add-ons that live in the same
logical package namespace.

The PEP provides a way to solve this use case by giving both
developers and users a standard at hand which they can follow without
having to rely on some non-standard helpers and across Python
implementations.


I'm not sure I understand what advantage your proposal gives over the  
current mechanism for doing this.


That is, add to your __init__.py file:

from pkgutil import extend_path
__path__ = extend_path(__path__, __name__)

Can you describe the intended advantages over the status-quo a bit  
more clearly?


James

Re: [Python-Dev] Issue5434: datetime.monthdelta

2009-04-16 Thread James Y Knight


On Apr 16, 2009, at 5:47 PM, Antoine Pitrou wrote:
IMHO, the question is rather what the use case is for the behaviour
you are proposing. In which kind of situation is it acceptable to turn
31/2 silently into 29/2?


In essentially any situation in which you'd actually want a "next
month" operation, it's acceptable to do that.


It's a human-interface operation, and as such, everyone (ahem) "knows  
what it means" to say "2 months from now", but the details don't  
usually have to be thought about too much. Of course when you have a  
computer program, you actually need to tell it what you really mean.


I do a fair amount of date calculating, and use two different kinds of  
"add-month":


Option 1)
Add n to the month number, truncate day number to fit the month you  
end up in.


Option 2)
As above, but with the additional caveat that if the original date is  
the last day of its month, the new day should also be the last day of  
the new month. That is:

April 30th + 1 month = May 31st, instead of May 30th.

They're both useful behaviors, in different circumstances.
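Both options can be sketched in a few lines on top of the datetime and calendar modules. The function names below are invented for this illustration and are not the API proposed in the issue.

```python
import calendar
from datetime import date


def _shift_month(year, month, n):
    """Return (year, month) moved by n months; n may be negative."""
    index = year * 12 + (month - 1) + n
    return index // 12, index % 12 + 1


def add_months_truncate(d, n):
    """Option 1: keep the day number, truncated to fit the target month."""
    year, month = _shift_month(d.year, d.month, n)
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def add_months_sticky_end(d, n):
    """Option 2: as option 1, but a month-end date stays a month-end date."""
    year, month = _shift_month(d.year, d.month, n)
    last_day = calendar.monthrange(year, month)[1]
    was_month_end = d.day == calendar.monthrange(d.year, d.month)[1]
    return date(year, month, last_day if was_month_end else min(d.day, last_day))


truncated = add_months_truncate(date(2009, 4, 30), 1)
sticky = add_months_sticky_end(date(2009, 4, 30), 1)
```

The two functions differ only when the starting date is the last day of its month, which is exactly the April 30th case above.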

James

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-23 Thread James Y Knight



On Apr 22, 2009, at 2:50 AM, Martin v. Löwis wrote:


I'm proposing the following PEP for inclusion into Python 3.1.
Please comment.


+1. Even if some people still want a low-level bytes API, it's
important that the easy case be easy. That is: the majority of Python
applications should *just work, damnit* even with
not-properly-encoded-in-current-LC_CTYPE filenames. It looks like this
proposal accomplishes that, and does so in a relatively nice fashion.


James

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-24 Thread James Y Knight


On Apr 24, 2009, at 8:00 AM, Paul Moore wrote:

However, it *does* agree with the reality of Windows file systems. The
fundamental problem here is that there is a strong OS disparity - for
Windows, the OS uses Unicode, for POSIX, the OS uses bytes.


It's unfortunately the case that this isn't *precisely* true. Windows
uses arbitrary 16-bit sequences, just as unix uses arbitrary 8-bit
sequences. Neither one is required by the operating system to be a
proper unicode encoding. The main difference is that there is already
a widely accepted way to decode an improperly-encoded 16-bit sequence
with the utf-16 codec: simply leave the lone surrogates in place.
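In current Python 3 that "leave them in place" behaviour can be requested with the surrogatepass error handler. A small sketch, assuming a Python version in which surrogatepass supports the utf-16 codecs; the byte string is a made-up example.

```python
# 'a', a lone high surrogate (U+D800), then 'b', as UTF-16-LE code units.
raw = b"a\x00" + b"\x00\xd8" + b"b\x00"

# Strict decoding rejects the lone surrogate.
try:
    raw.decode("utf-16-le")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# 'surrogatepass' leaves the lone surrogate in the resulting str.
text = raw.decode("utf-16-le", "surrogatepass")

# Encoding with the same handler restores the original 16-bit sequence.
round_tripped = text.encode("utf-16-le", "surrogatepass")
```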


James

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-24 Thread James Y Knight


On Apr 24, 2009, at 6:05 PM, Paul Moore wrote:

- Windows systems where broken Unicode (lone surrogates or whatever)
isn't involved
- Unix systems where the user's stated filesystem encoding is correct

Can you honestly say that this isn't the vast majority of real-world
environments? (IIRC, you are based in Japan, so it may well be true
that the likelihood of problems is a lot higher where you are than
where I am - the UK - but I suspect that averaging out, things are
generally as above).


In my experience, it is normal on most unix systems that some programs  
(mostly daemons) are running in default "POSIX" locale, others (most  
user programs) are running in the "en_US.utf-8" locale, and some  
luddite users have set themselves to "en_US.8859-1". All running on  
the same system.


James

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-27 Thread James Y Knight



On Apr 27, 2009, at 11:35 PM, Martin v. Löwis wrote:

No. You seem to assume that all bytes < 128 decode successfully
always. I believe this assumption is wrong, in general:

py> "\x1b$B' \x1b(B".decode("iso-2022-jp") #2.x syntax
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'iso2022_jp' codec can't decode bytes in position
3-4: illegal multibyte sequence

All bytes are below 128, yet it fails to decode.


Surely nobody uses iso2022 as an LC_CTYPE encoding. That's expressly  
forbidden by POSIX, if I'm not mistaken...and I can't see how it would  
work, considering that it uses all the bytes from 0x20-0x7f, including  
0x2f ("/"), to represent non-ascii characters.


Hopefully it can be assumed that your locale encoding really is a
non-overlapping superset of ASCII, as is required by POSIX...


I'm a bit scared at the prospect that U+DCAF could turn into "/", that  
just screams security vulnerability to me.  So I'd like to propose  
that only 0x80-0xFF <-> U+DC80-U+DCFF should ever be allowed to be  
encoded/decoded via the error handler.


James

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-28 Thread James Y Knight



On Apr 28, 2009, at 2:50 AM, Martin v. Löwis wrote:


James Y Knight wrote:

Hopefully it can be assumed that your locale encoding really is a
non-overlapping superset of ASCII, as is required by POSIX...


Can you please point to the part of the POSIX spec that says that
such overlapping is forbidden?


I can't find it...I would've thought it would be on this page:
http://opengroup.org/onlinepubs/007908775/xbd/charset.html
but it's not (at least, not obviously). That does say (effectively)  
that all encodings must be supersets of ASCII and use the same  
codepoints, though.


However, ISO-2022 being inappropriate for LC_CTYPE usage is the entire  
reason why EUC-JP was created, so I'm pretty sure that it is in fact  
inappropriate, and I cannot find any evidence of it ever being used on  
any system.


From http://en.wikipedia.org/wiki/EUC-JP:
"To get the EUC form of an ISO-2022 character, the most significant  
bit of each 7-bit byte of the original ISO 2022 codes is set (by  
adding 128 to each of these original 7-bit codes); this allows  
software to easily distinguish whether a particular byte in a  
character string belongs to the ISO-646 code or the ISO-2022 (EUC)  
code."


Also:
http://www.cl.cam.ac.uk/~mgk25/ucs/iso2022-wc.html


I'm a bit scared at the prospect that U+DCAF could turn into "/", that
just screams security vulnerability to me.  So I'd like to propose
that only 0x80-0xFF <-> U+DC80-U+DCFF should ever be allowed to be
encoded/decoded via the error handler.


It would be actually U+DC2f that would turn into /.


Yes, I meant to say DC2F, sorry for the confusion.


I'm happy to exclude that range from the mapping if POSIX really
requires an encoding not to be overlapping with ASCII.


I think it has to be excluded from mapping in order to not introduce  
security issues.
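For comparison, the handler that PEP 383 eventually shipped in Python 3 is named surrogateescape, and it applies this restriction: only U+DC80 to U+DCFF are mapped back to bytes. A short sketch against that stdlib handler; the sample filename bytes are invented.

```python
# Latin-1 bytes for "caf\xe9", which are not valid UTF-8.
name = b"caf\xe9"

# The undecodable byte 0xE9 becomes the lone surrogate U+DCE9.
text = name.decode("utf-8", "surrogateescape")
restored = text.encode("utf-8", "surrogateescape")

# U+DC2F would correspond to byte 0x2F ("/"); the handler refuses it,
# so an escaped character can never turn into a path separator.
try:
    "\udc2f".encode("utf-8", "surrogateescape")
    slash_rejected = False
except UnicodeEncodeError:
    slash_rejected = True
```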


However...

There's also SHIFT-JIS to worry about...which apparently some people  
actually want to use as their default encoding, despite it being  
broken to do so. RedHat apparently refuses to provide it as a locale  
charset (due to its brokenness), and it's also not available by  
default on my Debian system. People do unfortunately seem to actually  
use it in real life.


https://bugzilla.redhat.com/show_bug.cgi?id=136290

So, I'd like to propose this:
The "python-escape" error handler when given a non-decodable byte from
0x80 to 0xFF will produce values of U+DC80 to U+DCFF. When given a
non-decodable byte from 0x00 to 0x7F, it will be converted to
U+0000-U+007F. On the encoding side, values from U+DC80 to U+DCFF are
encoded into 0x80 to 0xFF, and all other characters are treated in
whatever way the encoding would normally treat them.


This proposal obviously works for all non-overlapping ASCII supersets,  
where 0x00 to 0x7F always decode to U+00 to U+7F. But it also works  
for Shift-JIS and other similar ASCII-supersets with overlaps in  
trailing bytes of a multibyte sequence. So, a sequence like  
"\x81\xFD".decode("shift-jis", "python-escape") will turn into  
u"\uDC81\u00fd". Which will then properly encode back into "\x81\xFD".


The character sets this *doesn't* work for are: ebcdic code pages  
(obviously completely unsuitable for a locale encoding on unix),  
iso2022-* (covered above), and shift-jisx0213 (because it has replaced  
\ with yen, and - with overline).


If it's desirable to work with shift_jisx0213, a modification of the
proposal can be made: Change the second sentence to: "When given a
non-decodable byte from 0x00 to 0x7F, that byte must be the second or
later byte in a multibyte sequence. In such a case, the error handler
will produce the encoding of that byte if it was standing alone (thus
in most encodings, \x00-\x7f turn into U+00-U+7F)."


It sounds from https://bugzilla.novell.com/show_bug.cgi?id=162501 like  
some people do actually use shift_jisx0213, unfortunately.


James

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-30 Thread James Y Knight


On Apr 30, 2009, at 5:42 AM, Martin v. Löwis wrote:

I think you are right. I have now excluded ASCII bytes from being
mapped, effectively not supporting any encodings that are not ASCII
compatible. Does that sound ok?


Yes. The practical upshot of this is that users who brokenly use  
"ja_JP.SJIS" as their locale (which, note, first requires editing some  
files in /var/lib/locales manually to enable its use..) may still have  
python not work with invalid-in-shift-jis filenames. Since that locale  
is widely recognized as a bad idea to use, and is not supported by any  
distros, it certainly doesn't bother me that it isn't 100% supported  
in python. It seems like the most common reason why people want to use  
SJIS is to make old pre-unicode apps work right in WINE -- in which  
case it doesn't actually affect unix python at all.


I'd personally be fine with python just declaring that the
filesystem-encoding will *always* be utf-8b and ignore the locale...but
I expect some other people might complain about that. Of course,
application authors can decide to do that themselves by calling
sys.setfilesystemencoding('utf-8b') at the start of their program.


James

Re: [Python-Dev] PEP 383 and GUI libraries

2009-05-01 Thread James Y Knight


On May 1, 2009, at 9:42 PM, Zooko O'Whielacronx wrote:

Yep, I reversed the order of encode() and decode().  However, my whole
statement was utterly wrong and shows that I still didn't fully get it
yet.  I have flip-flopped again and currently think that PEP 383 is
useless for this use case and that my original plan [1] is still the
way to go.  Please let me know if you spot a flaw in my plan or a
ridiculousity in my requirements, or if you see a way that PEP 383 can
help me.


If I were designing a new system such as this, I'd probably just go  
for utf8b *always*. That is, set the filesystem encoding to utf-8b.  
The end. All files always keep the same bytes transferring between  
unix systems. Thus, for the 99% of the world that uses either windows  
or a utf-8 locale, they get useful filenames inside tahoe. The other  
1% of the world that uses something like latin-1, EUC_JP, etc. on  
their local system sees mojibake filenames in tahoe, but will see the  
same filename that they put in when they take it back out.


GNOME has used only utf-8 for filename display for a few years now,
for example, so this isn't exactly an unheard-of position to take...


But if you don't do that, then, I still don't see what purpose your  
requirements serve. If I have two systems: one with a UTF-8 locale,  
and one with a Latin-1 locale, why should transmitting filenames from  
system 1 to system 2 through tahoe preserve the raw bytes, but doing  
the reverse *not* preserve the raw bytes? (all byte-sequences are  
valid in latin-1, remember, so they'll all decode into unicode without  
error, and then be reencoded in utf-8...). This seems rather a useless  
behavior to me.
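The asymmetry is easy to check: Latin-1 assigns a character to every byte value, so decoding never fails and no escaping ever kicks in. A sketch with an invented filename:

```python
# Every possible byte value decodes under Latin-1 without error.
all_bytes = bytes(range(256))
as_text = all_bytes.decode("latin-1")

# A filename created on the Latin-1 system...
latin1_name = b"caf\xe9"
text = latin1_name.decode("latin-1")

# ...comes out with different bytes once re-encoded for a UTF-8 system.
utf8_name = text.encode("utf-8")
```

So in the Latin-1 to UTF-8 direction the raw bytes are not preserved, even though nothing was undecodable along the way.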


James

Re: [Python-Dev] PEP 383 update: utf8b is now the error handler

2009-05-06 Thread James Y Knight


On May 6, 2009, at 5:39 AM, Stephen J. Turnbull wrote:

Now, with Python's file system encoding == UTF-8 or any packed EUC,
and more than a handful of Shift JIS or Big5 characters in file names,
one is *almost certain* to encounter ASCII as the second byte of a
multibyte sequence.  PEP 383 can't handle this


Hm, I haven't tried the implementation, but I thought that what would
happen is:
'\x85a'.decode('utf-8', 'utf8b/surrogate-replace/whateveritscalled') -> u'\uDC85a'


If that indeed doesn't happen, that's certainly a defect and should be  
remedied.
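With the handler name that Python 3 eventually settled on, surrogateescape, the expectation above can be written as the following sketch. The Shift JIS sample is an assumption added here to cover the "ASCII as the second byte" case.

```python
# The example from this message: an invalid lead byte followed by ASCII.
decoded = b"\x85a".decode("utf-8", "surrogateescape")
restored = decoded.encode("utf-8", "surrogateescape")

# A Shift JIS two-byte sequence (0x83 0x41) read as UTF-8: the first
# byte is escaped and the ASCII trail byte decodes as a plain "A".
sjis_bytes = b"\x83\x41"
sjis_as_utf8 = sjis_bytes.decode("utf-8", "surrogateescape")
sjis_restored = sjis_as_utf8.encode("utf-8", "surrogateescape")
```

Only the undecodable high byte is escaped. The ASCII byte passes through untouched, and the original byte string is recovered on encoding.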



, but it is sure to be
the most common use case for PEP 383 in East Asia.


Yes.

James

Re: [Python-Dev] PEP 384: Defining a Stable ABI

2009-05-17 Thread James Y Knight



On May 17, 2009, at 4:54 PM, Martin v. Löwis wrote:

Currently, each feature release introduces a new name for the
Python DLL on Windows, and may cause incompatibilities for extension
modules on Unix. This PEP proposes to define a stable set of API
functions which are guaranteed to be available for the lifetime
of Python 3, and which will also remain binary-compatible across
versions. Extension modules and applications embedding Python
can work with different feature releases as long as they restrict
themselves to this stable ABI.



It seems like a good ideal to strive for.

But I think this is too strong a promise. IMO it would be better to
say that ABI compatibility across releases is a goal. If someone does
make a change that breaks the ABI, I'd expect whoever is proposing it
to put forth a fairly strong argument for why it's a worthwhile
change. But it should be possible and allowed, given the right
circumstances. Because I think it's pretty much inevitable that it
*will* need to happen, sometime.


(of course there will need to be ABI tests, so that any potential ABI  
breakages are known about when they occur)


Python is much more defined by its source language than its C  
extension API, so tying the python major version number to the C ABI  
might not be the best idea from a "marketing" standpoint. (I can see  
it now..."Python 4.0 major new features: we changed the C method  
definition struct layout incompatibly" :)


James

Re: [Python-Dev] [unladen-swallow] PEP 384: Defining a Stable ABI

2009-05-20 Thread James Y Knight


On May 20, 2009, at 4:07 PM, Nick Coghlan wrote:

Forcing developers to choose between the speed of the INCREF/DECREF
macros and the proposed ABI compatibility mode for the benefit of an
as yet hypothetical GIL-less CPython API implementation seems more
like a way to kill adoption of the ABI compatibility mode rather than
a way to encourage the use of the IncRef/Decref functions.


Indeed, and if the promise of "no-ABI-breakages-till-4.0" is removed,  
this would be a non-issue. Keep Py_INCREF macros in the current ABI,  
and then break the ABI when someone wants to remove the GIL someday.  
That's certainly going to be a big enough change to justify changing  
the ABI.


James

Re: [Python-Dev] Migration strategy for new-style string formatting [Was: Binary Operator for New-Style String Formatting]

2009-06-22 Thread James Y Knight


On Jun 21, 2009, at 5:40 PM, Eric Smith wrote:
I've basically come to accept that %-formatting can never go away,  
unfortunately. There are too many places where %-formatting is used,  
for example in logging Formatters. %-formatting either has to exist  
or it has to be emulated.


It'd possibly be helpful if there were builtin objects which forced  
the format style to be either newstyle or oldstyle, independent of  
whether % or format was called on it.


E.g.
x = newstyle_formatstr("{} {} {}")
x % (1,2,3) == x.format(1,2,3) == "1 2 3"

and perhaps, for symmetry:
y = oldstyle_formatstr("%s %s %s")
y.format(1,2,3) == y % (1,2,3) == "1 2 3"

This allows the format string "style" decision to be made external
to the API actually calling the formatting function. Thus, it need not
matter as much whether the logging API uses % or .format() internally
-- that only affects the *default* behavior when a bare string is
passed in.
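A rough sketch of how such wrapper types might look as plain str subclasses. Neither class exists in Python; the names come from this message, and the argument handling is a simplifying assumption.

```python
class newstyle_formatstr(str):
    """Always formats with {}-style fields, even when % is used."""

    def __mod__(self, args):
        if isinstance(args, dict):
            return str.format(self, **args)
        if not isinstance(args, tuple):
            args = (args,)          # mirror "fmt % single_value"
        return str.format(self, *args)


class oldstyle_formatstr(str):
    """Always formats with %-style fields, even when .format() is used."""

    def format(self, *args, **kwargs):
        # Keyword arguments map onto %(name)s fields, positional
        # arguments onto %s fields.
        return str.__mod__(self, kwargs if kwargs else args)


x = newstyle_formatstr("{} {} {}")
y = oldstyle_formatstr("%s %s %s")
```

An API that calls `fmt % args` internally, as logging does, would then work unchanged with either style, depending only on which wrapper the caller passed in.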


This could allow for a controlled, staged migration towards the new
format string format, with a long deprecation period for users to
migrate:


1) introduce the above feature, and recommend in docs that people only  
ever use new-style format strings, wrapping the string in  
newstyle_formatstr() when necessary for passing to an API which uses %  
internally.
2) A long time later...deprecate str.__mod__; don't deprecate  
newstyle_formatstr.__mod__.
3) A while after that (maybe), remove str.__mod__ and replace all  
calls in Python to % (used as a formatting operator) with .format() so  
that the default is to use newstyle format strings for all APIs from  
then on.


James

Re: [Python-Dev] Remove site-packages?!? [was: [Distutils] PEP 376 - from pythonpkgmgr's point of view]

2009-07-21 Thread James Y Knight


On Jul 21, 2009, at 7:38 PM, David Lyon wrote:
When I go into python on ubuntu I see there is
/usr/local/pythonX.X/lib/site-packages and I'm wondering why the hubba
setuptools/distutils doesn't put packages there by default. That would
solve a lot of problems.

Just leave /usr/lib/pythonX.X//lib/site-packages to the O/S.


Uh guys, I'm not sure if anyone here noticed, but Debian and Ubuntu  
have switched to install their distribution-supplied python libraries  
into:

/usr/lib/pythonX.Y/dist-packages
and distutils by default will install into
/usr/local/lib/pythonX.Y/dist-packages

starting with python 2.6.

See:
http://lists.debian.org/debian-devel/2009/02/msg00431.html

Since that email says "Discussed this with Barry Warsaw and Martin v.  
Loewis", I'd assume this change would be more widely known in the  
distutils/python-dev community, but apparently not??


James

Re: [Python-Dev] Remove site-packages?!? [was: [Distutils] PEP 376 - from pythonpkgmgr's point of view]

2009-07-22 Thread James Y Knight



On Jul 22, 2009, at 4:49 AM, M.-A. Lemburg wrote:


Debian has a long history of doing this differently, so it's
not much of a surprise. They also apply such changes to
Python packages.

However, all of this is non-standard and will cause problems
with tools that rely on the standard site-packages/ location. Such
changes should be discouraged.


And yet, the change seems to have some strong reasoning, solves the  
problem discussed in this thread, and was apparently discussed and  
approved of by some core python developers before being implemented.  
It seems a bit foolish to me to thus just dismiss it as "evil debian  
being different"...


If anything it seems like it's a failure of the Python project to make  
easily deployable software, compounded with a failure of communication  
within the python community.


James

Re: [Python-Dev] command line attachable debugger

2009-07-24 Thread James Y Knight



On Jul 24, 2009, at 1:31 AM, Edward Peschko wrote:


all,

I was wondering if there was a command line python debugger that was
able to attach to an existing process. I'd very much like to be able
to debug over a ssh session using screen.

Ed

(ps - and yes, I know about winpdb, etc... that is not exactly what
I'm looking for..)


Winpdb is *exactly* what you asked for, so if it's not what you're  
looking for you'll need to be more specific about what you want that  
it doesn't do...


James

Re: [Python-Dev] Decorator syntax

2009-09-02 Thread James Y Knight


On Sep 2, 2009, at 6:15 AM, Rob Cliffe wrote:

So - the syntax restriction seems not only inconsistent, but  
pointless; it doesn't forbid anything, but merely means we have to  
do it in a slightly convoluted (unPythonesque) way.  So please,  
Guido, will you reconsider?


Indeed, it's a silly inconsistent restriction. When it was first added  
I too suggested that any expression be allowed after the @, rather  
than having a uniquely special restricted syntax. I argued from  
consistency of grammar standpoint. But Guido was not persuaded. Good  
luck to you. :)


Here's some of the more relevant messages from the thread back when  
the @decorator feature was first introduced:

http://mail.python.org/pipermail/python-dev/2004-August/046654.html
http://mail.python.org/pipermail/python-dev/2004-August/046659.html
http://mail.python.org/pipermail/python-dev/2004-August/046675.html
http://mail.python.org/pipermail/python-dev/2004-August/046711.html
http://mail.python.org/pipermail/python-dev/2004-August/046741.html
http://mail.python.org/pipermail/python-dev/2004-August/046753.html
http://mail.python.org/pipermail/python-dev/2004-August/046818.html

James

Re: [Python-Dev] Fuzziness in io module specs

2009-09-18 Thread James Y Knight



On Sep 18, 2009, at 3:55 PM, MRAB wrote:


I think that this should be an invariant:

   0 <= file pointer <= file size

so the file pointer might sometimes have to be moved.



As for the question of whether 'truncate' should be able to lengthen a
file, the method name suggests no; if the method name were 'resize', for
example, then maybe yes, zeroing the new bytes for security.



Why are you just making things up? There is a *vast* amount of  
precedent for how file operations should work. Python should follow  
that precedent and do like POSIX unless there's a compelling reason  
not to. Quoting:


   If fildes refers to a regular file, the ftruncate() function shall
   cause the size of the file to be truncated to length. If the size of
   the file previously exceeded length, the extra data shall no longer be
   available to reads on the file. If the file previously was smaller than
   this size, ftruncate() shall either increase the size of the file or
   fail. XSI-conformant systems shall increase the size of the file. If
   the file size is increased, the extended area shall appear as if it
   were zero-filled. The value of the seek pointer shall not be modified
   by a call to ftruncate().
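
A minimal sketch of the quoted semantics as seen from Python, assuming a current CPython 3 on a POSIX system (the temporary file and the variable names are only for illustration): truncate() with a size larger than the file extends it, and the file position stays where it was.

```python
import os
import tempfile

# Write 5 bytes, then ask truncate() for a size larger than the file.
with tempfile.TemporaryFile() as f:
    f.write(b"hello")
    position_before = f.tell()       # just past the bytes written
    new_size = f.truncate(10)        # request a larger size
    position_after = f.tell()        # file position is not moved
    size_on_disk = os.fstat(f.fileno()).st_size

print(position_before, position_after, new_size, size_on_disk)
```

Note that this contradicts the proposed invariant only in the other direction: shrinking a file below the current position leaves the position beyond the end of the file, which POSIX also permits.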


James

Re: [Python-Dev] POSIX [Fuzziness in io module specs]

2009-09-18 Thread James Y Knight


On Sep 18, 2009, at 8:58 PM, Antoine Pitrou wrote:

I'm not sure that's true. Various Unix/Linux man pages are readily
available on the Internet, but they regard specific implementations,
which often depart from the spec in one way or another. POSIX specs
themselves don't seem to be easily reachable; you might even have to pay
for them.



The POSIX specs are quite easily accessible, without payment.

I got my quote by doing:
man 3p ftruncate

I had previously done:
apt-get install manpages-posix-dev
to install the posix manpages. That package contains the POSIX
standard as of 2003, which is good enough for most uses. It seems to
be available here, if you don't have a debian system:

http://www.kernel.org/pub/linux/docs/man-pages/man-pages-posix/

There's also a webpage, containing the official POSIX 2008 standard:
   http://www.opengroup.org/onlinepubs/9699919799/

And to navigate to ftruncate from there, click "System Interfaces" in  
the left pane, "System Interfaces" in the bottom pane, and then  
"ftruncate" in the bottom pane.


James

Re: [Python-Dev] IO module precisions and exception hierarchy

2009-09-27 Thread James Y Knight


On Sep 27, 2009, at 4:20 AM, Pascal Chambon wrote:
Thus, at the moment IOErrors rather have the semantic of "particular  
case of OSError", and it's kind of confusing to have them remain in  
their own separate tree... Furthermore, OSErrors are often used  
where IOErrors would perfectly fit, eg. in low level I/O functions  
of the OS module.
Since OSErrors and IOErrors are slightly mixed up when we deal with  
IO operations, maybe the easiest way to make it clearer would be to  
push to their limits already existing designs.


How about just making IOError = OSError, and introducing your proposed  
subclasses? Does the usage of IOError vs OSError have *any* useful  
semantics?
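
For reference, a small check of how this looks in later Python versions, assuming Python 3.3 or newer, where IOError became an alias of OSError and finer-grained subclasses such as FileNotFoundError exist. The temporary directory and file name are only for illustration.

```python
import os
import tempfile

# In Python 3.3+ the two names refer to the same class.
same_class = IOError is OSError

# Opening a path that does not exist raises a specific OSError subclass.
missing_path = os.path.join(tempfile.mkdtemp(), "does-not-exist.txt")
try:
    open(missing_path)
except OSError as exc:
    caught = exc

print(same_class, type(caught).__name__, isinstance(caught, IOError))
```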


James

Re: [Python-Dev] PEP 3144 review.

2009-09-27 Thread James Y Knight



On Sep 27, 2009, at 3:18 PM, Peter Moody wrote:


administrators) would use it, but it's doable. what you're claiming is
that my use case is invalid.

that's what I claim is broken.


He's claiming your solution to address your use case is confusing, not  
that the use case is invalid.



I'm not going to make ipaddr
less useful (strictly removing functionality), more bulky and
confusing (adding more confusingly named classes and methods) or
otherwise break the library in a vain attempt to have it included in
the stdlib.


If I understand correctly, the proposal for addressing the issue is to  
make two rather simple changes:
1) if strict=False, mask off the bits described by the netmask when  
creating an IPNetwork, such that the host bits are always 0.

2) add a single new function:

def parse_net_and_addr(s):
  return (IPNetwork(s), IPAddress(s.split('/')[0]))
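
The two changes can be sketched with the standard-library ipaddress module that later grew out of this discussion. This is an illustration under that assumption, not the ipaddr API being reviewed here: ip_network(..., strict=False) masks off the host bits, and the helper keeps the original address separately.

```python
import ipaddress

def parse_net_and_addr(s):
    """Return (network, address) for a string like '192.168.1.5/24'."""
    # strict=False zeroes the host bits instead of raising ValueError.
    network = ipaddress.ip_network(s, strict=False)
    # The part before the slash is the host address itself.
    address = ipaddress.ip_address(s.split('/')[0])
    return network, address

net, addr = parse_net_and_addr("192.168.1.5/24")
print(net, addr)
```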

James

Re: [Python-Dev] please consider changing --enable-unicode default to ucs4

2009-09-28 Thread James Y Knight


On Sep 28, 2009, at 4:25 AM, M.-A. Lemburg wrote:

Distributions should really not be put in charge of upstream
coding design decisions.


I don't think you can blame distros for this one.

From PEP 0261:
It is also proposed that one day --enable-unicode will just
default to the width of your platforms wchar_t.

On linux, wchar_t is 4 bytes.

If there's a consensus amongst python upstream that all the distros  
should be shipping Python with UCS2 unicode strings, you should reach  
out to them and say this, in a rather more clear fashion. Currently,  
most signs point towards UCS4 builds as being the better option.


Or, one might reasonably wonder why UCS-4 is an option at all, if  
nobody should enable it.



People building their own Python version will usually also build
their own extensions, so I don't really believe that the above
scenario is very common.


I'd just like to note that I've run into this trap multiple times. I  
built a custom python, and expected it to work with all the existing,  
installed, extensions (same major version as the system install, just  
patched). And then had to build it again with UCS4, for it to actually  
work. Of course building twice isn't the end of the world, and I'm  
certainly used to having to twiddle build options on software to get  
it working, but, this *does* happen, and *is* a tiny bit irritating.
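
One way to find out which kind of build an interpreter is before compiling extensions against it is to look at sys.maxunicode. A minimal sketch; the helper name unicode_build is made up for this example. On Python 3.3 and later the distinction no longer exists and the check always reports the wide range.

```python
import sys

def unicode_build():
    """Report whether this interpreter stores code points beyond the BMP directly."""
    # Narrow (UCS2) builds report 0xFFFF; wide (UCS4) builds report 0x10FFFF.
    return "UCS4" if sys.maxunicode > 0xFFFF else "UCS2"

print(unicode_build(), hex(sys.maxunicode))
```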


James

Re: [Python-Dev] transitioning from % to {} formatting

2009-09-29 Thread James Y Knight

I'm resending a message I sent in June, since it seems the same thread  
has come up again, and I don't believe anybody actually responded  
(positively or negatively) to the suggestion back then.


http://mail.python.org/pipermail/python-dev/2009-June/090176.html

On Jun 21, 2009, at 5:40 PM, Eric Smith wrote:
I've basically come to accept that %-formatting can never go away,  
unfortunately. There are too many places where %-formatting is used,  
for example in logging Formatters. %-formatting either has to exist  
or it has to be emulated.


It'd possibly be helpful if there were builtin objects which forced  
the format style to be either newstyle or oldstyle, independent of  
whether % or format was called on it.


E.g.
x = newstyle_formatstr("{} {} {}")
x % (1,2,3) == x.format(1,2,3) == "1 2 3"

and perhaps, for symmetry:
y = oldstyle_formatstr("%s %s %s")
y.format(1,2,3) == y % (1,2,3) == "1 2 3"

This allows the format string "style" decision to be made external
to the API actually calling the formatting function. Thus, it need not
matter as much whether the logging API uses % or .format() internally
-- that only affects the *default* behavior when a bare string is
passed in.
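
A rough sketch of what such wrapper objects could look like. Nothing like this exists in the stdlib; the class bodies and the legacy_api function are assumptions made up to illustrate the idea, and a real version would need more care with dict and single-value arguments.

```python
class newstyle_formatstr:
    """Hypothetical wrapper: always formats with str.format()."""
    def __init__(self, template):
        self.template = template

    def format(self, *args, **kwargs):
        return self.template.format(*args, **kwargs)

    def __mod__(self, args):
        # Mimic the argument conventions of the % operator.
        if isinstance(args, dict):
            return self.template.format(**args)
        if not isinstance(args, tuple):
            args = (args,)
        return self.template.format(*args)


class oldstyle_formatstr:
    """Hypothetical wrapper: always formats with the % operator."""
    def __init__(self, template):
        self.template = template

    def format(self, *args, **kwargs):
        return self.template % (kwargs if kwargs else args)

    def __mod__(self, args):
        return self.template % args


def legacy_api(fmt, args):
    # Stands in for any existing API that applies % internally.
    return fmt % args

x = newstyle_formatstr("{} {} {}")
y = oldstyle_formatstr("%s %s %s")
print(legacy_api(x, (1, 2, 3)), y.format(1, 2, 3))
```

The caller picks the style by choosing the wrapper; legacy_api is unchanged and keeps working with bare %-style strings.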


This could allow for a controlled switch towards the new format string  
format, with a long deprecation period for users to migrate:


1) introduce the above feature, and recommend in docs that people only  
ever use new-style format strings, wrapping the string in  
newstyle_formatstr() when necessary for passing to an API which uses %  
internally.
2) A long time later...deprecate str.__mod__; don't deprecate  
newstyle_formatstr.__mod__.
3) A while after that (maybe), remove str.__mod__ and replace all  
calls in Python to % (used as a formatting operator) with .format() so  
that the default is to use newstyle format strings for all APIs from  
then on.



Re: [Python-Dev] transitioning from % to {} formatting

2009-09-30 Thread James Y Knight



On Sep 30, 2009, at 10:34 AM, Steven D'Aprano wrote:

E.g.
x = newstyle_formatstr("{} {} {}")
x % (1,2,3) == x.format(1,2,3) == "1 2 3"


Moving along, let's suppose the newstyle_formatstr is introduced. What's
the intention then? Do we go through the std lib and replace every call
to (say)
   somestring % args
with
   newstyle_formatstr(somestring) % args
instead? That seems terribly pointless to me


Indeed, that *would* be terribly pointless! Actually, more than  
pointless, it would be broken, as you've changed the API from taking  
oldstyle format strings to newstyle format strings.


That is not the suggestion. The intention is to change /nearly  
nothing/ in the std lib, and yet allow users to use newstyle string  
substitution with every API.


Many Python APIs (e.g. logging) currently take a %-type formatting  
string. It cannot simply be changed to take a {}-type format string,  
because of backwards compatibility concerns. Either a new API can be  
added to every one of those functions/classes, or, a single API can be  
added to inform those places to use newstyle format strings.



This could allow for a controlled switch towards the new format
string format, with a long deprecation period for users to migrate:

1) introduce the above feature, and recommend in docs that people
only ever use new-style format strings, wrapping the string in
newstyle_formatstr() when necessary for passing to an API which uses
% internally.


And how are people supposed to know what the API uses internally?


It's documented, (as it already must be, today!).


Personally, I think your chances of getting people to write:
logging.Formatter(newstyle_formatstr("%(asctime)s - %(name)s - %(level)s - %(msg)s"))

instead of
logging.Formatter("%(asctime)s - %(name)s - %(level)s - %(msg)s")


That's not my proposal.

The user could write either:
logging.Formatter("%(asctime)s - %(name)s - %(level)s - %(msg)s")
(as always -- that can't be changed without a long deprecation  
period), or:
logging.Formatter(newstyle_formatstr("{asctime} - {name} - {level} - {msg}"))


This despite the fact that logging has not been changed to use {}-style
formatting internally. It should continue to call "self._fmt %
record.__dict__" for backward compatibility.
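
For comparison, Python 3.2 and later let logging.Formatter accept brace-style strings through a style argument rather than a wrapper object. The sketch below uses that real parameter; the logger name and message are invented, and note that the actual record attributes are levelname and message rather than the informal {level} and {msg} used above.

```python
import logging

# style="{" tells the Formatter to apply str.format-style substitution.
formatter = logging.Formatter("{name} - {levelname} - {message}", style="{")

# Build a record by hand so the example does not depend on handler setup.
record = logging.LogRecord(
    "demo", logging.WARNING, "example.py", 1, "disk at %d%%", (93,), None
)
formatted = formatter.format(record)
print(formatted)
```

The message itself ("disk at %d%%") is still merged with its arguments using %, which is the kind of default behaviour the thread is concerned with.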


That's not to say that this proposal would mean no work is needed to
check the stdlib for issues. The Logging module presents one: it
checks if the format string contains "%(asctime)" to see if it should
bother to calculate the time. That of course would need to be changed.
Best would be to stick an instance which lazily generates its string  
representation into the dict. The other APIs mentioned on this thread  
(BaseHTTPServer, email.generator) will work immediately without  
changes, however.
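
The lazy-instance idea could look roughly like this. It is a sketch only: the LazyAsctime class and its rendered flag are invented for illustration and are not part of logging. The timestamp is formatted only if the format string actually refers to it, whichever formatting style is in use.

```python
import time

class LazyAsctime:
    """Hypothetical stand-in that formats the time only when converted to str."""
    def __init__(self, created):
        self.created = created
        self.rendered = False   # records whether the work was ever done

    def __str__(self):
        self.rendered = True
        return time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(self.created))

stamp = LazyAsctime(time.time())
fields = {"asctime": stamp, "msg": "buildbots are down"}

without_time = "%(msg)s" % fields            # asctime is never looked up
rendered_after_first = stamp.rendered
with_time = "%(asctime)s - %(msg)s" % fields  # now __str__ runs
rendered_after_second = stamp.rendered
print(without_time, rendered_after_first, rendered_after_second)
```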


James

Re: [Python-Dev] transitioning from % to {} formatting

2009-10-01 Thread James Y Knight



On Oct 1, 2009, at 9:11 AM, Paul Moore wrote:

This seems to me to be almost the same as the previous suggestion of
having a string subclass:

class BraceFormatter(str):
   def __mod__(self, other):
       # Needs more magic here to cope with dict argument
       return self.format(*other)

__ = BraceFormatter

logger.debug(__("The {0} is {1}"), "answer", 42)



I'd rather make that:

class BraceFormatter:
    def __init__(self, s):
        self.s = s
    def __mod__(self, other):
        # Needs more magic here to cope with dict argument
        return self.s.format(*other)

__ = BraceFormatter

That is, *not* a string subclass. Then if someone attempts to mangle  
it, or use it for anything but %, it fails loudly.
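
To illustrate the "fails loudly" point, here is a self-contained sketch; the dict and single-value handling is my guess at the "more magic" mentioned in the comment. Using % works, while treating the object as a string raises TypeError instead of silently producing a plain str.

```python
class BraceFormatter:
    """Wraps a {}-style template; deliberately not a str subclass."""
    def __init__(self, s):
        self.s = s

    def __mod__(self, other):
        if isinstance(other, dict):
            return self.s.format(**other)
        if not isinstance(other, tuple):
            other = (other,)
        return self.s.format(*other)

fmt = BraceFormatter("The {0} is {1}")
formatted = fmt % ("answer", 42)

try:
    fmt + "!"                # string concatenation is not supported
    failed_loudly = False
except TypeError:
    failed_loudly = True

print(formatted, failed_loudly)
```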


James

Re: [Python-Dev] transitioning from % to {} formatting

2009-10-01 Thread James Y Knight


On Sep 30, 2009, at 1:01 PM, Antoine Pitrou wrote:
Why not allow logging.Formatter to take a callable, which would in
turn call the callable with keyword arguments?

Therefore, you could write:
  logging.Formatter("{asctime} - {name} - {level} - {msg}".format)

and then:
  logging.critical(name="Python", msg="Buildbots are down")

All this without having to learn about a separate "compatibility  
wrapper object".


It's a nice idea -- but I think it's better for the wrapper (whatever  
form it takes) to support __mod__ so that logging.Formatter (and  
everything else) doesn't need to be modified to be able to know about  
how to use both callables and "%"ables.


Is it possible for a C function like str.format to have other methods  
defined on its function type?


James

Re: [Python-Dev] transitioning from % to {} formatting

2009-10-01 Thread James Y Knight


On Oct 1, 2009, at 5:54 PM, Nick Coghlan wrote:
I believe classes like fmt_braces/fmt_dollar/fmt_percent will be part of
a solution, but they aren't a complete solution on their own. (Naming
the three major string formatting techniques by the key symbols involved
is a really good idea though)

1. It's easy to inadvertently convert them back to normal strings. If a
formatting API even calls "str" on the format string then we end up with
a problem (and switching to containment instead of inheritance doesn't
really help, since all objects implement __str__).


Using containment instead of inheritance makes sure none of the  
*other* operations people do on strings will appear to work, at least  
(substring, contains, etc). I bet explicitly calling str() on a format  
string is even more rare than attempting to do those things.


2. They don't help with APIs that expect a percent-formatted string and
do more with it than just pass it to str.__mod__ (e.g. inspecting it for
particular values such as '%(asctime)s')


True, but I don't think there are many such cases in the first place,
and such places can be fixed to not do that as they're found.


Until they are fixed, fmt_braces will loudly fail when used with that  
API (assuming fmt_braces is not a subclass of str).


James

Re: [Python-Dev] transitioning from % to {} formatting

2009-10-01 Thread James Y Knight


On Oct 1, 2009, at 6:19 PM, Steven Bethard wrote:

I see how this could allow a user to supply a {}-format string to an
API that accepts only %-format strings. But I still don't see the
transition strategy for the API itself. That is, how does the %-format
API use this to eventually switch to {}-format strings? Could someone
please lay it out for me, step by step, showing what happens in each
version?



Here's what I said in my first message, suggesting this change.  
Copy&pasted below:


I wrote:
1) introduce the above feature, and recommend in docs that people  
only ever use new-style format strings, wrapping the string in  
newstyle_formatstr() when necessary for passing to an API which uses  
% internally.
2) A long time later...deprecate str.__mod__; don't deprecate  
newstyle_formatstr.__mod__.
3) A while after that (maybe), remove str.__mod__ and replace all  
calls in Python to % (used as a formatting operator) with .format()  
so that the default is to use newstyle format strings for all APIs  
from then on.


So do (1) in 3.2. Then do (2) in 3.4, and (3) in 3.6. I skipped two  
versions each time because of how widely this API is used, and the  
likely pain that doing the transition quickly would cause. But I guess  
you *could* do it in one version each step.


James

Re: [Python-Dev] transitioning from % to {} formatting

2009-10-02 Thread James Y Knight


On Oct 2, 2009, at 2:56 PM, Raymond Hettinger wrote:

Do the users get any say in this?
I imagine that some people are heavily invested in %-formatting.

Because there has been limited uptake on {}-formatting (afaict),
we still have limited experience with knowing that it is actually
better, less error-prone, easier to learn/remember, etc.   Outside
a handful of people on this list, I have yet to see anyone adopt
it as the preferred syntax.


Well, I actually think it was a pretty bad idea to introduce {}  
formatting, because %-formatting is well-known in many other  
languages, and $-formatting is used by basically all the rest. So the  
introduction of {}-formatting has always seemed silly to me, and I  
wish it had not happened.


HOWEVER, much worse than having a new, different, and strange  
formatting convention is having *multiple* formatting conventions  
arbitrarily used in different places within the language, with no  
rhyme or reason.


So, given that brace-formatting was added, and that it's been declared  
the way forward, I'd *greatly* prefer it taking over everywhere in  
python, instead of having to use a mixture.


James

Re: [Python-Dev] Package install failures in 2.6.3

2009-10-05 Thread James Y Knight


On Oct 5, 2009, at 2:21 PM, Brett Cannon wrote:
I should also mention this bug was not unknown. I discovered it  
after Distribute 0.6 was released as I always run cutting edge  
interpreters. Never bothered to report it until Distribute 0.6.1 was  
released which Tarek fixed in less than a week. I never bothered to  
report it for setuptools as I know it isn't maintained.


It's probably in our best interest to just get people over to  
Distribute, let it continue to hijack setuptools, and slowly let  
that name fade out if it is going to continue to be unmaintained. I  
have to admit I find it really disheartening that we are letting an  
unmaintained project dictate how we fix a bug. I really hope this is  
a one-time deal and from this point forward we all move the  
community towards Distribute so we never feel pressured like this  
again.


Even though the bug was noticed, nobody thought that, just perhaps,
breaking other software in a minor point release might be a bad idea,
no matter whether it was updated in less-than-a-week, or
mostly-unmaintained?


Once you have an API that you encourage people to subclass, *of  
course* it dictates how you can fix a bug.


James

Re: [Python-Dev] Bug 7183 and Python 2.6.4

2009-10-22 Thread James Y Knight

On Oct 22, 2009, at 11:04 AM, Barry Warsaw wrote:

On Oct 22, 2009, at 10:47 AM, Benjamin Peterson wrote:

2009/10/22 Barry Warsaw :
So does anybody else think bug 7183 should be a release blocker for 2.6.4
final, or is even a legitimate bug that we need to fix?

I think it cannot hold up a release without a reproducible code snippet.

It may not be reproducible in standard Python, see David's follow up  
to the issue.  If that holds true and we can't reproduce it, I agree  
we should not hold up the release for this.

>>> class Foo(property):
...  __slots__=[]
...
>>> x=Foo()
>>> x.__doc__ = "asdf"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'Foo' object attribute '__doc__' is read-only

You can't add arbitrary attributes to instances, since some instances  
don't have the slot to put them in.

Is that an equivalent demonstration to that which boost runs into?  
(except, it's using a C type not a python type).

James

Re: [Python-Dev] Bug 7183 and Python 2.6.4

2009-10-22 Thread James Y Knight



On Oct 22, 2009, at 3:53 PM, Robert Collins wrote:


On Thu, 2009-10-22 at 13:16 -0400, Tres Seaver wrote:
...

That being said, I can't see this bug as a release blocker:  people can
either upgrade to super-current Boost, or stick with 2.6.2 until they can.


That's the challenge Ubuntu faces:
https://bugs.edge.launchpad.net/ubuntu/+source/boost1.38/+bug/457688

We've just announced our Karmic RC, boost 1.40 isn't released, and
python 2.6.3 doesn't work with a released boost :(


If I were running a Linux distro, I'd revert the patch in 2.6.3.

And if I were running a Python release process, I'd revert that patch  
for python 2.6.4, and reopen the bug that it fixed, so a less-breaky  
patch can be made.


James


Re: [Python-Dev] Retrieve an arbitrary element from a set without removing it

2009-10-25 Thread James Y Knight



On Oct 25, 2009, at 2:50 AM, Terry Reedy wrote:


Alex Martelli wrote:

next(s) would seem good...


That does not work. It has to be next(iter(s)), and that has been  
tried and eliminated because it is significantly slower.


But who cares about the speed of getting an arbitrary element from a  
set? How can it *possibly* be a problem in a real program?


If you want to optimize python, this operation is certainly not the  
right place to start...
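
For readers who just want the idiom being argued about, a short sketch (the set contents are arbitrary): next(iter(s)) returns some element without modifying the set, and pop-then-add is the other commonly mentioned spelling.

```python
s = {"spam", "eggs", "ham"}

# Peek at an arbitrary element; the set is left untouched.
element = next(iter(s))

# Alternative: remove an element and put it straight back.
popped = s.pop()
s.add(popped)

print(element in s, popped in s, len(s))
```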


James

Re: [Python-Dev] 2.7 Release? 2.7 == last of the 2.x line?

2009-11-02 Thread James Y Knight



On Nov 2, 2009, at 6:24 PM, sstein...@gmail.com wrote:


+1 on 2.7 being the last of the 2.x series.  Enough already!


-1. (not that it matters)

I, personally, haven't even written my first line of 3.x code, nor  
have I had any good reason to.


Me neither.

If I saw the actual end of the line at 2.7, I would actually start  
looking for 3.x versions of my favorite tools and would be much more  
inclined to help push them along ASAP.


I'd probably keep using 2.7 to be able to keep using those tools,  
instead.


Right now, so much that I use on a daily basis doesn't even have a  
3.x roadmap, much less any sort of working implementation, that I  
don't see switching to 3.x ever unless the 2.x line ends, and soon!



I don't see switching to 3.x anytime soon either. But what's the rush?

2.x seems to be a fine edition of Python, why not let it keep going to  
2.8 and beyond? Then you wouldn't have to switch to 3.x at all, and  
that'd save you a ton of work. (and save all the people you will have  
to convince to make a 3.x roadmap and do the port a ton of work too!)


It really sounds like you're saying that switching to 3.x isn't worth  
the cost to you, but you want to force people (including yourself) to  
do so anyways, because ...?


James

Re: [Python-Dev] 2.7 Release? 2.7 == last of the 2.x line?

2009-11-02 Thread James Y Knight



On Nov 3, 2009, at 12:06 AM, Guido van Rossum wrote:

Though I imagine what
it really needs is a "quirks mode" parser that is compatible with the
HTML dialect accepted by, say, IE6. Maybe a summer of code project?


Already exists: html5lib.
http://code.google.com/p/html5lib/

Or if you want a faster (yet I think less exact) HTML parser,  
libxml2's HTML parser, via lxml:

http://codespeak.net/lxml/parsing.html#parsing-html

James

Re: [Python-Dev] 2.7 Release? 2.7 == last of the 2.x line?

2009-11-03 Thread James Y Knight


On Nov 3, 2009, at 8:55 AM, sstein...@gmail.com wrote:
And, as you point out, if 3.x doesn't start getting the crap beat  
out of it in the real world sooner rather than later, we may find  
ourselves, collectively with a stale 2.x, an under battle-tested  
3.x, and nowhere to go.


If that happens, it's not true that there's *nowhere* to go. A  
solution would be to discard 3.x as a failed experiment, take  
everything that is useful from it and port it to 2.x, and simply  
continue development from the last 2.x release. And from there,  
features can be deprecated and then removed a few releases later, as  
is the usual policy.


Been there, done that, on a couple other projects. It's unfortunate  
when you have to throw out work you've done because it failed to gain  
traction over the thing you tried to replace, but sometimes that's life.


James

Re: [Python-Dev] Retrieve an arbitrary element from a setwithoutremoving it

2009-11-05 Thread James Y Knight


On Nov 5, 2009, at 6:04 PM, geremy condra wrote:

Perhaps my test is flawed in some way?


Yes: you're testing the speed of something that makes absolutely no  
sense to do in a tight loop, so *who the heck cares how fast any way  
of doing it is*!


Is this thread over yet?

James

Re: [Python-Dev] PyPI comments and ratings, really?

2009-11-12 Thread James Y Knight



On Nov 12, 2009, at 4:11 PM, Ben Finney wrote:
I think Jesse's point (or, if he's not willing to claim it, my point) is
that, compared to the mandatory comment system, it makes much *more*
sense to have a mandatory field for “URL to the BTS for this project”.


One might look at the "competition" for inspiration. Looking at CPAN,
there's no "comments" feature, but there is a "CPAN RT" bug-tracker
which appears to be a way for users to submit comments/problems about  
packages in a way common to all packages in CPAN, but distinct from  
upstream's bug trackers/lists/etc. I'd assume that gets emailed to the  
listed maintainer of the package as well as being accessible to other  
users, although I don't really have any idea.


e.g.
http://search.cpan.org/~capttofu/DBD-mysql/lib/DBD/mysql.pm

There might be something to be said for providing users a way to
provide feedback that doesn't require making accounts in a bazillion
separate bugtrackers.


*shrug*

James

Re: [Python-Dev] PyPI comments and ratings, really?

2009-11-12 Thread James Y Knight



On Nov 12, 2009, at 5:23 PM, Masklinn wrote:


On 12 Nov 2009, at 22:53 , James Y Knight wrote:

On Nov 12, 2009, at 4:11 PM, Ben Finney wrote:
I think Jesse's point (or, if he's not willing to claim it, my point) is
that, compared to the mandatory comment system, it makes much *more*
sense to have a mandatory field for “URL to the BTS for this project”.


One might look at the "competition" for inspiration. Looking at
CPAN. There's no "comments" feature

There is, on search.cpan.org. See http://search.cpan.org/~petdance/ack/
for instance, the link leads to http://cpanratings.perl.org/ (a
pretty interesting example of the "distributed" nature of cpan in
fact).


Ah, I see. I totally managed to miss that...I guess that's an  
interesting example of a bad web ui. :)


James


Re: [Python-Dev] Drop support for ones' complement machines?

2009-12-01 Thread James Y Knight

On Dec 1, 2009, at 11:08 AM, Martin v. Löwis wrote:

>>> I'd rather prefer to explicitly list what CPython assumes about the
>>> outcome of specific operations. If this is just about &, |, ^, and ~,
>>> then its fine with me.
>> 
>> I'm not even interested in going this far:
> 
> I still am: with your list of assumptions, it is unclear (to me, at
> least) what the consequences are. So I'd rather see an explicit list
> of consequences, instead of buying a pig in a poke.

I think all that needs to be defined is that conversions from unsigned to
signed, and (negative) signed to unsigned integers, have 2's complement wrapping
semantics, and do not affect the bit pattern in memory.

Stating it that way makes it clearer that all you're assuming is the operation 
of the cast operators, and it seems to me that it implies the other 
requirements.
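
To make that concrete, here is a minimal Python sketch of the cast semantics described above, assuming a 32-bit width. The helper names (to_unsigned, to_signed, same_bit_pattern) are invented for illustration; they are not CPython internals.

```python
import struct

BITS = 32                      # assumed width, purely for illustration
MASK = (1 << BITS) - 1


def to_unsigned(value):
    """Signed -> unsigned with 2's complement wrapping (like a C cast)."""
    return value & MASK


def to_signed(value):
    """Unsigned -> signed with 2's complement wrapping (like a C cast)."""
    value &= MASK
    # If the top bit is set, the value represents a negative number.
    return value - (1 << BITS) if value >> (BITS - 1) else value


def same_bit_pattern(signed_value):
    """Check that the conversion leaves the bytes in memory unchanged."""
    as_signed = struct.pack("<i", signed_value)
    as_unsigned = struct.pack("<I", to_unsigned(signed_value))
    return as_signed == as_unsigned
```

With these definitions, to_signed(0xFFFFFFFF) gives -1 and to_unsigned(-1) gives 0xFFFFFFFF, and the packed bytes are identical either way.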

James

Re: [Python-Dev] GIL required for _all_ Python calls?

2010-01-07 Thread James Y Knight


On Jan 7, 2010, at 3:27 PM, Martin v. Löwis wrote:


I've been wondering whether it's possible to release the GIL in the
regex engine during matching.


I don't think that's possible. The regex engine can also operate on
objects whose representation may move in memory when you don't hold
the GIL (e.g. buffers that get mutated). Even if they stay in place -
if their contents change, regex results may be confusing.


It seems probably worthwhile to optimize for the common case of using  
the regexp engine on an immutable object of type "str" or "bytes", and  
allow releasing the GIL in *that* case, even if you have to keep it  
for the general case.


James

Re: [Python-Dev] Improve open() to support reading file starting with an unicode BOM

2010-01-08 Thread James Y Knight


On Jan 8, 2010, at 4:14 PM, Tres Seaver wrote:

I understood this proposal as a general processing guideline, not
something the io library should do (but, say, a text editor).

FWIW, I'm personally in favor of using the UTF-8 signature. If people
consider them crazy talk, that may be because UTF-8 can't possibly have
a byte order - hence I call it a signature, not the BOM. As a signature,
I don't consider it crazy at all. There is a long tradition of having
magic bytes in files (executable files, Postscript, PDF, ... - see
/etc/magic). Having a magic byte sequence for plain text to denote the
encoding is useful and helps reduce mojibake. This is the reason it's
used on Windows: notepad would normally assume that text is in the ANSI
code page, and for compatibility, it can't stop doing that. So the UTF-8
signature gives them an exit strategy.


Agreed.  Having that marker at the start of the file makes interop with
other tools *much* easier.


Putting the BOM at the beginning of UTF-8 text files is not a good  
idea: it makes interop much *worse* on a unix system, not better.  
Without the BOM, most commands do the right thing with UTF-8 text.  
E.g. to concatenate two files:


$ cat file-1 file-2 > file-3

With a BOM at the beginning of the file, it won't work right. Of  
course, you could modify "cat" (and every other stream processing  
command) to know how to consume and emit BOMs, and omit the extra one  
that would show up in the middle of the stream...but even that can't  
work; what about:

$ (cat file-1; cat file-2) > file-3

Should the shell now know that when you run multiple commands, it  
should eat the BOM emitted from the second command?


Basically, using a BOM in a utf-8 file is just not a good idea: it  
completely ruins interop with every standard unix tool.


This is not to say that Python shouldn't have a way to read a file  
with a UTF-8 BOM: it just shouldn't encourage you to *write* such files.
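
A small Python sketch of the concatenation problem, using in-memory bytes in place of real files (the contents are made up). It relies only on the standard codecs.BOM_UTF8 constant and the "utf-8-sig" codec, which strips a BOM at the start of the data only.

```python
import codecs

# Two "files" that each start with the UTF-8 signature.
file_1 = codecs.BOM_UTF8 + "hello\n".encode("utf-8")
file_2 = codecs.BOM_UTF8 + "world\n".encode("utf-8")

# Byte-for-byte what `cat file-1 file-2 > file-3` would produce.
file_3 = file_1 + file_2

# Reading tolerantly removes the leading signature...
decoded = file_3.decode("utf-8-sig")

# ...but the second one is still there, in the middle of the text.
stray_marks = decoded.count("\ufeff")
print(repr(decoded), stray_marks)
```

So reading with "utf-8-sig" is harmless, while writing the signature leaves a stray U+FEFF behind as soon as anything concatenates the output.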


James

Re: [Python-Dev] Mailing List archive corruption?

2010-01-19 Thread James Y Knight



On Jan 19, 2010, at 11:07 AM, Barry Warsaw wrote:


On Jan 19, 2010, at 03:50 PM, Vinay Sajip wrote:

When I look at the mailing list archive for python-dev, I see some odd
stuff at the bottom of the page:

http://mail.python.org/pipermail/python-dev/2010-January/thread.html#95232

Anyone know what's happened?


WTF?  I think the archives were recently regenerated, so there's probably
a fubar there.  CC'ing the postmasters.


That happens if messages had unescaped "From" lines in the middle of  
them.


No doubt, you've now broken every link anyone had ever made into the  
python-dev archives, because now all the article numbers are  
different. BTDT...unfortunately... Pipermail really is quite crappy,  
sigh.


Anyhow, when I did that, I went back to a backup to get the original  
article numbers, and edited the mbox file escaping From lines or  
adding additional empty messages until the newly regenerated article  
numbers matched the originals. I'd highly recommend going through that  
painful process, since I suspect a *lot* of people have links to the  
python-dev archive. Hope you have a backup (or can find caches on  
google or archive.org or something).
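
For anyone wondering why an unescaped "From" line shifts the numbering, here is a rough Python sketch. It assumes a naive splitter that starts a new message at every line beginning with "From "; the real Pipermail rules may be stricter. The helper names and the sender address are made up.

```python
SEPARATOR = "From sender@example.invalid Tue Jan 19 11:07:00 2010\n"


def escape_from_lines(body):
    """Prefix body lines that start with 'From ' with '>' (mbox-style quoting)."""
    lines = body.split("\n")
    return "\n".join(">" + line if line.startswith("From ") else line
                     for line in lines)


def make_mbox(bodies, escape):
    """Build mbox-like text, optionally escaping 'From ' lines in bodies."""
    parts = []
    for body in bodies:
        if escape:
            body = escape_from_lines(body)
        parts.append(SEPARATOR + "Subject: test\n\n" + body + "\n")
    return "\n".join(parts)


def count_messages(mbox_text):
    """Naive message count: every line starting with 'From ' opens a message."""
    return sum(1 for line in mbox_text.split("\n") if line.startswith("From "))


bodies = ["Hello there.", "See below.\nFrom now on, escape this line."]
broken = count_messages(make_mbox(bodies, escape=False))
fixed = count_messages(make_mbox(bodies, escape=True))
print(broken, fixed)
```

Two real messages come out as three in the unescaped case, so every article after the phantom one gets a different number on regeneration.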


James

Re: [Python-Dev] Fwd: broken mailing list links in PEP(s?)

2010-05-05 Thread James Y Knight



On May 5, 2010, at 8:22 AM, Barry Warsaw wrote:


On May 5, 2010, at 7:09 AM, Oleg Broytman wrote:


On Wed, May 05, 2010 at 11:43:45AM +0100, Michael Foord wrote:

http://mail.python.org/pipermail/python-list/2000-July/108893.html

which are broken


 Pipermail's links aren't stable AFAIU. The numbering is changing over
time.


They're only unstable if you regenerate the archives and the mbox  
file is old enough to have been a victim of a long-fixed delimiter  
bug.  Which is true for python-dev.


And of course if you're paying attention, you can fix the mbox file  
(quoting "From" etc) such that it generates the same numbers as it did  
the first time.


James

Re: [Python-Dev] HEADS UP: Compilation risk with new GCC 4.5.0

2010-05-12 Thread James Y Knight



On May 12, 2010, at 9:13 AM, Jesus Cea wrote:
Short history: the new GCC 4.5.0 (released a month ago), when compiling with
-O3, adds MMX/SSE instructions that require the stack to be aligned to 16
bytes. This is wrong, since the x86 ABI only requires the stack to be aligned
to 4 bytes.

If you compile EVERYTHING with GCC 4.5.0, you are safe (I guess!), but
if your environment has mixed compiled code (for instance, the OS
libraries), you can possibly "core dump". If you have an old compiled
Python and you update libs compiled with GCC 4.5.0, you can crash in the
process.

Psyco is showing the issue, but it is not the culprit.  It only leaves
-correctly- the stack without 16-byte alignment. But there are plenty of
examples of crashes not related to python+psyco.

Proposal: add "-fno-tree-vectorize" to compilation options for 2.7/3.2.
Warn 2.3/2.4/2.5/2.6/3.0/3.1 users. Or warn users compiling with GCC 4.5.0.



While assuming the stack is 16byte aligned is undeniably an ABI-violation
in GCC, at this point, it's surely simpler to just go along:
the new unofficial ABI for x86 is that the stack must always be left  
in 16-byte alignment...


So, just change psyco to always use 16-byte-aligned stackframes. GCC  
has used 16byte-aligned stackframes for a heck of a long time now (so  
if the stack starts 16byte aligned on entry to a function it will stay  
that way on calls). So usually the only way people run into unaligned  
stacks is via hand-written assembly code or JIT compilers.


I think you'll be a lot happier just modifying Psyco than making  
everyone else in the world change their compiler flags.


James

Re: [Python-Dev] HEADS UP: Compilation risk with new GCC 4.5.0

2010-05-12 Thread James Y Knight



On May 12, 2010, at 10:01 AM, Jesus Cea wrote:

On 12/05/10 15:39, James Y Knight wrote:

While assuming the stack is 16byte aligned is undeniably an
ABI-violation in GCC, at this point, it's surely simpler to just go
along: the new unofficial ABI for x86 is that the stack must always be
left in 16-byte alignment...


You can not rule out other software embedding python inside, or
callbacks from foreign code. For instance, Berkeley DB library can do
callbacks to Python code.


So? When calling callback functions, the Berkeley DB library won't  
un-16byte-align the stack, will it? (Assuming it's been compiled with  
gcc in the last 10 years)



Not all the universe is GCC based. For instance, Solaris system
libraries are not compiled using GCC. The world is bigger than Linux/GCC.


If the Solaris compilers don't use 16byte-aligned stackframes, and GCC  
on Solaris/x86 also assumes 16byte-aligned stacks, I guess GCC on  
Solaris/x86 is pretty broken indeed. But for Linux/x86, stacks have  
been de-facto 16byte aligned for so long, you can *almost* excuse the  
ABI violation as unimportant.


But anyways, psyco should keep the stackframes 16byte aligned  
regardless, for performance reasons: even when accessing datatypes for  
which unaligned access doesn't crash, it's faster when it's aligned.


James

Re: [Python-Dev] email package status in 3.X

2010-06-21 Thread James Y Knight


On Jun 21, 2010, at 4:29 PM, M.-A. Lemburg wrote:

Here's a little known fact: by changing the Python2 default
encoding to 'undefined' (yes, that's a real codec !), you can disable
all automatic string coercion in Python2.


I tried that once: half the stdlib stops working if you do (for  
example, the re module), so it's not particularly useful for checking  
if your own code is unicode-safe.


James

Re: [Python-Dev] bytes / unicode

2010-06-22 Thread James Y Knight



On Jun 22, 2010, at 1:03 PM, Ian Bicking wrote:
Similarly I'd expect (from experience) that a programmer using  
Python to want to take the same approach, sticking with unencoded  
data in nearly all situations.


Yeah. This is a real issue I have with the direction Python3 went: it  
pushes you into decoding everything to unicode early, even when you  
don't care -- all you really wanted to do is pass it from one API to  
another, with some well-defined transformations, which don't actually  
depend on it having been decoded properly. (For example, extracting  
the path from the URL and attempting to open it as a file on the  
filesystem.)


This means that Python3 programs can become *more* fragile in the face  
of random data you encounter out in the real world, rather than less  
fragile, which was the goal of the whole exercise.


The surrogateescape method is a nice workaround for this, but I can't  
help thinking that it might've been better to just treat stuff as  
possibly-invalid-but-probably-utf8 byte-strings from input, through  
processing, to output. It seems kinda too late for that, though: next  
time someone designs a language, they can try that. :)
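
As an illustration of the surrogateescape workaround, here is a short sketch with a made-up path whose bytes are Latin-1 rather than UTF-8. It uses only the standard "surrogateescape" error handler and posixpath.

```python
import posixpath

# A path as it might arrive from the outside world: not valid UTF-8.
raw = b"/tmp/caf\xe9.txt"

# Strict decoding fails on the stray byte...
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# ...while surrogateescape smuggles it through as a lone surrogate.
text = raw.decode("utf-8", "surrogateescape")

# A well-defined transformation that doesn't care about the encoding.
directory, name = posixpath.split(text)
rebuilt = posixpath.join(directory, name)

# Encoding with the same handler restores the original bytes exactly.
round_tripped = rebuilt.encode("utf-8", "surrogateescape")
print(strict_ok, round_tripped == raw)
```

The string in the middle is not really "text" (it contains U+DCE9), but it survives the trip from input to output unchanged, which is all this use case needs.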


James

Re: [Python-Dev] Use of cgi.escape can lead to XSS vulnerabilities

2010-06-23 Thread James Y Knight



On Jun 22, 2010, at 5:14 PM, Craig Younkins wrote:

I suggest rewording the documentation for the method making it more  
clear what it should and should not be used for. I would like to see  
the method changed to properly escape single-quotes, but if it is  
not changed, the documentation should explicitly say this method  
does not make input safe for inclusion in HTML.


Well, it *does* make the input safe for inclusion in HTML...in a  
double-quoted attribute.


The docs could make it clearer that you should always use double- 
quotes around your attribute values when using it, though, I agree.
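
A sketch of the difference, for illustration. Since cgi.escape no longer exists in recent Python versions, cgi_style_escape below is a stand-in written for this example that performs the same replacements (&, <, > and, with quote=True, the double-quote); html.escape is the current stdlib function and also escapes single-quotes. The attacker string is made up.

```python
import html


def cgi_style_escape(text, quote=False):
    """Stand-in for the old cgi.escape: escapes & < > and optionally "."""
    text = text.replace("&", "&amp;")
    text = text.replace("<", "&lt;")
    text = text.replace(">", "&gt;")
    if quote:
        text = text.replace('"', "&quot;")
    return text


hostile = "x' onmouseover='alert(1)"
escaped = cgi_style_escape(hostile, quote=True)

# Safe: the value cannot terminate a double-quoted attribute.
double_quoted = '<a title="%s">' % escaped

# Not safe: the single-quote passes through and ends the attribute early.
single_quoted = "<a title='%s'>" % escaped

# html.escape also handles the single-quote case.
fully_escaped = html.escape(hostile, quote=True)
print(double_quoted)
print(single_quoted)
print(fully_escaped)
```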


Re: [Python-Dev] versioned .so files for Python 3.2

2010-06-24 Thread James Y Knight



On Jun 24, 2010, at 5:53 PM, Scott Dial wrote:


On 6/24/2010 5:09 PM, Barry Warsaw wrote:

What use case does this address?


Specifically, it's the use case where we (Debian/Ubuntu) plan on installing
all Python 3.x packages into /usr/lib/python3/dist-packages.  As of PEP 3147,
we can do that without collisions on the pyc files, but would still have to
symlink for extension module .so files, because they are always named foo.so
and Python 3.2's foo.so won't (modulo PEP 384) be compatible with Python 3.3's
foo.so.


If the package has .so files that aren't compatible with other version
of python, then what is the motivation for placing that in a shared
location (since it can't actually be shared)


Because python looks for .so files in the same place it looks for  
the .py files of the same package. E.g., given a module like lxml, it  
contains the following files (among others):

lxml/
lxml/__init__.py
lxml/__init__.pyc
lxml/builder.py
lxml/builder.pyc
lxml/etree.so

And you can only put it in one place. Really, python should store
the .py files in /usr/share/python/, the .so files in
/usr/lib/x86_64-linux-gnu/python2.5-debug/, and the .pyc files in
/var/lib/python2.5-debug. But python doesn't work like that.


James

Re: [Python-Dev] versioned .so files for Python 3.2

2010-06-25 Thread James Y Knight



On Jun 25, 2010, at 4:53 AM, Scott Dial wrote:


On 6/24/2010 8:23 PM, James Y Knight wrote:

On Jun 24, 2010, at 5:53 PM, Scott Dial wrote:
If the package has .so files that aren't compatible with other version
of python, then what is the motivation for placing that in a shared
location (since it can't actually be shared)


Because python looks for .so files in the same place it looks for the
.py files of the same package.


My suggestion was that a package that contains .so files should not be
shared (e.g., the entire lxml package should be placed in a
version-specific path). The motivation for this PEP was to simplify the
installation of python packages for distros; it was not to reduce the
number of .py files on the disk.

Placing .so files together does not simplify that install process in any
way. You will still have to handle such packages in a special way.



This is a good point, but I think still falls short of a solution. For  
a package like lxml, indeed you are correct. Since debian needs to  
build it once per version, it could just put the entire package (.py  
files and .so files) into a different per-python-version directory.


However, then you have to also consider python packages made up of  
multiple distro packages -- like twisted or zope. Twisted includes  
some C extensions in the core package. But then there are other  
twisted modules (installed under a "twisted.foo" name) which do not  
include C extensions. If the base twisted package is installed under a  
version-specific directory, then all of the submodule packages need to  
also be installed under the same version-specific directory (and thus  
built for all versions).


In the past, it has proven somewhat tricky to coordinate which  
directory the modules for package "foo" should be installed in,  
because you need to know whether *any* of the related packages  
includes a native ".so" file, not just the current package.


The converse situation, where a base package did *not* get installed  
into a version-specific directory because it includes no native code,  
but a submodule *does* include a ".so" file, is even trickier.


James

Re: [Python-Dev] FHS compliance of Python installation

2010-06-26 Thread James Y Knight



On Jun 26, 2010, at 4:35 PM, Matthias Klose wrote:


On 26.06.2010 22:30, C. Titus Brown wrote:

On Sat, Jun 26, 2010 at 10:25:28PM +0200, Matthias Klose wrote:

On 25.06.2010 02:54, Ben Finney wrote:

James Y Knight   writes:

Really, python should store the .py files in /usr/share/python/, the
.so files in /usr/lib/x86_64-linux-gnu/python2.5-debug/, and the .pyc
files in /var/lib/python2.5-debug. But python doesn't work like that.


+1

So who's going to draft the "Filesystem Hierarchy Standard compliance"
PEP? :-)


This has nothing to do with the FHS.  The FHS talks about data, not code.


Really?  It has some guidelines here for object files, etc., at least as
of 2004.

http://www.pathname.com/fhs/pub/fhs-2.3.html

A quick scan suggests /usr/lib is the right place to look:

http://www.pathname.com/fhs/pub/fhs-2.3.html#USRLIBLIBRARIESFORPROGRAMMINGANDPA


agreed for object files, but
http://www.pathname.com/fhs/pub/fhs-2.3.html#USRSHAREARCHITECTUREINDEPENDENTDATA
explicitly states "The /usr/share hierarchy is for all read-only  
architecture independent *data* files".


I always figured the "read-only architecture independent" bit was the
important part there, and "code is data". Emacs's el files go into
/usr/share/emacs, for instance.


James

Re: [Python-Dev] 'hasattr' is broken by design

2010-08-24 Thread James Y Knight

On Aug 24, 2010, at 10:26 AM, Benjamin Peterson wrote:

2010/8/24 P.J. Eby :

At 03:37 PM 8/24/2010 +0200, Hrvoje Niksic wrote:

a) a "business" case of throwing anything other than AttributeError from
__getattr__ and friends is almost certainly a bug waiting to happen, and

FYI, best practice for __getattr__ is generally to bail with an
AttributeError as soon as you see double underscores in the name, unless you
intend to support special attributes.

Unless you're in an old-style class, you shouldn't get any double
underscore methods in __getattr__ (or __getattribute__). If you do,
it's a bug.

Uh, did you see the message that was in response to?

Maybe it should be a bug report?

>>> class Foo(object):
...  def __getattr__(self, name): print "ATTR:",name
...  def __iter__(self): yield 1
...
>>> print list(Foo())
ATTR: __length_hint__
[1]

James

Re: [Python-Dev] PEP 384 status

2010-08-29 Thread James Y Knight

On Aug 29, 2010, at 8:16 AM, Nick Coghlan wrote:
> However, since even platforms other than Windows aren't immune to
> version upgrades of the standard C runtime

Aren't they? I don't know of any other platform that lets you have two versions 
of libc linked into a single address space. Linux has had incompatible libc 
updates in the past, but it was not possible to use both in one program.

I believe BSD works the same way.

James

Re: [Python-Dev] os.path.normcase rationale?

2010-09-24 Thread James Y Knight

On Sep 24, 2010, at 10:53 AM, Paul Moore wrote:
> On 24 September 2010 15:29, Guido van Rossum  wrote:
>> I don't think we should try to reimplement what the filesystem does. I
>> think we should just ask the filesystem (how exactly I haven't figured
>> out yet but I expect it will be more OS-specific than
>> filesystem-specific). It will have to be a new API -- normcase() at
>> least is *intended* to return a case-flattened name on OSes where
>> case-preserving filesystems are the default, and changing it to look
>> at the filesystem would break too much code. For a new use case we
>> need a new API.
> 
> I dug into this once, and as far as I could tell, it's possible to get
> the information on Windows, but there's no way on Linux to "ask the
> filesystem". From my researches, the standard interfaces a filesystem
> has to implement on Linux don't offer any means of asking this
> question.
> 
> Of course, (a) I'm no Linux expert so what do I know, and (b) it may
> well be possible to come up with a "good enough" solution by ignoring
> pathologically annoying theoretical cases.
> 
> I'm happy to provide Windows code if someone needs it.
> Paul

An OSX code sketch is available here (summary: call FSPathMakeRef to get an 
FSRef from a path string, then FSRefMakePath to make it back into a path, which 
will then have the correct case). And note that it only works if the file 
actually exists.

http://stackoverflow.com/questions/370186/how-do-i-find-the-correct-case-of-a-filename

It would indeed be useful to have that be available in Python.
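
As a rough portable stand-in (not the FSRef approach, and not an existing stdlib function), one can list the parent directory and compare names case-insensitively. The name actual_name is hypothetical, str.lower() only approximates a real filesystem's folding rules, and, as with the OSX code, it only works if the file actually exists.

```python
import os


def actual_name(directory, name):
    """Return the on-disk spelling of `name` inside `directory`, or None.

    Hypothetical helper: compares entries case-insensitively, preferring an
    exact match when several entries differ only by case.
    """
    wanted = name.lower()
    matches = [entry for entry in os.listdir(directory)
               if entry.lower() == wanted]
    if not matches:
        return None
    return name if name in matches else matches[0]
```

Applying this to each path component in turn would canonicalize a whole path, at the cost of one directory listing per component.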

James

Re: [Python-Dev] os.path.normcase rationale?

2010-09-26 Thread James Y Knight


On Sep 26, 2010, at 7:36 AM, Paul Moore wrote:

> On 26 September 2010 09:01, Paul Moore  wrote:
>> On 25 September 2010 23:57, Greg Ewing  wrote:
>>> Paul Moore wrote:
>>> 
>>>> Windows has (I believe) user definable filesystems, too, but the OS
>>>> has "get me the real filename" style calls,
>>> 
>>> Does it really, though? The suggestions I've seen for doing
>>> this involve abusing the short/long filename translation
>>> machinery, and I'm not sure they're guaranteed to return the
>>> actual case rather than something that happens to work.
>> 
>> There's another call available. I've been too lazy to go and look it
>> up, but I'll do so sometime today.
> 
> Hmm, I can't find the one I was thinking of. GetLongFileName correctly
> sets the case of all but the final part, and FindFile can be used to
> find the last part, but that's not what I recall.
> 
> GetFinalPathNameByHandle works, and is documented to do so, but (a) it
> works on an open file handle, so you need to open the file, and (b)
> it's Vista and later only...

Were you thinking of SHGetFileInfo?

http://stackoverflow.com/questions/74451/getting-actual-file-name-with-proper-casing-on-windows

James

Re: [Python-Dev] os.path.normcase rationale?

2010-10-03 Thread James Y Knight

On Oct 3, 2010, at 9:18 AM, Dan Villiom Podlaski Christiansen wrote:
> A simpler alternative would probably be the F_GETPATH fcntl. An example:

That requires that you have permission to open the file (and to actually do so 
which might have other effects), while the File Manager's FSRef method does not.

If Python adds a cross-platform function to do this canonicalization, users 
don't have to worry about how easy it is to invoke in pure-python...

James

Re: [Python-Dev] Distutils2 scripts

2010-10-08 Thread James Y Knight


On Oct 8, 2010, at 5:24 PM, Gisle Aas wrote:

> On Oct 8, 2010, at 9:22 , Jeroen Ruigrok van der Werven wrote:
> 
>> +1 from me. I sincerely dislike the Perl-esque -m stuff.
> 
> As a Perl/Python guy I have to object to calling the -m stuff Perl-esque.  
> This is a very Pythonish thing.  In the Perl world we never treat modules as 
> scripts; they are separate concepts written separately and installed in 
> separate locations.  There is no feature of perl similar to the Pythonish -m 
> stuff.


Yes there is. -m and -M.

E.g., the widely advertised perl -MCPAN -e install. It's not identical to 
python's -m, to be sure, but it's *similar*.

James

Re: [Python-Dev] Support for async read/write

2010-10-19 Thread James Y Knight

On Oct 19, 2010, at 1:47 PM, exar...@twistedmatrix.com wrote:

> Adding more platform wrappers is always nice.  Keep in mind that the quality 
> of most (all?) aio_* implementations is spotty at best, though. On Linux, 
> they will sometimes block (for example, if you fail to align buffers 
> properly, or open a file without O_DIRECT, or if there are too many other aio 
> operations active on the system at the time, etc).  

You're thinking of the linux-specific AIO calls. Those have the properties 
you're describing (which makes them pretty useless for most code too), but 
they're completely different from the aio_* functions.

The POSIX aio_* calls don't do any of that. They aren't syscalls implemented in 
the kernel, they're implemented in glibc. They "simply" create a threadpool in 
your process to call the standard synchronous operations, and make it difficult 
to reliably get completion notification (completion notification takes place 
via Real-Time signals (SIGEV_SIGNAL), which can be dropped if linux runs out of 
space in its RT-signal-queue, and when that happens you get no indication that 
that has occurred. You can also do completion notification via calling a 
function on a thread (SIGEV_THREAD), but, for that, glibc will always spawn a 
brand new thread for each notification, which is quite slow.)

Basically: you shouldn't ever use those APIs. Especially on linux, but probably 
everywhere else. 
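
For comparison, the "thread pool calling the standard synchronous operations" idea is easy to write directly. This is only a sketch using concurrent.futures from the standard library; read_at and read_chunks are names invented here, and nothing in it is meant to mirror glibc's actual implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def read_at(path, offset, length):
    """Plain synchronous read of `length` bytes starting at `offset`."""
    with open(path, "rb") as handle:
        handle.seek(offset)
        return handle.read(length)


def read_chunks(path, requests, workers=4):
    """Run several (offset, length) reads on a thread pool.

    Results come back in request order; completion is signalled by the
    futures themselves, with no signals or per-notification threads.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(read_at, path, offset, length)
                   for offset, length in requests]
        return [future.result() for future in futures]
```

Usage would look like read_chunks("some-file", [(0, 4096), (8192, 4096)]), where the file name and offsets are placeholders.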

So, in conclusion, I disagree that adding wrappers for these would be nice. It 
wouldn't. It would cause some people to think they would be useful things to 
call, and they would always be wrong.

James

Re: [Python-Dev] Support for async read/write

2010-10-19 Thread James Y Knight

On Oct 19, 2010, at 6:44 PM, Martin v. Löwis wrote:

>> So, in conclusion, I disagree that adding wrappers for these would be
>> nice. It wouldn't. It would cause some people to think they would be
>> useful things to call, and they would always be wrong.
> 
> We are all consenting adults. If people want to shoot themselves in
> their feet, we let them. For example, we have os.open, even though
> there is no garbage collection for file handles, and we have
> os._exit, even though it doesn't call finalizers.

There's a difference.

os._exit is useful. os.open is useful. aio_* are *not* useful. For anything. If 
there's anything you think you want to use them for, you're wrong. It either 
won't work properly or it will perform worse than the simpler alternatives.

It would absolutely be a waste of time (of both the implementor of the wrapper 
and the poor users who stumble across them in documentation and try to use 
them) to bother adding wrappers to these functions for python. 

James
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Continuing 2.x

2010-10-27 Thread James Y Knight

On Oct 27, 2010, at 10:22 PM, Kristján Valur Jónsson wrote:

> Hello all.
>  
> So, python 2.7 is in bugfix only mode.  ‘trunk’ is off limit.  So, where does 
> one make improvements to the distinguished, and still very much alive, 2.x 
> series of Python?
> The answer would seem to be “one doesn’t”.  But must it be that way?
>  
> When Morris stopped producing the Oxford III model back in ’57 in favor of 
> new developments, it didn’t spell the end for it.   The plant was sold to 
> India and the Hindustan Ambassador continues to be developed and produced to 
> this day.  It even has fuel injection.
> The Morris Motor Company isn’t around anymore.
>  
> So, here is my suggestion:
> Let’s move the current ‘trunk’ into /branches/afterlife-27.  Open it for 
> submissions from people such as myself that use 2.7 on a regular basis and 
> are willing to give it some extra love.  Host it there without the usual 
> stringent python quality assurance, buildbot support, release management and 
> all that rigmarole.  Open-source it, if you will.
> Svn.python.org already plays host to some other, less official, projects such 
> as stackless, so why not this?

The python community has already decided many times over that Python2 is dead 
and Python3 is the future. So if you want to continue maintaining Python2, that 
means you need to fork it. I think you'd be best off doing so on your own 
infrastructure: convincing the python developers to support such a thing is 
quite unlikely, and furthermore, completely unnecessary.

Unlike the Oxford III, you don't need to be "sold" python2: it's open source, 
you can fork it without any official approval. So, just do it. I wish you best 
of luck, though: most unofficial forks die a lonely death. But, if enough 
people feel like you do, it could become successful.

But I really doubt anyone else is going to want to use any python2 afterlife 
without stringent quality assurance, multi-platform support releases, and other 
rigmarole. You'd have to set up all that stuff for yourself if you possibly 
hope to attract users. I can't think of any possible use for an unreliably 
maintained version of python2...

James


Re: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/

2010-11-03 Thread James Y Knight

On Nov 3, 2010, at 11:25 AM, Eric Smith wrote:

> On 11/3/10 10:53 AM, Eric Smith wrote:
> 
>> The problem is that there is no unittest.loader in 2.4, and
>> unittest.loader.TestLoader is the name that the 2.7 pickle creates. We
>> see this problem every time we try and move anything in the stdlib.
> 
> And BTW: for me, this is the strongest reason not to break up modules into 
> packages or otherwise reorganize the stdlib.

This is the strongest reason why I recommend to everyone I know that they not 
use pickle for storage they'd like to keep working after upgrades [not just of 
stdlib, but other 3rd party software or their own software]. :)
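
The reason is visible in the pickle itself: classes are stored by module path and name, not by value. A small sketch using pickletools from the standard library to list those references; global_references is a name made up for this example, and protocol 2 is chosen only because it spells the reference out as a single GLOBAL opcode.

```python
import pickle
import pickletools
import unittest


def global_references(data):
    """Return the 'module name' strings that a pickle refers to by name."""
    return [arg for opcode, arg, _position in pickletools.genops(data)
            if opcode.name == "GLOBAL"]


# Pickling the class stores only where to find it at load time.
data = pickle.dumps(unittest.TestLoader, protocol=2)
references = global_references(data)
print(references)
```

Anything that loads this pickle has to be able to import that exact module path and find that exact name in it, which is what breaks when code moves.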

James

Re: [Python-Dev] Python-3 transition in Arch Linux

2010-11-04 Thread James Y Knight

On Nov 4, 2010, at 8:43 PM, Stephen J. Turnbull wrote:
> All of the Arch users I know expect Arch to occasionally do radical
> things because they're the right things to do in the long run.

But the previous consensus (at least, as I, and presumably many other people 
understood it) was that python2 would remain the owner of the name 
"/usr/bin/python" for the indefinite future, and python3 would be invoked with 
/usr/bin/python3.

Given that, it's not at all clear that Arch's actions are the right thing to do.

IMO, moving away from that consensus should've been brought up on python-dev 
rather than just one distro just doing it all alone, causing incompatibilities 
and annoyance. If python-dev wants python3 to inherit the name /usr/bin/python, 
then python2 should've been installing a binary called /usr/bin/python2 for a 
couple years ahead of time, and recommending that everyone use that in their #! 
lines, so that the switch could've been done without breaking everything...

> Sure, and Guido should have exercised the Time Machine a little harder
> so that Python 3 never needed to happen.  IOW, this is the price of
> success and wide distribution.

Well, other programming languages seem to have avoided making sweeping 
bidirectionally-incompatible changes, despite being successful and widely 
distributed. But that's a whole other discussion.

James

Re: [Python-Dev] Python-3 transition in Arch Linux

2010-11-07 Thread James Y Knight

On Nov 6, 2010, at 9:41 AM, Martin v. Löwis wrote:
> So I don't recall a decision that there shouldn't be a python2
> binary,

The decision to make one would have to be an active decision, since Python has 
never installed one before. If there should be one, then the Python Makefile 
should make one by default.

> nor a decision that anything is done indefinitely
> (it may be that the decision was actually just about 3.1 - changing
> it again for 3.2 would require another decision, but certainly can't
> be ruled out categorically).

When I said "indefinite", I meant "until some point in the future not yet 
determined", with an implied undertone of "not anytime soon".

James


Re: [Python-Dev] Continuing 2.x

2010-11-08 Thread James Y Knight

On Nov 8, 2010, at 4:42 AM, Lennart Regebro wrote:
> Except for making releases that start backporting Python 3 features
> and breaking backwards compatibility gradually (which may or may not
> be a good idea) I don't see the point. There isn't much to do when it
> comes to improving the language, and there is a moratorium anyway.
> Improvements in the standard library can be more easily done in
> external libraries anyway, and then you can release the improved
> libraries for everything from Python 2.4 and forwards if you like.
> 
> So it can be done, but the question is "Why?"

To keep the batteries included?

James

Re: [Python-Dev] Continuing 2.x

2010-11-09 Thread James Y Knight

On Nov 8, 2010, at 6:08 PM, Lennart Regebro wrote:

> 2010/11/8 James Y Knight :
>> On Nov 8, 2010, at 4:42 AM, Lennart Regebro wrote:
>>> So it can be done, but the question is "Why?"
>> 
>> To keep the batteries included?
> 
> But they'll only be included in > 2.7, which won't be used much, [...]

If there was going to be an official python.org sanctioned Python 2.8 release, 
I'm not at all sure that'd be the case. Since there isn't going to be one, then 
yes, that's probably true.

James

Re: [Python-Dev] Breaking undocumented API

2010-11-10 Thread James Y Knight

On Nov 10, 2010, at 8:47 AM, Michael Foord wrote:
> How about making this explicit (either pep 8 or our developer docs):
> 
> If a module or package defines __all__ that authoritatively defines the 
> public interface. Modules with __all__ SHOULD still respect the naming 
> conventions (leading underscore for private members) to avoid confusing 
> users. Modules SHOULD NOT export private members in __all__.

I don't like the idea of the authoritative definition of a public interface 
being defined based on __all__, because that provides users almost no warning 
that they're using a private API: the __all__ attribute doesn't do anything if 
you aren't using import *. If there was some proposal to make it so that 
accessing an attribute not in __all__ did prevent or somehow warn users that 
they're doing something dangerous, that'd be different, but there isn't such a 
proposal, and I don't even know what such a proposal would look like...

On the other hand, if you make the primary mechanism to indicate privateness be 
a leading underscore, that's obvious to everyone.
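
To illustrate the point, here is a minimal sketch in which `demo_mod` and its functions are hypothetical names: `__all__` only filters `import *`, while direct attribute access to a name outside `__all__` goes through silently.

```python
import sys
import types

SOURCE = '''
__all__ = ["public"]

def public():
    return "public"

def helper():          # not in __all__, but no leading underscore either
    return "helper"
'''

# Build a throwaway module in memory instead of writing a file.
demo_mod = types.ModuleType("demo_mod")
exec(SOURCE, demo_mod.__dict__)
sys.modules["demo_mod"] = demo_mod

# "import *" honours __all__ ...
namespace = {}
exec("from demo_mod import *", namespace)
star_names = sorted(n for n in namespace if not n.startswith("__"))

# ... but nothing stops or warns about direct access to other names.
direct_result = demo_mod.helper()
```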

James

Re: [Python-Dev] [Python-checkins] r86441 - python/branches/py3k/Lib/test/test_nntplib.py

2010-11-13 Thread James Y Knight

On Nov 13, 2010, at 7:08 AM, Antoine Pitrou wrote:
> Funny, it shows that the NNTP SSL tests don't check the certificate,
> then.

Unsurprising, given that you need 140 lines of pretty non-obvious python code 
to do so...

James

Re: [Python-Dev] Breaking undocumented API

2010-11-17 Thread James Y Knight

On Nov 17, 2010, at 9:19 AM, Nick Coghlan wrote:
> (and is a little trickier in the case of module level globals, since those 
> can't be deprecated properly)

People keep saying this, but there have already been examples shown of how to 
do it. I actually think that python should include a standard way to do so -- 
it's a reasonable enough desire, as shown by how many times in this thread the 
inability to do so has been mentioned. If the existing working 3rd-party 
mechanisms aren't good enough for python-dev standards, come up with a new 
way...

James

Re: [Python-Dev] Breaking undocumented API

2010-11-17 Thread James Y Knight

On Nov 17, 2010, at 10:30 AM, Guido van Rossum wrote:
> On Wed, Nov 17, 2010 at 7:24 AM, James Y Knight  wrote:
>> On Nov 17, 2010, at 9:19 AM, Nick Coghlan wrote:
>>> (and is a little trickier in the case of module level globals, since those 
>>> can't be deprecated properly)
>> 
>> People keep saying this, but there have already been examples shown of how 
>> to do it. I actually think that python should include a way to do so 
>> standard -- it's a reasonable enough desire, as shown by how many times in 
>> this thread the inability to do so has been mentioned. If the existing 
>> working 3rd-party mechanisms aren't good enough for python-dev standards, 
>> come up with a new way...
> 
> That's quite the distraction from the current thread though. Start
> discussing it on python-ideas, or submit a code fix, or something in
> between. But the hackish way that some 3rd party frameworks use
> (replacing the module object with a class instance in sys.modules) is
> clearly not right for the standard library (I'll explain on
> python-ideas if you insist).

I just don't want people to use the current lack as an excuse to simply remove 
module attributes without prior deprecation (or make a compatibility policy 
which recommends doing such a thing). I'll leave it up to the experts on this 
list (or python-ideas...) to determine how to implement a module-level 
deprecation in a way that isn't considered "hackish". (Or, if there is no such 
way, there's also the alternative of simply never removing module-level names.)
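
For reference, a rough sketch of the kind of 3rd-party approach being discussed (swapping the entry in sys.modules for an object that warns on access), written for a current Python 3. `DeprecatingModule` and `foo_demo` are hypothetical names, not any particular framework's API.

```python
import importlib
import sys
import types
import warnings


class DeprecatingModule(types.ModuleType):
    """Hypothetical stand-in module that warns on deprecated attributes."""

    def __init__(self, wrapped, deprecated_names):
        super().__init__(wrapped.__name__)
        # Copy everything except the deprecated names into the new module.
        for key, value in wrapped.__dict__.items():
            if key not in deprecated_names:
                self.__dict__[key] = value
        self.__dict__["_deprecated"] = {
            name: wrapped.__dict__[name] for name in deprecated_names
        }

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        deprecated = self.__dict__.get("_deprecated", {})
        if name in deprecated:
            warnings.warn(
                "%s.%s is deprecated" % (self.__name__, name),
                DeprecationWarning,
                stacklevel=2,
            )
            return deprecated[name]
        raise AttributeError(name)


original = types.ModuleType("foo_demo")
original.SOME_CONSTANT = 5
sys.modules["foo_demo"] = DeprecatingModule(original, {"SOME_CONSTANT"})

foo_demo = importlib.import_module("foo_demo")
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    value = foo_demo.SOME_CONSTANT
```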

James

Re: [Python-Dev] Breaking undocumented API

2010-11-17 Thread James Y Knight

On Nov 17, 2010, at 11:38 AM, Guido van Rossum wrote:
> Deprecation doesn't *require* logging a warning or raising an
> exception. You can also add a note to the docs, or if it is
> undocumented, just add a comment to the code. (Though if it is in
> widespread use despite being undocumented, a better way would be to
> document it first -- as immediately deprecated if necessary.)
> 
> Deprecation is in the end a way to give people advance warning about
> future changes. The mechanism of the warning doesn't always have to be
> implemented by the interpreter/compiler/parser or whatever other tool.

Well, that's certainly a possible policy. I'd suggest that adding notes to the 
docs after-the-fact is a singularly ineffective way of giving people advance 
warning of feature removal compared to having the interpreter/compiler/parser 
or whatever other tool warn you. And if that's to be python's policy, when it's 
possible to do better, I'm disappointed. (But won't respond further, my point 
is made.)

James


Re: [Python-Dev] len(chr(i)) = 2?

2010-11-22 Thread James Y Knight

Why don't y'all just call them "--unichar-width=16/32"? That describes 
precisely what the options do, and doesn't invite any quibbling over 
definitions.

James


Re: [Python-Dev] len(chr(i)) = 2?

2010-11-23 Thread James Y Knight

On Nov 23, 2010, at 6:49 PM, Greg Ewing wrote:
> Maybe Python should have used UTF-8 as its internal unicode
> representation. Then people who were foolish enough to assume
> one character per string item would have their programs break
> rather soon under only light unicode testing. :-)

You put a smiley, but, in all seriousness, I think that's actually the right 
thing to do if anyone writes a new programming language. It is clearly the 
right thing if you don't have to be concerned with backwards-compatibility: 
nobody really needs to be able to access the Nth codepoint in a string in 
constant time, so there's not really any point in storing a vector of 
codepoints.

Instead, provide bidirectional iterators which can traverse the string by byte, 
codepoint, or by grapheme (that is: the set of combining characters + base 
character that go together, making up one thing which a human would think of as 
a character).
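
A minimal sketch of the three traversals, in Python 3. The `graphemes` helper is hypothetical and only groups a base character with the combining marks that follow it; real grapheme segmentation has more rules than that.

```python
import unicodedata


def graphemes(text):
    """Approximate grapheme clusters: a base character plus the
    combining marks that follow it (a simplification)."""
    cluster = ""
    for ch in text:
        if cluster and unicodedata.combining(ch) == 0:
            yield cluster
            cluster = ch
        else:
            cluster += ch
    if cluster:
        yield cluster


text = "e\u0301a"  # 'e' + COMBINING ACUTE ACCENT, then 'a'

by_byte = list(text.encode("utf-8"))   # four UTF-8 bytes
by_codepoint = list(text)              # three code points
by_grapheme = list(graphemes(text))    # two user-visible characters
```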

James

Re: [Python-Dev] len(chr(i)) = 2?

2010-11-23 Thread James Y Knight

On Nov 24, 2010, at 12:07 AM, Stephen J. Turnbull wrote:
> Or you can give user programs memory indicies, and enjoy the fun as
> the poor developers do things like "pos += 1" which works fine on
> the ASCII data they have lying around, then wonder why they get
> Unicode errors when they take substrings.

a) You seem to be hung up on implementation details of emacs. But yes, positions 
should be stored as a byte offset into the utf8 string, NOT as the number of 
codepoints since the beginning of the string. Probably you want it to be 
somewhat opaque, so that you actually have to specify whether you wanted to go 
to +1 byte, codepoint, or grapheme.

b) Those poor developers are *already* screwed if they're using pos += 1 when 
pos is a codepoint index and they then take a substring based on that! They 
will get half a character when the string contains combining characters...
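
A quick illustration of (b) in Python 3, using a made-up string: stepping a codepoint index by one and slicing there separates the accent from its base letter.

```python
import unicodedata

text = "cafe\u0301!"   # the last letter is 'e' + COMBINING ACUTE ACCENT

pos = 3
pos += 1               # "advance one character" by codepoint index

head = text[:pos]      # ends with a bare 'e'
tail = text[pos:]      # starts with the orphaned combining accent

tail_starts_with_combining_mark = unicodedata.combining(tail[0]) != 0
```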

Pretending that "codepoints" are a useful abstraction just makes poor 
developers get by without doing the correct thing (incrementing to the next 
grapheme boundary) for a little bit longer. But once you [the language 
implementor] are providing correct abstractions for grapheme movement, it's 
just as easy to also provide an abstraction for codepoint movement, and make 
your low-level implementation of the iterator object be a byte-offset into a 
UTF8 buffer.
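
As a sketch of that low-level representation: a position is just a byte offset into a UTF-8 buffer, and codepoint movement is a helper on top of it. `next_codepoint` is a hypothetical name, and the sketch assumes the buffer is valid UTF-8.

```python
def next_codepoint(buf: bytes, pos: int) -> int:
    """Return the byte offset of the code point following the one at pos."""
    pos += 1
    # Skip continuation bytes, which all look like 0b10xxxxxx.
    while pos < len(buf) and (buf[pos] & 0xC0) == 0x80:
        pos += 1
    return pos


buf = "a\u00e9\u20ac".encode("utf-8")  # 1-byte, 2-byte and 3-byte code points

offsets = []
pos = 0
while pos < len(buf):
    offsets.append(pos)
    pos = next_codepoint(buf, pos)
```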

James

Re: [Python-Dev] len(chr(i)) = 2?

2010-11-23 Thread James Y Knight

On Nov 24, 2010, at 12:07 AM, Stephen J. Turnbull wrote:
> By the way, to send the ball back into your court, I have this feeling
> that the demand for UTF-8 is once again driven by native English
> speakers who are very shortly going to find themselves, and the data
> they are most familiar with, very much in the minority.  Of course the
> market that benefits from UTF-8 compression will remain very large for
> the immediate future, but in the grand scheme of things, most of the
> world is going to prefer UTF-16 by a substantial margin.

No, the demand for UTF-8 is because that's what much of the internet (and not 
coincidentally, unix) world has standardized on. The main pieces of software 
using UTF-16 (Windows, Java) started doing so before it became apparent that 16 
bits wasn't enough to actually hold a unicode codepoint, so they were actually 
implementing UCS-2. In those days, UCS-2 was a fairly sensible choice.

But, now, if your choices are UTF-8 or UTF-16, UTF-8 is clearly superior. Not 
because it's smaller -- it's pretty much a tossup -- but because it is an ASCII 
superset, and thus more easily compatible with other software. That also makes 
it most commonly used for internet communication. (So, there's a huge advantage 
for using it internally as well right there: no transcoding necessary for 
writing your HTML output). UTF-16 is incompatible with ASCII, and furthermore, 
it's still a variable-width encoding, with all the same issues that causes. As 
such, there's really very little to be said in favor of it.
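
Both claims are easy to check in Python 3; the strings here are arbitrary examples.

```python
ascii_text = "GET /index.html"

# UTF-8 is an ASCII superset: pure-ASCII text encodes to the same bytes.
utf8_is_ascii = ascii_text.encode("utf-8") == ascii_text.encode("ascii")

# UTF-16 is not ASCII-compatible: every ASCII character gains a NUL byte.
utf16_has_nuls = b"\x00" in ascii_text.encode("utf-16-le")

# UTF-16 is still variable-width: one 16-bit unit inside the BMP,
# two units (a surrogate pair) outside it.
bmp_units = len("\u00e9".encode("utf-16-le")) // 2
astral_units = len("\U0001D11E".encode("utf-16-le")) // 2
```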

If you really want a fixed-width encoding, you have to go to UTF-32, which is 
excessively large. UTF-32 is a losing choice, simply because of the wasted 
memory usage.

But that's all a side issue: even if you do choose UTF-16 as your underlying 
encoding, you *still* need to provide iterators that work by "byte" (only now 
bytes are 16-bits), by codepoint, and by grapheme. Of course, people who 
implement UTF-16 (such as python, java, and windows) often pretend they're 
still implementing UCS-2, and don't bother even providing their users with the 
necessary APIs to do things correctly. Which, you can often get away 
with...just so long as you don't mind that you sometimes end up splitting a 
string in the middle of a codepoint and causing a unicode error!
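
A small sketch of that failure mode, using Python 3 bytes to stand in for a buffer of 16-bit units: cutting between the two halves of a surrogate pair leaves data that no longer decodes.

```python
# 'a' (one 16-bit unit) followed by U+1D11E (two units, a surrogate pair).
data = "a\U0001D11E".encode("utf-16-le")

# Treating the buffer as fixed-width UCS-2 and keeping "two characters"
# cuts the surrogate pair in half.
truncated = data[:4]

try:
    truncated.decode("utf-16-le")
    split_failed = False
except UnicodeDecodeError:
    split_failed = True
```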

James

Re: [Python-Dev] PEP 384 final review

2010-11-29 Thread James Y Knight


On Nov 29, 2010, at 8:58 AM, Nick Coghlan wrote:

> The http read only URLs
> didn't work (no diff returned, just "svn: OPTIONS of
> 'http://svn.python.org/python/branches/pep-0384': 200 OK
> (http://svn.python.org)"), 

That was the wrong url: you should've used 
http://svn.python.org/projects/python/branches/pep-0384

James

Re: [Python-Dev] ICU

2010-12-02 Thread James Y Knight


On Dec 1, 2010, at 11:45 PM, Alexander Belopolsky wrote:

> On Tue, Nov 30, 2010 at 3:13 PM, Antoine Pitrou  wrote:
>> 
>> Oh, about ICU:
>> 
>>>> Actually, I remember you saying that locale should ideally be replaced
>>>> with a wrapper around the ICU library.
>>> 
>>> By that, I stand - however, I have given up the hope that this will
>>> happen anytime soon.
>> 
>> Perhaps this could be made a GSOC topic.
>> 
> 
> Incidentally, this may also address another Python's Achilles' heel:
> the timezone support.
> 
> http://icu-project.org/download/icutzu.html

Does ICU do anything regarding timezones that datetime + pytz doesn't already 
do? Wouldn't it make more sense to integrate the already-existing-and-pythonic 
pytz into Python than to make a new wrapper based on ICU?

James


Re: [Python-Dev] The buffer() function

2006-07-13 Thread James Y Knight

On Jul 13, 2006, at 12:52 PM, Thomas Heller wrote:

> IIUC, the buffer object was broken some time ago, but I think it has
> been fixed.  Can the 'status' of the buffer function be changed?
> To quote the next question from the OP:
>
>   "Is buffer safe to use?  Is there an alternative?"
>
> My thinking is that it *is* safe to use, and that there is
> no alternative (but imo also no alternative is needed).

I believe it's safe, except when used on an array.array object.  
However, that's not buffer's fault, but rather a bug in the array class.

The buffer interface requires that, as long as a reference to a  
python object is alive, pointers into its buffer will not become  
invalidated. Array breaks that guarantee.

To fix this, array ought to make a sub-object that this guarantee  
_does_ hold for. And when it needs more storage, simply make a new  
sub-object with more storage. Then, the buffer's reference would be  
to the refcounted sub-object, and thus the associated memory wouldn't  
go away until the buffer was done with it.
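
For comparison, current Python 3 takes a different route from the sub-object idea above: as far as I know, array simply refuses to resize while a buffer is exported. A minimal sketch using memoryview, which is the Python 3 counterpart of buffer:

```python
import array

a = array.array("b", b"abc")
view = memoryview(a)        # exports a pointer into the array's storage

try:
    a.append(1)             # would reallocate and invalidate the pointer
    resize_blocked = False
except BufferError:
    resize_blocked = True

view.release()              # once nothing points at the storage...
a.append(1)                 # ...resizing is allowed again
```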

James

Re: [Python-Dev] Community buildbots

2006-07-15 Thread James Y Knight

On Jul 15, 2006, at 3:15 PM, M.-A. Lemburg wrote:
> Note that it also helps setting the default encoding
> to 'unknown'. That way you disable the coercion of strings
> to Unicode and all the places where this implicit conversion
> takes place crop up, allowing you to take proper action (i.e.
> explicit conversion or changing of the string to Unicode
> as appropriate).

I've tried that before to verify no such conversion issues occurred  
in Twisted, but, as the python stdlib isn't usable like that, it's  
hard to use it to find bugs in any other libraries. (in particular,  
the re module is badly broken, some other stuff was too).

James

Re: [Python-Dev] Capabilities / Restricted Execution

2006-07-16 Thread James Y Knight


On Jul 16, 2006, at 5:42 AM, Scott Dial wrote:

> Talin wrote:
>> Scott Dial wrote:
>>> Phillip J. Eby wrote:
>>>
>>>> A function's func_closure contains cell objects that hold the
>>>> variables.  These are readable if you can set the func_closure of some
>>>> function of your own.  If the overall plan includes the ability to restrict
>>>> func_closure setting (or reading) in a restricted interpreter, then you
>>>> might be okay.
>>>
>>> Except this function (__getattribute__) has been trapped inside of a
>>> class which does not expose it as an attribute. So, you shouldn't be
>>> able to get to the func_closure attribute of the __getattribute__
>>> function for an instance of the Guard class. I can't come up with  
>>> a way
>>> to defeat this protection, at least. If you have a way, then I'd be
>>> interested to hear it.
>>
>> I've thought of several ways to break it already. Some are  
>> repairable,
>> I'm not sure that they all are.
>>
>> For example, neither of the following statements blows up:
>>
>> print t2.get_name.func_closure[0]
>> print object.__getattribute__( t2, '__dict__' )
>>
>> Still, its perhaps a useful basis for experimentation.
>>
>> -- Talin
>
> I quickly poked around it in python and realized that in 2.5 (as  
> opposed
> to the 2.4 python I was playing in) the cell object exposes
> cell_contents.. blargh. So, yes, you can defeat the protection because
> the wrapped instance is exposed.
>
>  print t2.get_name()
>  t2.get_name.func_closure[0].cell_contents.im_self.name = 'poop'
>  print t2.get_name()
>
> Although, your second example with using the object.__getattribute__
> doesn't seem to really be an issue. You retrieved the __dict__ for the
> Guard class which is empty and is something we should not feel  
> concerned
> about being leaked.
>
> Only way I see this as viable is if in "restricted" mode cell_contents
> was removed from cell objects.

Similarly to how function attributes aren't accessible in restricted  
mode. In older versions of python, it's always been possible to get  
the closure variables in non-restricted mode, via mutating func_code...

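def get_closure_contents(fun):
    # Build a stand-in function with the same number of free variables,
    # whose inner function simply returns all of them.
    num = len(fun.func_closure)
    vars = ["x%d" % n for n in range(num)]
    defines = ' = '.join(vars) + " = None"
    returns = ', '.join(vars) + ','
    exec """
def b():
    %s
    def bb():
        return %s
    return bb
""" % (defines, returns)
    # Temporarily swap in the stand-in's code object, so that calling fun()
    # returns the contents of fun's own closure cells.
    old_code = fun.func_code
    fun.func_code = b().func_code
    result = fun()
    fun.func_code = old_code
    return dict(zip(old_code.co_freevars, result))

def make_secret(x,y):
    def g():
        return x*y
    return g


>>> secret = make_secret(5,7)
>>> secret()
35
>>> get_closure_contents(secret)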
{'y': 7, 'x': 5}
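
For readers on Python 3, where func_closure and func_code are spelled `__closure__` and `__code__` and cells expose cell_contents directly, the same information can be read without any code swapping. A minimal sketch (the helper name `closure_contents` is made up):

```python
def closure_contents(fun):
    """Map each free variable name of fun to the value in its closure cell."""
    cells = fun.__closure__ or ()
    return dict(zip(fun.__code__.co_freevars,
                    (cell.cell_contents for cell in cells)))


def make_secret(x, y):
    def g():
        return x * y
    return g


secret = make_secret(5, 7)
contents = closure_contents(secret)
```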







Re: [Python-Dev] Dynamic module namspaces

2006-07-16 Thread James Y Knight

On Jul 15, 2006, at 2:38 PM, Johan Dahlin wrote:
> What I want to ask, is it possible to have a sanctioned way to  
> implement
> a dynamic module/namespace in python?
>
> For instance, it could be implemented to allow you to replace the
> __dict__ attribute in a module with a user provided object which
> implements the dictionary protocol.

I'd like this, as well, although my use case is different: I'd like  
to be able to deprecate attributes in a module. That is, if I have:

foo.py:
SOME_CONSTANT = 5

I'd like to be able to do something such that any time anyone  
accessed foo.SOME_CONSTANT, it'd emit a DeprecationWarning.
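
For what it's worth, much later Python versions (3.7 and up, via PEP 562) grew a hook that covers exactly this case: a module-level `__getattr__` function. A sketch of foo.py using it, built in memory here so the example is self-contained; the private name `_SOME_CONSTANT` is my own choice.

```python
import types
import warnings

FOO_SOURCE = '''
import warnings

_SOME_CONSTANT = 5  # the real value lives under a private name

def __getattr__(name):
    # Called only when normal module attribute lookup fails.
    if name == "SOME_CONSTANT":
        warnings.warn("foo.SOME_CONSTANT is deprecated",
                      DeprecationWarning, stacklevel=2)
        return _SOME_CONSTANT
    raise AttributeError("module 'foo' has no attribute %r" % name)
'''

# Stand-in for "import foo" with the source above saved as foo.py.
foo = types.ModuleType("foo")
exec(FOO_SOURCE, foo.__dict__)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    value = foo.SOME_CONSTANT
```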

James

Re: [Python-Dev] logging module broken because of locale

2006-07-18 Thread James Y Knight

On Jul 18, 2006, at 1:54 PM, Martin v. Löwis wrote:

> Mihai Ibanescu wrote:
>> To follow up on my own email: it looks like, even though in some  
>> locale
>> "INFO".lower() != "info"
>>
>> u"INFO".lower() == "info" (at least in the Turkish locale).
>>
>> Is that guaranteed, at least for now (for the current versions of  
>> python)?
>
> It's guaranteed for now; unicode.lower is not locale-aware.

That seems backwards of how it should be ideally: the byte-string  
upper and lower should always do ascii uppering-and-lowering, and the  
unicode ones should do it according to locale. Perhaps that can be  
cleaned up in py3k?

James

Re: [Python-Dev] Strategy for converting the decimal module to C

2006-07-21 Thread James Y Knight

On Jul 21, 2006, at 6:18 AM, Nick Maclaren wrote:
> To cut a long story short, it is impractical for a language run-time
> system to call user-defined handlers with any degree of reliability
> unless the compiled code and run-time interoperate carefully - I have
> been there and done that many times, but few people still working  
> have.
> On architectures with out-of-order execution (and interrupts), you
> have to assume that an interrupt may occur anywhere, even when the
> code does not use the relevant facility.  Floating-point overflow
> in the middle of a list insertion?  That's to be expected.

While this _is_ a real problem, it is _not_ a general problem as you  
are describing it. Processors are perfectly capable of generating  
precise interrupts, and the inability to do so has nothing to do with  
the out-of-order execution, etc. Almost all interrupts are precise.  
The only interesting one which is not, on x86 processors, is the x87  
floating point exception, which is basically for historical reasons.  
It has never been precise, ever since the actual 8087 coprocessor  
chip for the 8086. However, all is not lost: the exception cannot  
occur randomly. It can only occur on *some* floating point  
instruction, even if the instruction is not the one the error  
actually occurred in. So, unless your list insertion code uses  
floating point instructions, you should not get a floating point  
exception during your list insertion.

Also, looking forward, the "simd" floating point instructions (ie mmx/ 
sse/sse2/sse3) _do_ generate precise interrupts. And on x86-64, x87  
instructions are deprecated and everyone is recommended to use the  
simd ones, instead (so, for example, gcc defaults to using them).

James

Re: [Python-Dev] Document performance requirements?

2006-07-21 Thread James Y Knight

On Jul 21, 2006, at 12:45 PM, Giovanni Bajo wrote:
> Jason Orendorff wrote:
>
>>> However, I'm also struggling to think of a case other than list vs
>>> deque where the choice of a builtin or standard library data
>>> structure would be dictated by big-O() concerns.
>>
>> OK, but that doesn't mean the information is unimportant.  +1 on
>> making this something of a priority.  People looking for this info
>> should find it in the obvious place.  Some are unobvious. (How  
>> fast is
>> dict.__eq__ on average? Worst case?)
>
> I also found out that most people tend to think of Python's lists as a
> magical data structure optimized for many operations (like a "rope" or
> something complex like that). Documenting that it's just a bare vector
> (std::vector in C++) would be of great help.

Indeed, I was talking to someone a while back who thought that lists  
were magically hashed, in that he did something like:
dictionary = open("/usr/share/dict/words").readlines()

and then expected:
"word" in dictionary

would be fast. And was very surprised when it turned out to be a slow  
linear search of the list. :)
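
The fix in that case is a one-liner. A sketch with a small in-memory word list standing in for the words file:

```python
# Stand-in for the lines read from /usr/share/dict/words.
lines = ["apple\n", "banana\n", "word\n", "zebra\n"]

# A list membership test scans element by element.
dictionary_list = [line.strip() for line in lines]
found_in_list = "word" in dictionary_list

# A set is hashed, so membership does not depend on scanning the contents.
dictionary_set = set(dictionary_list)
found_in_set = "word" in dictionary_set
```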

James


Re: [Python-Dev] Python 2.4, VS 2005 & Profile Guided Optmization

2006-07-23 Thread James Y Knight

On Jul 23, 2006, at 4:41 PM, Giovanni Bajo wrote:
> I think Martin decided to keep VC71 (Visual Studio .NET 2003) for  
> another
> release cycle. Given the impressive results of VC8 with PGO, and  
> the fact
> that Visual Studio Express 2005 is free forever, I would hope as  
> well for
> the decision to be reconsidered.

Wasn't there a "Free Forever" 2003 edition too, which has since  
completely disappeared? Why do you think that MS won't stop  
distributing the Free Forever VS 2005 once VS 2005+1 comes out, the  
same way they did the 2003 one?

James


Re: [Python-Dev] Rounding float to int directly (Re: struct module and coercing floats to integers)

2006-08-02 Thread James Y Knight

On Aug 2, 2006, at 11:26 PM, Raymond Hettinger wrote:
> Also, -10 on changing the semantics of int() to round instead of
> truncate.  The truncating version is found is so many other languages
> and book examples, that it would be a disaster for us to choose a
> different meaning.

I'd be happy to see floats lose their __int__ method entirely,  
replaced by an explicit truncate function.

I've always thought it quite a hack that python floats have implicit  
truncation to ints, and then a random smattering of APIs go to extra  
lengths to explicitly prevent float.__int__ from being called because  
people thought "passing a float makes no sense!". That's right, it  
doesn't, and it _never_ should happen implicitly, not just in those  
particular few cases. Explicit is better than implicit.
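
A short sketch of what the explicit style looks like in a current Python 3, where math.trunc exists as the explicit truncation function and indexing with a float is rejected rather than silently truncated:

```python
import math

# Explicit truncation toward zero.
positive = math.trunc(2.7)
negative = math.trunc(-2.7)

# A float is never implicitly truncated when used as an index.
try:
    [10, 20, 30][1.0]
    float_index_rejected = False
except TypeError:
    float_index_rejected = True
```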

James


Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-03 Thread James Y Knight

On Aug 3, 2006, at 5:47 PM, M.-A. Lemburg wrote:
>> The only way this error could be the right thing is if you were  
>> trying
>> to suggest that he shouldn't mix unicode and bytestrings at all.
>
> Good question. I wonder whether that's a reasonable approach for
> Python 2.x (I'd say it is for Py3k).

It's my understanding that in py3k, there will be no implicit  
conversion, bytestrings and unicodes will never be equal (no matter  
what the contents), and so this wouldn't be an issue. (as u"1" == "1"  
would be the same sort of situation as 1 == "1" is now)
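
That is how Python 3 ended up behaving. A minimal check, run without the -b warning flag:

```python
# Bytes and text never compare equal, whatever their contents.
bytes_equal_text = (b"1" == "1")

# So they are distinct dictionary keys, just as 1 and "1" are.
mixed = {"1": "text", b"1": "bytes", 1: "int"}
```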

James

Re: [Python-Dev] Rounding float to int directly (Re: struct module and coercing floats to integers)

2006-08-03 Thread James Y Knight

On Aug 3, 2006, at 2:34 AM, Greg Ewing wrote:

> Raymond Hettinger wrote:
>
>> -1 on an extra built-in just to save the time for function call
>
> The time isn't the main issue. The main issue
> is that almost all the use cases for round()
> involve doing an int() on it afterwards. At
> least nobody has put forward an argument to
> the contrary yet.

And I bet the main reason why round() in python returns a float is  
because it does in C.

And it does in C because C doesn't have arbitrary size integers, so  
if round returned integers, round(1e+308) couldn't work. In python,  
however, that's no problem, since python does have arbitrarily big  
integers.

There's also round(float("inf")), of course, which wouldn't be  
defined if the result was an integer, but I don't think rounding  
infinity is much of a use case.

And I do think the extension of round to allow the specification of  
number of decimal places was a mistake. If you want that, you  
probably really mean to do something like round(x * 10**y) instead.
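
For the record, Python 3's one-argument round() does return an integer. A sketch of the cases mentioned above; the values of x and y are arbitrary:

```python
# One-argument round() yields an int, however large.
big = round(1e308)
big_is_int = isinstance(big, int)

# Rounding infinity has no integer answer, so it raises.
try:
    round(float("inf"))
    inf_rejected = False
except OverflowError:
    inf_rejected = True

# Scaling first, then rounding to an integer.
x, y = 3.14159, 2
scaled = round(x * 10**y)
```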

James

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-03 Thread James Y Knight


On Aug 4, 2006, at 12:34 AM, Josiah Carlson wrote:
> As an alternate idea, rather than attempting to .decode('ascii') when
> strings and unicode compare, why not .decode('latin-1')?  We lose the
> unicode decoding error, but "the right thing" happens (in my opinion)
> when u'\xa1' and '\xa1' compare.

Maybe you want those to compare equal, but _I_ want u'\xa1' and
'\xc2\xa1' to compare equal, so it should obviously use .decode('utf-8')!

(okay, no, I don't really want that.)
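
To make the ambiguity concrete, here is a minimal sketch in Python 3 syntax (the thread itself concerns Python 2; the variable names are illustrative). Which byte string "equals" u'\xa1' depends entirely on the codec chosen:

```python
target = "\xa1"              # U+00A1 INVERTED EXCLAMATION MARK
raw_latin1 = b"\xa1"         # its Latin-1 encoding (one byte)
raw_utf8 = b"\xc2\xa1"       # its UTF-8 encoding (two bytes)

latin1_matches = raw_latin1.decode("latin-1") == target
utf8_matches = raw_utf8.decode("utf-8") == target
# Decoding the UTF-8 bytes as Latin-1 gives two characters, not one.
cross_matches = raw_utf8.decode("latin-1") == target

try:
    raw_latin1.decode("ascii")   # ASCII refuses any byte >= 0x80
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

print(latin1_matches, utf8_matches, cross_matches, ascii_ok)
```

No single implicit codec makes both byte strings compare equal to the same text.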

James

Re: [Python-Dev] SyntaxError: can't assign to function call

2006-08-10 Thread James Y Knight

On Aug 10, 2006, at 12:01 PM, Josiah Carlson wrote:

>
> "Michael Urman" <[EMAIL PROTECTED]> wrote:
>>
>> On 8/9/06, Michael Hudson <[EMAIL PROTECTED]> wrote:
>>> The question doesn't make sense: in Python, you assign to a name,
>>> an attribute or a subscript, and that's it.
>>
>> Just to play devil's advocate here, why not to a function call via a
>> new __setcall__? I'm not saying there's the use case to justify it,
>> but I don't see anything that makes it a clear abomination or
>> impossible with python's syntax.
>
> Describe the syntax and semantics.  Every time I try to work them out, I
> end up with a construct that makes less than no sense, to be used in
> cases I have never seen. Further, if you want to call a method
> __setcall__ on an object just created, you can use
> 'x().__setcall__(y)'.
> There is no reason to muck up Python's syntax.

It makes just as much sense as assigning to an array access, and the  
semantics would be pretty similar. There's similarly "no reason" to  
allow x[5] = True. You can just spell that x.__setitem__(5, True).

x(*args, **kwargs) = val could translate into x.__setcall__(val,  
*args, **kwargs).
x(5) = True could translate into x.__setcall__(True, 5)
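
The analogy can be sketched in runnable form. Note that __setcall__ is purely hypothetical: Python has no such hook, so below it is an ordinary method called by hand, and the Grid class is made up for illustration.

```python
class Grid:
    """Toy container. __setcall__ is a hypothetical hook, not a real protocol."""

    def __init__(self):
        self.items = {}
        self.calls = {}

    def __setitem__(self, key, value):
        # Real protocol: invoked by "x[key] = value".
        self.items[key] = value

    def __setcall__(self, value, *args, **kwargs):
        # Python never calls this automatically; it is a plain method here.
        self.calls[(args, tuple(sorted(kwargs.items())))] = value


x = Grid()
x[5] = True               # real syntax, dispatches to __setitem__
x.__setitem__(6, True)    # the spelled-out equivalent
x.__setcall__(True, 5)    # what "x(5) = True" would translate into
```

Writing "x(5) = True" in real Python is still a SyntaxError; only the explicit method call works.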

Please note I'm actually arguing for this proposal. Just agreeing  
that it is not a completely nonsensical idea.

James

Re: [Python-Dev] SyntaxError: can't assign to function call

2006-08-10 Thread James Y Knight

On Aug 10, 2006, at 12:19 PM, James Y Knight wrote:
> Please note I'm actually arguing for this proposal. Just agreeing
> that it is not a completely nonsensical idea.

ERK! Big typo there. I meant to say:

Please note I'm NOT*** actually arguing for this proposal.

Sorry for any confusion.

James

Re: [Python-Dev] SyntaxError: can't assign to function call

2006-08-10 Thread James Y Knight

On Aug 10, 2006, at 12:24 PM, Guido van Rossum wrote:
> On 8/10/06, James Y Knight <[EMAIL PROTECTED]> wrote:
>> It makes just as much sense as assigning to an array access, and the
>> semantics would be pretty similar.
>
> No. Array references (x[i]) and attribute references (x.a) represent
> "locations". Function calls represent values. This is no different
> than the distinction between lvalues and rvalues in C.

Yes, function calls cannot be lvalues right now. However, there is no  
reason that a function call _could not_ be an lvalue. That is exactly  
what the addition of __setcall__  would allow.

On Aug 10, 2006, at 12:31 PM, Phillip J. Eby wrote:
> Honestly, it might make more sense to get rid of augmented  
> assignment in Py3K rather than to add this.  It seems that the need  
> for something like this springs primarily from the existence of  
> augmented assignment.

It makes just as much (and just as little) sense to have normal  
assignment to function calls as it does augmented assignment to  
function calls. I don't see any reason to single out augmented  
assignment here.

Anyhow, enough time wasted on this. I don't really think python  
should add this feature, but it _does_ make sense, and would have  
understandable and consistent semantics if it were added.

James

Re: [Python-Dev] SyntaxError: can't assign to function call

2006-08-10 Thread James Y Knight

On Aug 10, 2006, at 4:57 PM, Phillip J. Eby wrote:

> However, I'm also not clear that trying to assign to a function  
> call *is* ill-advised.  One of the things that attracted me to  
> Python in the first place is that it had a lot of features that  
> would be considered "hypergeneralization" in other languages, e.g.  
> the ability to create your own sequences, mappings, and callable  
> objects in the first place.
>
> That being said, the benefit of hypergeneralizing assignment seems  
> small compared to its price.

Well, it's a mostly obvious extension of an existing idea, so the  
price doesn't seem all that high. The main problem is that so far,  
there have been 0 convincing use cases. So no matter how moderate the  
price, it's definitely bigger than the benefit. But anyhow, speaking  
of hypergeneralization...since this has 0 use cases anyhow, might as  
well hyperhypergeneralize it...

Well, why should assignment be limited to only local variables, item
access, and function calls? Why shouldn't you be able to potentially
assign to _any_ expression?

Since
x + a turns into (very roughly...) x.__add__(a),

then,
x + a = 5 could turn into x.__add__.__setcall__(5, a).

Of course, since normal __add__ functions don't have a __setcall__,  
doing this will raise an error. But, a user defined __add__ could  
have one! And what would such a user defined __add__.__setcall__  
actually *do*? Well, that would be a use case, and I sure don't have  
any of those!

Ta Da. Who's going to make the patch? ;)

James

Re: [Python-Dev] Type of range object members

2006-08-15 Thread James Y Knight


On Aug 15, 2006, at 6:20 PM, Martin v. Löwis wrote:
> Guido van Rossum schrieb:
>> From the Python *user*'s perspective, yes, as much as possible. But
>> I'm still playing with the thought of having two implementation types,
>> since otherwise we'd have to devote 4 bytes (8 on a 64-bit platform)
>> to the single *bit* telling the difference between the two internal
>> representations.
>
> We had this discussion before; if you use ob_size==0 to indicate
> that it's an int, this space isn't needed in a long int. On a 32-bit
> platform, the size of an int would go up from 12 to 16; if we stop
> using a special-cased allocator (which we should (*)), there isn't
> any space increase on such a platform. On a 64-bit platform, the
> size of an int would go up from 24 bytes to 32 bytes.

But it's the short int that you probably really want to make size  
efficient. Which is of course also doable via something like:

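typedef struct {
 PyObject_HEAD
 long ob_islong : 1;                   /* nonzero if this is a long int */
 long ob_ival_or_size : LONG_BITS - 1; /* short int value, or digit count */
 long ob_digit[0];                     /* digits, used only for long ints */
} PyIntObject;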

There's no particular reason that a short int must be able to store
the entire range of C "long", so as many bits as desired can be
stolen from it.
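
The bit-stealing idea can be simulated in Python to show what it costs in range. This is only a sketch under assumptions: a 64-bit word, the flag in the lowest bit (a C compiler is free to lay out bitfields differently), and made-up names pack/unpack.

```python
WORD_BITS = 64                    # assumed width of a C long
PAYLOAD_BITS = WORD_BITS - 1      # one bit is taken by the flag
PAYLOAD_MIN = -(1 << (PAYLOAD_BITS - 1))
PAYLOAD_MAX = (1 << (PAYLOAD_BITS - 1)) - 1


def pack(is_long, payload):
    """Store a flag bit and a signed payload in a single word."""
    if not PAYLOAD_MIN <= payload <= PAYLOAD_MAX:
        raise OverflowError("payload does not fit in %d bits" % PAYLOAD_BITS)
    # Two's-complement payload in the upper bits, flag in the lowest bit.
    masked = payload & ((1 << PAYLOAD_BITS) - 1)
    return (masked << 1) | (1 if is_long else 0)


def unpack(word):
    """Recover (is_long, payload) from a packed word."""
    is_long = bool(word & 1)
    payload = word >> 1
    if payload > PAYLOAD_MAX:     # undo the two's-complement wrap
        payload -= 1 << PAYLOAD_BITS
    return is_long, payload
```

With one stolen bit, the short-int range shrinks from 64 to 63 bits; anything outside it would have to use the long representation.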

James
