>> Then perhaps you misunderstand the goal of the decorator module.
>> The raison d'etre of the module is to PRESERVE the signature:
>> update_wrapper unfortunately *changes* it.
>>
>> When confronted with a library which I do not know, I often run
>> pydoc over it, or sphinx, or a custom made
Barry Warsaw writes:
> There are really two ways to look at an email message. It's either an
> unstructured blob of bytes, or it's a structured tree of objects.
Indeed!
> Those objects have headers and payload. The payload can be of any
> type, though I think it generally breaks down i
On 03:21 am, ncogh...@gmail.com wrote:
Barry Warsaw wrote:
I don't know whether the parameter thing will work or not, but you're
probably right that we need to get the bytes-everywhere API first.
Given that json is a wire protocol, that sounds like the right approach
for json as well. Once
On 02:38 am, ba...@python.org wrote:
So, what I'm really asking is this. Let's say you agree that there
are use cases for accessing a header value as either the raw encoded
bytes or the decoded unicode. What should this return:
>>> message['Subject']
The raw bytes or the decoded unicode?
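With hindsight, both answers can be made available side by side. The sketch below uses the policy objects that the email package gained in later Python 3 releases; it is not the API being designed in this thread, and the RAW message is a made-up example. Note that the legacy lookup returns the still-encoded header as a str, not as bytes.

```python
import email
from email import policy

# Hypothetical message: the subject "Café" written as an RFC 2047 encoded word.
RAW = b"Subject: =?utf-8?q?Caf=C3=A9?=\r\n\r\nbody\r\n"

# The default policy (compat32) hands back the header text as it appeared
# on the wire, encoded word and all.
legacy = email.message_from_bytes(RAW)

# policy.default parses the header and decodes encoded words on access.
modern = email.message_from_bytes(RAW, policy=policy.default)

raw_subject = legacy["Subject"]
decoded_subject = str(modern["Subject"])

print(repr(raw_subject))
print(repr(decoded_subject))
```

The same subscript syntax serves both views; the policy chosen at parse time decides which one message['Subject'] gives you.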
...
>> Somewhat true, though I know it happens 25k times during startup of
>> bzr... And I would be a *lot* happier if startup time was 100ms instead
>> of 400ms.
>
> I don't want to quash your idealism too severely, but it is extremely
> unlikely that you are going to get anywhere near that kind
On Thu, Apr 9, 2009 at 5:53 AM, Aahz wrote:
> On Thu, Apr 09, 2009, Nick Coghlan wrote:
>>
>> Martin v. Löwis wrote:
>>>> Such a policy would then translate to a dead end for Python 2.x
>>>> based applications.
>>>
>>> 2.x based applications *are* in a dead end, with the only exit
>>> being portag
On Thu, Apr 9, 2009 at 9:07 PM, Collin Winter wrote:
> On Thu, Apr 9, 2009 at 6:24 PM, John Arbash Meinel
> wrote:
> > And I would be a *lot* happier if startup time was 100ms instead
> > of 400ms.
>
> Quite so. We have a number of internal tools, and they find that
> frequently just starting up
On Thu, Apr 9, 2009 at 6:24 PM, John Arbash Meinel
wrote:
> Greg Ewing wrote:
>> John Arbash Meinel wrote:
>>> And the way intern is currently
>>> written, there is a third cost when the item doesn't exist yet, which is
>>> another lookup to insert the object.
>>
>> That's even rarer still, since
At 22:26 -0400 04/09/2009, Barry Warsaw wrote:
>There are really two ways to look at an email message. It's either an
>unstructured blob of bytes, or it's a structured tree of objects.
>Those objects have headers and payload. The payload can be of any
>type, though I think it generally breaks do
On Wed, Apr 8, 2009 at 9:31 PM, Michele Simionato
wrote:
> Then perhaps you misunderstand the goal of the decorator module.
> The raison d'etre of the module is to PRESERVE the signature:
> update_wrapper unfortunately *changes* it.
>
> When confronted with a library which I do not know, I oft
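A small illustration of the signature problem, written against present-day Python rather than the 2009 tools under discussion. It assumes inspect.signature (which arrived later, via PEP 362) and its follow_wrapped flag; the traced decorator and the add function are hypothetical names chosen for the example.

```python
import functools
import inspect


def traced(func):
    """A typical decorator built on functools.wraps (i.e. update_wrapper)."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


@traced
def add(x, y=2):
    return x + y


# The wrapper's own signature is the generic one: the metadata was copied,
# but the parameters of the wrapper function itself are *args/**kwargs.
own_signature = str(inspect.signature(add, follow_wrapped=False))

# Tools that follow the __wrapped__ attribute set by update_wrapper can
# still recover the original signature.
followed_signature = str(inspect.signature(add))

print(own_signature)
print(followed_signature)
```

Documentation tools that only look at the wrapper itself see the generic signature, which is the complaint being made here.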
On 9-Apr-09, at 6:24 PM, John Arbash Meinel wrote:
Greg Ewing wrote:
John Arbash Meinel wrote:
And the way intern is currently
written, there is a third cost when the item doesn't exist yet,
which is
another lookup to insert the object.
That's even rarer still, since it only happens the
At 22:38 -0400 04/09/2009, Barry Warsaw wrote:
...
>So, what I'm really asking is this. Let's say you agree that there
>are use cases for accessing a header value as either the raw encoded
>bytes or the decoded unicode. What should this return:
>
> >>> message['Subject']
>
>The raw bytes or the
On Apr 9, 2009, at 11:21 PM, Nick Coghlan wrote:
Barry Warsaw wrote:
I don't know whether the parameter thing will work or not, but you're
probably right that we need to get the bytes-everywhere API first.
Given that json is a wire protocol, that sounds like the right
approach
for json as
Barry Warsaw wrote:
> I don't know whether the parameter thing will work or not, but you're
> probably right that we need to get the bytes-everywhere API first.
Given that json is a wire protocol, that sounds like the right approach
for json as well. Once bytes-everywhere works, then a text API ca
On Apr 9, 2009, at 10:52 PM, Aahz wrote:
On Thu, Apr 09, 2009, Barry Warsaw wrote:
So, what I'm really asking is this. Let's say you agree that there
are
use cases for accessing a header value as either the raw encoded
bytes or
the decoded unicode. What should this return:
message['Su
On Apr 9, 2009, at 11:11 PM, gl...@divmod.com wrote:
I think this is a problematic way to model bytes vs. text; it gives
text a special relationship to bytes which should be avoided.
IMHO the right way to think about domains like this is a multi-level
representation. The "low level" repres
On 02:26 am, ba...@python.org wrote:
There are really two ways to look at an email message. It's either an
unstructured blob of bytes, or it's a structured tree of objects.
Those objects have headers and payload. The payload can be of any
type, though I think it generally breaks down into "s
On Thu, Apr 09, 2009, Barry Warsaw wrote:
>
> So, what I'm really asking is this. Let's say you agree that there are
> use cases for accessing a header value as either the raw encoded bytes or
> the decoded unicode. What should this return:
>
> >>> message['Subject']
>
> The raw bytes or the de
On Apr 9, 2009, at 12:20 PM, Steve Holden wrote:
PostgreSQL strongly encourages you to store text as encoded columns.
Because emails lack an encoding it turns out this is a most
inconvenient
storage type for it. Sadly BLOBs are such a pain in PostgreSQL that
it's
easier to store the message
On Apr 9, 2009, at 2:25 PM, Martin v. Löwis wrote:
This is an interesting question, and something I'm struggling with
for
the email package for 3.x. It turns out to be pretty convenient to
have
both a bytes and a string API, both for input and output, but I think
email really wants to be re
On Apr 9, 2009, at 11:55 AM, Daniel Stutzbach wrote:
On Thu, Apr 9, 2009 at 6:01 AM, Barry Warsaw wrote:
Anyway, aside from that decision, I haven't come up with an elegant
way to allow /output/ in both bytes and strings (input is I think
theoretically easier by sniffing the arguments).
W
On Apr 9, 2009, at 11:08 AM, Bill Janssen wrote:
Barry Warsaw wrote:
Anyway, aside from that decision, I haven't come up with an
elegant way to allow /output/ in both bytes and strings (input is I
think theoretically easier by sniffing the arguments).
Probably a good thing. It just promote
On Apr 9, 2009, at 8:07 AM, Steve Holden wrote:
The real problem I came across in storing email in a relational
database
was the inability to store messages as Unicode. Some messages have a
body in one encoding and an attachment in another, so the only ways to
store the messages are either as
cmake does not produce relative paths in its generated make and
project files. There is an option CMAKE_USE_RELATIVE_PATHS which
appears to do this but the documentation says:
"""This option does not work for more complicated projects, and
relative paths are used when possible. In general, it i
Greg Ewing wrote:
> John Arbash Meinel wrote:
>> And the way intern is currently
>> written, there is a third cost when the item doesn't exist yet, which is
>> another lookup to insert the object.
>
> That's even rarer still, since it only happens the first
> time you load a piece of code that use
John Arbash Meinel wrote:
And the way intern is currently
written, there is a third cost when the item doesn't exist yet, which is
another lookup to insert the object.
That's even rarer still, since it only happens the first
time you load a piece of code that uses a given variable
name anywhere
2009/4/9 Greg Ewing :
> John Arbash Meinel wrote:
>>
>> And when you look at the intern function, it doesn't use
>> setdefault logic, it actually does a get() followed by a set(), which
>> means the cost of interning is 1-2 lookups depending on likelihood, etc.
>
> Keep in mind that intern() is cal
John Arbash Meinel wrote:
And when you look at the intern function, it doesn't use
setdefault logic, it actually does a get() followed by a set(), which
means the cost of interning is 1-2 lookups depending on likelihood, etc.
Keep in mind that intern() is called fairly rarely, mostly
only at mo
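The lookup counts being debated can be made visible with a toy model. This is a sketch in pure Python, not the C code of intern(); CountingKey and both helper functions are hypothetical, and the counts assume CPython's dict, which hashes the key once per get(), once per item assignment, and once per setdefault().

```python
class CountingKey:
    """A key that records how many times it is hashed."""

    def __init__(self, name):
        self.name = name
        self.hash_calls = 0

    def __hash__(self):
        self.hash_calls += 1
        return hash(self.name)

    def __eq__(self, other):
        return isinstance(other, CountingKey) and self.name == other.name


def intern_get_then_set(table, key):
    """The pattern described above: a get(), then a set() on a miss."""
    existing = table.get(key)
    if existing is not None:
        return existing
    table[key] = key
    return key


def intern_setdefault(table, key):
    """The single-lookup alternative."""
    return table.setdefault(key, key)


miss_two_step = CountingKey("spam")
intern_get_then_set({}, miss_two_step)

miss_one_step = CountingKey("spam")
intern_setdefault({}, miss_one_step)

print(miss_two_step.hash_calls, miss_one_step.hash_calls)
```

On a hit both versions cost one lookup, so the difference only shows up the first time a given string is interned, which is the point of the reply quoted above.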
On Apr 9, 2009, at 12:06 PM, Martin v. Löwis wrote:
Now that you brought up specific numbers, I tried to verify them,
and found them correct (although a bit unfortunate), please see my
test script below. Up to 21800 interned strings, the dict takes (only)
384kiB. It then grows, requiring 1536ki
Oleg Broytmann wrote:
> On Thu, Apr 09, 2009 at 04:42:21PM -0400, Steve Holden wrote:
>> If I can't pass a 256-byte string into a BLOB and get it back without
>> anything like this happening then there's *something* in the chain that
>> makes the database useless.
>
> import psycopg2
>
> con = ps
> Also, consider that resizing has to evaluate every object, thus paging
> in all X bytes, and assigning to another 2X bytes. Cutting X by
> (potentially 3), would probably have a small but measurable effect.
I'm *very* skeptical about claims on performance in the absence of
actual measurements. T
> As far as Python 3 goes, I honestly have not yet familiarized myself
> with the changes to the IO infrastructure and what the new idioms are.
> At this time, I can't make any educated decisions with regard to how
> it should be done because I don't know exactly how bytes are supposed
> to work an
On Thu, Apr 9, 2009 at 1:05 PM, "Martin v. Löwis" wrote:
>>> I can understand that you don't want to spend much time on it. How
>>> about removing it from 3.1? We could re-add it when long-term support
>>> becomes more likely.
>>
>> I'm speechless.
>
> It seems that my statement has surprised you,
On Thu, Apr 09, 2009 at 04:42:21PM -0400, Steve Holden wrote:
> If I can't pass a 256-byte string into a BLOB and get it back without
> anything like this happening then there's *something* in the chain that
> makes the database useless.
import psycopg2
con = psycopg2.connect(database="test")
cur
On Thu, Apr 09, 2009, Steve Holden wrote:
>
> import psycopg2 as db
> conn = db.connect(database="maildb", user="@@@", password="@@@",
> host="localhost", port=5432)
> curs = conn.cursor()
> curs.execute("DELETE FROM tst")
> curs.execute("INSERT INTO tst (byt) VALUES (%s)",
> ("".join(
Tony Nelson wrote:
> At 21:24 +0400 04/09/2009, Oleg Broytmann wrote:
>> On Thu, Apr 09, 2009 at 01:14:21PM -0400, Tony Nelson wrote:
>>> I use MySQL, but sort of intend to learn PostgreSQL. I didn't know that
>>> PostgreSQL has no real support for BLOBs.
>> I think it has - BYTEA data type.
>
Martin v. Löwis wrote:
>> I don't have numbers on how much that would improve CPU times, I would
>> imagine improving 'intern()' would impact import times more than run
>> times, simply because import time is interning a *lot* of strings.
>>
>> Though honestly, Bazaar would really like this, becaus
Alexandre Vassalotti wrote:
> On Thu, Apr 9, 2009 at 1:15 AM, Antoine Pitrou wrote:
>> As for reading/writing bytes over the wire, JSON is often used in the same
>> context as HTML: you are supposed to know the charset and decode/encode the
>> payload using that charset. However, the RFC specifies
> I don't have numbers on how much that would improve CPU times, I would
> imagine improving 'intern()' would impact import times more than run
> times, simply because import time is interning a *lot* of strings.
>
> Though honestly, Bazaar would really like this, because startup overhead
> for us
>> I can understand that you don't want to spend much time on it. How
>> about removing it from 3.1? We could re-add it when long-term support
>> becomes more likely.
>
> I'm speechless.
It seems that my statement has surprised you, so let me explain:
I think we should refrain from making design
...
> I like your rationale (save memory) much more, and was asking in the
> tracker for specific numbers, which weren't forthcoming.
>
...
> Now that you brought up specific numbers, I tried to verify them,
> and found them correct (although a bit unfortunate), please see my
> test script b
On Thu, Apr 9, 2009 at 1:15 AM, Antoine Pitrou wrote:
> As for reading/writing bytes over the wire, JSON is often used in the same
> context as HTML: you are supposed to know the charset and decode/encode the
> payload using that charset. However, the RFC specifies a default encoding of
> utf-8. (
Hi Dan,
Thanks for your interest.
2009/4/6 Dan Schult :
> Hi,
> I'm trying to write a C extension which is a subclass of dict.
> I want to do something like a setdefault() but with a single lookup.
>
> Looking through the dictobject code, the three workhorse
> routines lookdict, insertdict and dic
> So I guess some of it comes down to whether "loweis" would also reject
> this change on the basis that mathematically a "set is not a dict".
I'd like to point out that this was not the reason to reject it.
Instead, this (or, the opposite of it) was given as a reason why this
patch should be acce
At 21:24 +0400 04/09/2009, Oleg Broytmann wrote:
>On Thu, Apr 09, 2009 at 01:14:21PM -0400, Tony Nelson wrote:
>> I use MySQL, but sort of intend to learn PostgreSQL. I didn't know that
>> PostgreSQL has no real support for BLOBs.
>
> I think it has - BYTEA data type.
So it does; I see that now
> This is an interesting question, and something I'm struggling with for
> the email package for 3.x. It turns out to be pretty convenient to have
> both a bytes and a string API, both for input and output, but I think
> email really wants to be represented internally as bytes. Maybe. Or
> maybe
Alexander Belopolsky wrote:
> On Thu, Apr 9, 2009 at 11:02 AM, John Arbash Meinel
> wrote:
> ...
>> a) Don't keep a double reference to both key and value to the same
>> object (1 pointer per entry), this could be as simple as using a
>> Set() instead of a dict()
>>
>
> There is a reject
Oleg Broytmann wrote:
> On Thu, Apr 09, 2009 at 01:14:21PM -0400, Tony Nelson wrote:
>> I use MySQL, but sort of intend to learn PostgreSQL. I didn't know that
>> PostgreSQL has no real support for BLOBs.
>
> I think it has - BYTEA data type.
>
But the Python DB adapters appear to require so
Christian Heimes wrote:
> John Arbash Meinel wrote:
>> When I looked at the actual references from interned, I saw mostly
>> variable names. Considering that every variable goes through the python
>> intern dict. And when you look at the intern function, it doesn't use
>> setdefault logic, it actua
On Thu, Apr 09, 2009 at 01:14:21PM -0400, Tony Nelson wrote:
> I use MySQL, but sort of intend to learn PostgreSQL. I didn't know that
> PostgreSQL has no real support for BLOBs.
I think it has - BYTEA data type.
Oleg.
--
Oleg Broytmann    http://phd.pp.ru/    p...@phd.p
(email-sig dropped, as I didn't see Steve Holden's message there)
At 12:20 -0400 04/09/2009, Steve Holden wrote:
>Tony Nelson wrote:
...
>> If you need the data from the message, by all means extract it and store it
>> in whatever form is useful to the purpose of the database. If you need the
>>
John Arbash Meinel wrote:
> When I looked at the actual references from interned, I saw mostly
> variable names. Considering that every variable goes through the python
> intern dict. And when you look at the intern function, it doesn't use
> setdefault logic, it actually does a get() followed by a
On Thu, Apr 9, 2009 at 9:34 AM, John Arbash Meinel
wrote:
> ...
>
>>> Anyway, I think the internals of intern() could be done a bit better. Here are
>>> some concrete things:
>>>
>>
>> [snip]
>>
>> Memory usage is definitely something we're interested in improving.
>> Since you've already looked at this
...
>> Anyway, I think the internals of intern() could be done a bit better. Here are
>> some concrete things:
>>
>
> [snip]
>
> Memory usage is definitely something we're interested in improving.
> Since you've already looked at this in some detail, could you try
> implementing one or two of your
Hi John,
On Thu, Apr 9, 2009 at 8:02 AM, John Arbash Meinel
wrote:
> I've been doing some memory profiling of my application, and I've found
> some interesting results with how intern() works. I was pretty surprised
> to see that the "interned"
Tony Nelson wrote:
> (email-sig added)
>
> At 08:07 -0400 04/09/2009, Steve Holden wrote:
>> Barry Warsaw wrote:
> ...
>>> This is an interesting question, and something I'm struggling with for
>>> the email package for 3.x. It turns out to be pretty convenient to have
>>> both a bytes and a str
(email-sig added)
At 08:07 -0400 04/09/2009, Steve Holden wrote:
>Barry Warsaw wrote:
...
>> This is an interesting question, and something I'm struggling with for
>> the email package for 3.x. It turns out to be pretty convenient to have
>> both a bytes and a string API, both for input and outp
On Thu, Apr 9, 2009 at 6:01 AM, Barry Warsaw wrote:
> Anyway, aside from that decision, I haven't come up with an elegant way to
> allow /output/ in both bytes and strings (input is I think theoretically
> easier by sniffing the arguments).
>
Won't this work? (assuming dumps() always returns a s
On Thu, Apr 9, 2009 at 17:31, Aahz wrote:
> Please do subscribe to python-dev ASAP; I also suggest that you subscribe
> to python-ideas, because I suspect that this is sufficiently blue-sky to
> start there.
It might also be interesting to the unladen-swallow guys.
Cheers,
Dirkjan
On Thu, Apr 09, 2009, John Arbash Meinel wrote:
>
> PS> I'm not yet subscribed to python-dev, so if you could make sure to
> CC me in replies, I would appreciate it.
Please do subscribe to python-dev ASAP; I also suggest that you subscribe
to python-ideas, because I suspect that this is sufficient
I've been doing some memory profiling of my application, and I've found
some interesting results with how intern() works. I was pretty surprised
to see that the "interned" dict was actually consuming a significant
amount of total memory.
To give the sp
Barry Warsaw wrote:
> Anyway, aside from that decision, I haven't come up with an
> elegant way to allow /output/ in both bytes and strings (input is I
> think theoretically easier by sniffing the arguments).
Probably a good thing. It just promotes more confusion to do things
that way, IMO.
-On [20090409 15:41], Benjamin Peterson (benja...@python.org) wrote:
>It seems your Makefile is outdated. We moved the _fileio.c module
>around a few days, so maybe you just need a make distclean.
Yes, that was the cause. Thanks Benjamin.
--
Jeroen Ruigrok van der Werven / asmodai
イェルーン
2009/4/9 Jeroen Ruigrok van der Werven :
> Just to make sure I am not doing something silly, with a configure line as
> such: ./configure --prefix=/home/asmodai/local --with-wide-unicode
> --with-pymalloc --with-threads --with-computed-gotos, would there be any
> reason why I am getting the followi
Just to make sure I am not doing something silly, with a configure line as
such: ./configure --prefix=/home/asmodai/local --with-wide-unicode
--with-pymalloc --with-threads --with-computed-gotos, would there be any
reason why I am getting the following error with both BSD make and gmake:
make: don
Aahz wrote:
> On Thu, Apr 09, 2009, Nick Coghlan wrote:
>> Martin v. Löwis wrote:
>>>> Such a policy would then translate to a dead end for Python 2.x
>>>> based applications.
>>> 2.x based applications *are* in a dead end, with the only exit
>>> being portage to 3.x.
>> The actual end of the dead
Michele Simionato wrote:
> On Thu, Apr 9, 2009 at 2:11 PM, Nick Coghlan wrote:
>> One of my hopes for PEP 362 was that I would be able to just add
>> __signature__ to the list of copied attributes, but that PEP is
>> currently short a champion to work through the process of resolving the
>> open i
On Thu, Apr 09, 2009, Nick Coghlan wrote:
>
> Martin v. Löwis wrote:
>>> Such a policy would then translate to a dead end for Python 2.x
>>> based applications.
>>
>> 2.x based applications *are* in a dead end, with the only exit
>> being portage to 3.x.
>
> The actual end of the dead end just ha
On Thu, Apr 9, 2009 at 2:11 PM, Nick Coghlan wrote:
> One of my hopes for PEP 362 was that I would be able to just add
> __signature__ to the list of copied attributes, but that PEP is
> currently short a champion to work through the process of resolving the
> open issues and creating an up to dat
Martin v. Löwis wrote:
> Nick Coghlan wrote:
>> Dirkjan Ochtman wrote:
>>> I have a stab at an author map at http://dirkjan.ochtman.nl/author-map.
>>> Could use some review, but it seems like a good start.
>> Martin may be able to provide a better list of names based on the
>> checkin name<->SSH pu
Michele Simionato wrote:
> On Wed, Apr 8, 2009 at 7:51 PM, Guido van Rossum wrote:
>> There was a remark (though perhaps meant humorously) in Michele's page
>> about decorators that worried me too: "For instance, typical
>> implementations of decorators involve nested functions, and we all
>> know
Barry Warsaw wrote:
> On Apr 9, 2009, at 1:15 AM, Antoine Pitrou wrote:
>
>> Guido van Rossum python.org> writes:
>>>
>>> I'm kind of surprised that a serialization protocol like JSON wouldn't
>>> support reading/writing bytes (as the serialized format -- I don't
>>> care about having bytes as va
On Thu, Apr 9, 2009 at 13:10, Antoine Pitrou wrote:
> Sure, but then:
>
> >>> json.loads('[]')
> []
> >>> json.loads(u'[]'.encode('utf16'))
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> File "/home/antoine/cpython/__svn__/Lib/json/__init__.py", line 310, in loads
> return _defau
Nick Coghlan wrote:
Eric Smith wrote:
And as a reminder, the py3k-short-float-repr changes are on Rietveld at
http://codereview.appspot.com/33084/show. So far, no comments.
Looks like you were able to delete some fairly respectable chunks of
redundant code!
Wait until you see how much nasty
Dirkjan Ochtman ochtman.nl> writes:
>
> The RFC states
> that JSON-text = object / array, meaning "loads" for '"hi"' isn't
> strictly valid.
Sure, but then:
>>> json.loads('[]')
[]
>>> json.loads(u'[]'.encode('utf16'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
File "/home/anto
Martin v. Löwis wrote:
>> Such a policy would then translate to a dead end for Python 2.x
>> based applications.
>
> 2.x based applications *are* in a dead end, with the only exit
> being portage to 3.x.
The actual end of the dead end just happens to be in 2013 or so :)
Cheers,
Nick.
--
Nick C
On Apr 9, 2009, at 1:15 AM, Antoine Pitrou wrote:
Guido van Rossum python.org> writes:
I'm kind of surprised that a serialization protocol like JSON
wouldn't
support reading/writing bytes (as the serialized format -- I don't
care about having
Eric Smith wrote:
> And as a reminder, the py3k-short-float-repr changes are on Rietveld at
> http://codereview.appspot.com/33084/show. So far, no comments.
I skipped over the actual number crunching parts (the test suite will do
a better job than I will of telling you whether or not you have thos
On Thu, Apr 9, 2009 at 07:15, Antoine Pitrou wrote:
> The RFC also specifies a discrimination algorithm for non-supersets of ASCII
> (“Since the first two characters of a JSON text will always be ASCII
> characters [RFC0020], it is possible to determine whether an octet
> stream is UTF-8, UTF-
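The discrimination rule quoted from the RFC can be sketched in a few lines. This is an illustration of the null-byte pattern test from RFC 4627 section 3, not the json module's own code; detect_json_encoding and decode_json_bytes are hypothetical helper names. It assumes the input has no byte order mark, so the output of encode('utf16') (which prepends one) is outside what this pattern covers.

```python
import json


def detect_json_encoding(data: bytes) -> str:
    """Guess the encoding from where the null bytes fall in the first four octets.

    The first two characters of a JSON text are ASCII, so:
        00 00 00 xx  -> UTF-32BE      xx 00 00 00  -> UTF-32LE
        00 xx 00 xx  -> UTF-16BE      xx 00 xx 00  -> UTF-16LE
        anything else -> UTF-8
    Inputs shorter than four bytes match none of the patterns and
    fall through to UTF-8.
    """
    nulls = tuple(byte == 0 for byte in data[:4])
    if nulls == (True, True, True, False):
        return "utf-32-be"
    if nulls == (True, False, True, False):
        return "utf-16-be"
    if nulls == (False, True, True, True):
        return "utf-32-le"
    if nulls == (False, True, False, True):
        return "utf-16-le"
    return "utf-8"


def decode_json_bytes(data: bytes):
    """Decode with the detected encoding, then parse the resulting text."""
    return json.loads(data.decode(detect_json_encoding(data)))


print(detect_json_encoding("[]".encode("utf-16-le")))
print(decode_json_bytes("[1, 2]".encode("utf-32-be")))
```

The test only works because the grammar guarantees two leading ASCII characters, which ties it to the "object or array" restriction mentioned elsewhere in this thread.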
[Antoine Pitrou]
Besides, Bob doesn't really seem to care about
porting to py3k (he hasn't said anything about it until now, other than that he
didn't feel competent to do it).
His actual words were: "I will need some help with 3.0 since I am not well versed in the changes to the C API or Pyth