Re: reading internet data to generate random numbers.
Grant Edwards <[EMAIL PROTECTED]> wrote:
> Doesn't your OS have an entropy-gathering RN generator built-in?

Alternatively, if you want lots of high-quality random numbers, buy
a cheap web camera: http://www.lavarnd.org/ . Using data from the
Internet is just a bad idea.

  Neil
-- 
http://mail.python.org/mailman/listinfo/python-list
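As a minimal sketch of the built-in generator Grant alludes to: Python exposes the operating system's entropy source through os.urandom() and random.SystemRandom. The byte count and the dice-roll range below are arbitrary choices for illustration, not anything from the thread.

```python
import os
import random

# Ask the operating system for 16 bytes from its entropy-backed generator.
raw = os.urandom(16)

# SystemRandom offers the familiar random-module API on top of the
# same OS source, so no seeding is involved.
rng = random.SystemRandom()
roll = rng.randint(1, 6)   # integer in the closed range 1..6
sample = rng.random()      # float in the half-open range [0.0, 1.0)

print(len(raw), roll, sample)
```

Neither call touches the network, which is the point of the advice above.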
Re: Bitching about the documentation...
François Pinard <[EMAIL PROTECTED]> wrote:
>[AMK]
>> You may suggest that I should process my e-mail more promptly.
>
> No, I'm not suggesting you how to work, no more that I would accept that
> you force me into working your way. If any of us wants to force the
> other to speak through robots, that one is not far from unspeakable...
>
>> This is why things need to go into public trackers, or wiki pages.
>
> Whatever means the maintainer wants to fill his preservation needs, he
> is free to use them. The problem arises when the maintainer wants
> imposing his own work methods on others. Let contributors be merely
> contributors, and learn how to recognise contributions as such and say
> thank you, instead of trying to turn contributors into maintainers.

Either I don't understand what you are saying or you are being a
hypocrite. Andrew is saying that he doesn't have time to deal with
all the messages that get sent to him personally. What do you propose
he should do? I think people expect more than a message saying "Thanks
for your contribution. PS: Since I don't have time to do anything with
it, your message will now be discarded.".

  Neil
PEP: Generalised String Coercion
The title is perhaps a little too grandiose but it's the best I
could think of. The change is really not large. Personally, I
would be happy enough if only %s was changed and the built-in was
not added. Please comment.
Neil
PEP: 349
Title: Generalised String Coercion
Version: $Revision: 1.2 $
Last-Modified: $Date: 2005/08/06 04:05:48 $
Author: Neil Schemenauer <[EMAIL PROTECTED]>
Status: Draft
Type: Standards Track
Content-Type: text/plain
Created: 02-Aug-2005
Post-History: 06-Aug-2005
Python-Version: 2.5
Abstract
This PEP proposes the introduction of a new built-in function,
text(), that provides a way of generating a string representation
of an object without forcing the result to be a particular string
type. In addition, the behavior of the %s format specifier would be
changed to call text() on the argument. These two changes would
make it easier to write library code that can be used by
applications that use only the str type and by others that also
use the unicode type.
Rationale
Python has had a Unicode string type for some time now but use of
it is not yet widespread. There is a large amount of Python code
that assumes that string data is represented as str instances.
The long term plan for Python is to phase out the str type and use
unicode for all string data. Clearly, a smooth migration path
must be provided.
We need to upgrade existing libraries, written for str instances,
so that they are capable of operating in an all-unicode string world.
We can't change to an all-unicode world until all essential
libraries are capable of it. Upgrading the libraries in one
shot does not seem feasible. A more realistic strategy is to
individually make the libraries capable of operating on unicode
strings while preserving their current all-str environment
behaviour.
First, we need to be able to write code that can accept unicode
instances without attempting to coerce them to str instances. Let
us label such code as Unicode-safe. Unicode-safe libraries can be
used in an all-unicode world.
Second, we need to be able to write code that, when provided only
str instances, will not create unicode results. Let us label such
code as str-stable. Libraries that are str-stable can be used by
libraries and applications that are not yet Unicode-safe.
Sometimes it is simple to write code that is both str-stable and
Unicode-safe. For example, the following function just works:
    def appendx(s):
        return s + 'x'
That's not too surprising since the unicode type is designed to
make the task easier. The principle is that when str and unicode
instances meet, the result is a unicode instance. One notable
difficulty arises when code requires a string representation of an
object, an operation traditionally accomplished by using the str()
built-in function.
Using str() makes the code not Unicode-safe. Replacing a str()
call with a unicode() call makes the code not str-stable. Using a
string format almost accomplishes the goal but not quite.
Consider the following code:
    def text(obj):
        return '%s' % obj
It behaves as desired except if 'obj' is not a basestring instance
and needs to return a Unicode representation of itself. In that
case, the string format will attempt to coerce the result of
__str__ to a str instance. Defining a __unicode__ method does not
help since it will only be called if the right-hand operand is a
unicode instance. Using a unicode instance for the right-hand
operand does not work because the function is no longer str-stable
(i.e. it will coerce everything to unicode).
Specification
A Python implementation of the text() built-in follows:
    def text(s):
        """Return a nice string representation of the object. The
        return value is a basestring instance.
        """
        if isinstance(s, basestring):
            return s
        r = s.__str__()
        if not isinstance(r, basestring):
            raise TypeError('__str__ returned non-string')
        return r
Note that it is currently possible, although not very useful, to
write __str__ methods that return unicode instances.
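To make the behaviour of the proposed text() concrete, here is a small usage sketch. It repeats the function from the specification so that it runs on its own. The classes Temperature and Broken are hypothetical examples, and the basestring fallback is an assumption added only so the sketch also runs on interpreters that have a single string type.

```python
try:
    basestring  # Python 2: common base class of str and unicode
except NameError:
    basestring = str  # assumption: fallback where only one string type exists


def text(s):
    """Return a nice string representation of the object.

    Strings pass through untouched; any other object is asked for its
    __str__ result, which is returned without coercion.
    """
    if isinstance(s, basestring):
        return s
    r = s.__str__()
    if not isinstance(r, basestring):
        raise TypeError('__str__ returned non-string')
    return r


class Temperature(object):  # hypothetical example class
    def __init__(self, degrees):
        self.degrees = degrees

    def __str__(self):
        return '%d degrees' % self.degrees


class Broken(object):  # hypothetical class whose __str__ is faulty
    def __str__(self):
        return 42


print(text('abc'))
print(text(Temperature(21)))
try:
    text(Broken())
except TypeError as exc:
    print('TypeError: %s' % exc)
```

The point of interest is the first branch: a string argument is returned as is, so a unicode instance stays unicode and a str instance stays str.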
The %s format specifier for str objects would be changed to call
text() on the argument. Currently it calls str() unless the
argument is a unicode instance (in which case the object is
substituted as is and the % operation returns a unicode instance).
The following function would be added to the C API and would be the
equivalent of the text() function:
PyObject *PyObject_Text(PyObject *o);
A reference implementation is available on Sourceforge [1] as a
Revised PEP 349: Allow str() to return unicode strings
[Please mail followups to [EMAIL PROTECTED]]
The PEP has been rewritten based on a suggestion by Guido to change
str() rather than adding a new built-in function. Based on my
testing, I believe the idea is feasible. It would be helpful if
people could test the patched Python with their own applications and
report any incompatibilities.
PEP: 349
Title: Allow str() to return unicode strings
Version: $Revision: 1.3 $
Last-Modified: $Date: 2005/08/22 21:12:08 $
Author: Neil Schemenauer <[EMAIL PROTECTED]>
Status: Draft
Type: Standards Track
Content-Type: text/plain
Created: 02-Aug-2005
Post-History: 06-Aug-2005
Python-Version: 2.5
Abstract
This PEP proposes to change the str() built-in function so that it
can return unicode strings. This change would make it easier to
write code that works with either string type and would also make
some existing code handle unicode strings. The C function
PyObject_Str() would remain unchanged and the function
PyString_New() would be added instead.
Rationale
Python has had a Unicode string type for some time now but use of
it is not yet widespread. There is a large amount of Python code
that assumes that string data is represented as str instances.
The long term plan for Python is to phase out the str type and use
unicode for all string data. Clearly, a smooth migration path
must be provided.
We need to upgrade existing libraries, written for str instances,
so that they are capable of operating in an all-unicode string world.
We can't change to an all-unicode world until all essential
libraries are capable of it. Upgrading the libraries in one
shot does not seem feasible. A more realistic strategy is to
individually make the libraries capable of operating on unicode
strings while preserving their current all-str environment
behaviour.
First, we need to be able to write code that can accept unicode
instances without attempting to coerce them to str instances. Let
us label such code as Unicode-safe. Unicode-safe libraries can be
used in an all-unicode world.
Second, we need to be able to write code that, when provided only
str instances, will not create unicode results. Let us label such
code as str-stable. Libraries that are str-stable can be used by
libraries and applications that are not yet Unicode-safe.
Sometimes it is simple to write code that is both str-stable and
Unicode-safe. For example, the following function just works:
    def appendx(s):
        return s + 'x'
That's not too surprising since the unicode type is designed to
make the task easier. The principle is that when str and unicode
instances meet, the result is a unicode instance. One notable
difficulty arises when code requires a string representation of an
object, an operation traditionally accomplished by using the str()
built-in function.
Using the current str() function makes the code not Unicode-safe.
Replacing a str() call with a unicode() call makes the code not
str-stable. Changing str() so that it could return unicode
instances would solve this problem. As a further benefit, some code
that is currently not Unicode-safe because it uses str() would
become Unicode-safe.
Specification
A Python implementation of the str() built-in follows:
    def str(s):
        """Return a nice string representation of the object. The
        return value is a str or unicode instance.
        """
        if type(s) is str or type(s) is unicode:
            return s
        r = s.__str__()
        if not isinstance(r, (str, unicode)):
            raise TypeError('__str__ returned non-string')
        return r
The following function would be added to the C API and would be the
equivalent to the str() built-in (ideally it would be called PyObject_Str,
but changing that function could cause a massive number of
compatibility problems):
PyObject *PyString_New(PyObject *);
A reference implementation is available on Sourceforge [1] as a
patch.
Backwards Compatibility
Some code may require that str() returns a str instance. In the
standard library, only one such case has been found so far. The
function email.header_decode() requires a str instance and the
email.Header.decode_header() function tries to ensure this by
calling str() on its argument. The code was fixed by changing
the line "header = str(header)" to:
    if isinstance(header, unicode):
        header = header.encode('ascii')
Whether this is truly a bug is questionable since decode_header()
really operates on byte strings, not character strings. Code that
passes it a unicode instance could itself be considered
Re: To the python-list moderator
Fredrik Lundh <[EMAIL PROTECTED]> wrote:
> Terry Hancock wrote:
>
>> I got one of these too, recently. Maybe somebody is turning up the
>> screws to get rid of spam that's been appearing on the list?

In the future, sending a message to [EMAIL PROTECTED] is suggested
rather than posting only to python-list. What's happening is that
Spambayes is marking the message as UNSURE. The message that mailman
sends to the sender is unfortunate. The "Message has a suspicious
header" notice is misleading because the user did not have any header
in their message that caused it to be held (at least normally not).

I'm not sure why many legitimate messages are being flagged as UNSURE.
I'll look into it.

> I've been getting these about once a day lately. at first, I suspected
> some kind of "you're posting to quickly"-filter with a manual "okay,
> you're whitelisted for another 24 hours" setup, but it seems to block
> messages mostly by random. and some messages don't seem to get
> through at all. slightly annoying.

Hmm, the message should eventually get through since it ends up
getting moderated by a person. Maybe they are getting overwhelmed and
are making some mistakes.

  Neil
Re: How ahead are you guys in the (Python) real world?
Aahz <[EMAIL PROTECTED]> wrote:
> My company uses 2.2 and 2.3; we hope to drop 2.2 Real Soon Now.

This has been an interesting thread. There has been some discussion
on python-dev about doing another 2.3 bugfix release. Based on the
number of people still using 2.3, it looks to me like there would be
interest.

  Neil
Re: Generators vs. Functions?
Peter Hansen <[EMAIL PROTECTED]> wrote:
> More precisely, the state of the function is *saved* when a yield
> occurs, so you certainly don't *recreate* it from scratch, but merely
> restore the state, and this should definitely be faster than creating it
> from scratch in the first place.

Right. Resuming a generator is faster than calling a function.

  Neil
Re: Generators vs. Functions?
Steven D'Aprano <[EMAIL PROTECTED]> wrote:
> Have you actually measured this, or are you just making a wild
> guess?

I haven't timed it until now but my guess is not so wild. I'm pretty
familiar with the generator implementation (having written the initial
version of it). In Python 2.3, resuming a generator does a small
amount of setup and then calls eval_frame(). Calling a function does
more setup work and then also calls eval_frame().

> Here is my test, using Python 2.3. I've tried to make the test as
> fair as possible, with the same number of name lookups in both
> pieces of test code.

On my machine t4 is faster than t3. Your test is not so fair because
the generator is doing a "while" loop (executing more bytecode
instructions) while the function is just returning a value (one
instruction).

On your machine the function call may be faster due to CPU cache
effects or branch prediction. In any case, the difference you are
trying to measure is extremely small. Try adding some arguments to
the functions (especially keyword arguments).

What your test does show is that the speed difference should not come
into the decision of which construct to use.

  Neil
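For anyone who wants to repeat the comparison, here is a rough timing sketch. It is not the test code from the thread (that code is not quoted above). The names plain_function and endless_generator are made up for illustration, and it targets a modern Python 3 (timeit's globals argument, the generator's __next__ method) rather than 2.3. The function takes a keyword argument, as suggested above.

```python
import timeit


def plain_function(a, b=2):
    # A single return, so the body itself costs as little as possible.
    return a


def endless_generator():
    # The loop adds a few bytecode instructions to every resume.
    while True:
        yield 1


def time_function_calls(number):
    """Time `number` calls that pass one keyword argument."""
    return timeit.timeit('f(1, b=2)', globals={'f': plain_function},
                         number=number)


def time_generator_resumes(number):
    """Time `number` resumes of one already-created generator."""
    resume = endless_generator().__next__
    return timeit.timeit('resume()', globals={'resume': resume},
                         number=number)


calls = 100000
print('function calls:    %.4f s' % time_function_calls(calls))
print('generator resumes: %.4f s' % time_generator_resumes(calls))
```

Results will vary by machine and interpreter version, and the gap is small either way, so numbers from a run like this say little about which construct to choose.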
Re: Spam avoidance
Tim Peters <[EMAIL PROTECTED]> wrote:
> [Douglas Alan]
>>> I've noticed that there is little to no spam in comp.lang.python
>>> and am wondering how this is accomplished.
>
> [Skip Montanaro]
>> Most mailing lists which originate on mail.python.org have SpamBayes
>> filtering in front of them.
>
> BTW, python.org uses other gimmicks too, right? For example, I think
> Greg Ward set up some other gimmicks to weed out obvious viruses.

I'm mostly the guilty party at the moment. Incoming mail on
mail.python.org goes through an SMTP server implemented in Python.
The server uses SpamBayes to filter spam. We disallow attachments
with executable filenames (e.g. .scr). That kills almost all virus
mail. We use a number of realtime blackhole lists; they also block
quite a lot of virus junk and some spam. There is a set of manually
maintained message patterns; those kill some annoying junk that's hard
to block in other ways. We do greylisting (two different kinds,
actually). Some IP addresses get blackholed using iptables (e.g.
zombie machines blasting out virus junk).

If SpamBayes is unsure about a message to a list then it gets held for
moderation. I suspect there are people working behind the scenes to
clean up the NNTP feed.

The short answer to Douglas's question: good tools and a fair amount of
elbow grease. :-)

  Neil
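The attachment rule mentioned above is easy to picture in code. What follows is only a sketch using the standard email package, not the actual mail.python.org server code. The extension list beyond ".scr" is an assumption, and has_executable_attachment and build_message are hypothetical names.

```python
import os
from email.message import EmailMessage

# Assumption: an illustrative list; the post above only names ".scr".
BLOCKED_EXTENSIONS = {'.scr', '.exe', '.pif', '.bat'}


def has_executable_attachment(msg):
    """Return True if any MIME part carries a blocked filename."""
    for part in msg.walk():
        filename = part.get_filename()
        if filename is None:
            continue  # body parts and unnamed parts are not checked
        extension = os.path.splitext(filename)[1].lower()
        if extension in BLOCKED_EXTENSIONS:
            return True
    return False


def build_message(attachment_name):
    """Build a small message with one attachment (hypothetical helper)."""
    msg = EmailMessage()
    msg['Subject'] = 'hello'
    msg.set_content('see attached')
    msg.add_attachment(b'payload', maintype='application',
                       subtype='octet-stream', filename=attachment_name)
    return msg


print(has_executable_attachment(build_message('photo.scr')))
print(has_executable_attachment(build_message('notes.txt')))
```

A check like this only looks at the declared filename, which is why it catches mass-mailed viruses cheaply but has to be combined with the other measures described in the post.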
