[Tutor] smtplib problems ?

2005-11-21 Thread dave
Hi all,

I am not sure if this is a Python problem or not, but here goes.

I wrote the following small script to email my boss 'email_addr' a rather 
important reminder and also send a copy to myself 'email_addr2' several times 
a day. 




#!/usr/bin/env python
# -*- coding: iso8859_1 -*- 


from smtplib import SMTP
from time import sleep
import sys


email_SMTP = 'mail.pusspaws.net'
email_addr = '[EMAIL PROTECTED]'
email_addr2 = '[EMAIL PROTECTED]'


def email_remind():
  

    for trys in xrange(10):
        try:
                    
            mail_server = SMTP(email_SMTP)
            
            # email the office with a full email
            blog="""\n\nHi,\n\nHi,This is just a reminder, can you sort me out 
a date for my re-grade exam as soon as possible.
I have tied an email script into my cron (time scheduling) daemon to remind 
you an ever increasing number of times a day ;) - Got to love 
Linux\n\nCheers\n\nDave"""
            
            msg = ('Subject: Friendly reminder :)\r\nFrom: Dave Selby\r\nTo: '+\
            email_addr+'\r\n\r\n'+blog+'\r\n\r\n')
            
            mail_server.sendmail('Friendly reminder :)', email_addr, msg)
            mail_server.sendmail('Friendly reminder :)', email_addr2, msg)
            
            mail_server.quit()
            
            # If we get to here, all is well, drop out of the loop
            break
            
        except:
            print 'Mailing error ... Re-trying ... '+str(trys+1)+' of 10\n'
            sleep(300)
            
    if trys==9:
        raise 'Mail Failure\n'+str(sys.exc_type)+'\n'+str(sys.exc_value)
    
email_remind()





It does the job (regrade exam now booked :)) but it succeeds in sending emails 
only about 50% of the time. The other 50% I get ...




Mailing error ... Re-trying ... 1 of 10

Mailing error ... Re-trying ... 2 of 10

Mailing error ... Re-trying ... 3 of 10

Mailing error ... Re-trying ... 4 of 10

Mailing error ... Re-trying ... 5 of 10

Mailing error ... Re-trying ... 6 of 10

Mailing error ... Re-trying ... 7 of 10

Mailing error ... Re-trying ... 8 of 10

Mailing error ... Re-trying ... 9 of 10

Mailing error ... Re-trying ... 10 of 10

Traceback (most recent call last):
  File "/home/dave/my files/andrew_torture/email_remind.py", line 49, in ?
    email_remind()
  File "/home/dave/my files/andrew_torture/email_remind.py", line 43, in email_remind
    raise 'Mail Failure\n'+str(sys.exc_type)+'\n'+str(sys.exc_value)
Mail Failure
smtplib.SMTPRecipientsRefused
{'[EMAIL PROTECTED]': (553, "sorry, that domain isn't in my list of allowed rcpthosts (#5.7.1)")}



I do not know if this is me not using the SMTP function correctly (I have 
noticed that when the emails do arrive, kmail cannot work out when they were 
sent), or if the problem lies elsewhere.

Any suggestions gratefully received

Cheers

Dave
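
A note on the script above: SMTP.sendmail() takes the envelope sender
address as its first argument, not the subject line, and a message without a
Date header is a plausible reason for kmail being unable to tell when it was
sent. The 553 "rcpthosts" reply usually means the mail server is refusing to
relay for that recipient domain, which the script itself may not be able to
fix. What follows is only a minimal sketch, in Python 3, of building the
message with the standard email package; the helper name build_reminder and
the example.org addresses are placeholders, not taken from the original post.

```python
from email.message import EmailMessage
from email.utils import formatdate


def build_reminder(sender, recipients, body):
    """Build a reminder with Subject, From, To and Date headers set."""
    msg = EmailMessage()
    msg['Subject'] = 'Friendly reminder :)'
    msg['From'] = sender
    msg['To'] = ', '.join(recipients)
    # An explicit Date header lets the mail client show when it was sent.
    msg['Date'] = formatdate(localtime=True)
    msg.set_content(body)
    return msg


# Placeholder addresses; a real sender address goes here.
msg = build_reminder('dave@example.org',
                     ['boss@example.org', 'dave@example.org'],
                     'Hi,\n\nThis is just a reminder.\n\nCheers\n\nDave')

# Sending is left commented out so the sketch runs without a mail server:
#     import smtplib
#     with smtplib.SMTP('mail.pusspaws.net') as server:
#         server.send_message(msg)
```

send_message() reads the sender and recipients from the headers, so the
subject can no longer end up in the sender slot by accident.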
___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


[Tutor] Don't understand this class/constructor call syntax

2011-07-22 Thread dave
Hello,

I'm trying to work on GNU Radio and having trouble understanding some of the
Python code.  I have a C/C++ coding background.  I'm looking at the
ieee802.15.4 code found on CGRAN.  It's about 4 years old and runs but doesn't
function anymore so I'm trying to fully understand it to fix it.

In one file (a file in src/example called cc2420_txtest.py) I have the
following line from a constructor for:

class transmit_path(gr.top_block)
...
...
...
self.packet_transmitter = ieee802_15_4_pkt.ieee802_15_4_mod_pkts(self,
spb=self._spb, msgq_limit=2)



Now in the src/python directory for this project I have ieee802_15_4pkt.py
which has the following class:



class ieee802_15_4_mod_pkts(gr.hier_block2):
    """
    IEEE 802.15.4 modulator that is a GNU Radio source.
    Send packets by calling send_pkt
    """
    def __init__(self, pad_for_usrp=True, *args, **kwargs):



What I don't understand is the call to the constructor and the constructor
definition.  Since it's using a number of advanced features, I'm having
trouble looking it all up in documentation.

What does it mean to call with spb=self._spb?  In the example file, spb is set
to 2 and so is self._spb.  Is it a sort of pass by reference like C while
also assigning a value? Why the ** on kwargs then, as if it were a matrix?

(And does anyone have any idea what kwargs are, as opposed to args?)

I'm uncertain about the first argument, but I guess it must be the
transmit_path object passed in place of the usually implicit self...  I'm just
not sure how Python figures out that it's not pad_for_usrp... magic I guess!


Thanks for your help,
Dave


Re: [Tutor] Don't understand this class/constructor call syntax

2011-07-23 Thread dave
Thank you for the two explanations.  I think I have a good idea of what is
going on now with the arguments and keyword arguments.

My only remaining question is the pad_for_usrp argument.  The default value is
True so I thought it was a boolean and couldn't have anything to do with the
"self" that was passed to it.  However, I can probably puzzle
that out by looking at how it's used in the code.  

If you want to look at the full code and make any more comments, the code tree
is here: https://www.cgran.org/browser/projects/ucla_zigbee_phy/trunk/src

The example I'm looking at is (I quoted line 56):
https://www.cgran.org/browser/projects/ucla_zigbee_phy/trunk/src/examples/cc2420_txtest.py

The example relies on two files.  This one:
https://www.cgran.org/browser/projects/ucla_zigbee_phy/trunk/src/python/ieee802_15_4.py

And this one (I quoted the class at line 138):
https://www.cgran.org/browser/projects/ucla_zigbee_phy/trunk/src/python/ieee802_15_4_pkt.py

Thanks,
Dave


On Sat, 23 Jul 2011 13:09:07 +1000, Steven D'Aprano wrote
> dave wrote:
> 
> > class transmit_path(gr.top_block)
> [...]
> > self.packet_transmitter = 
> > ieee802_15_4_pkt.ieee802_15_4_mod_pkts(self,
> >   spb=self._spb, msgq_limit=2)
> 
> This calls the ieee802_15_4_mod_pkts initializer (not a constructor -
> - see below) with one positional argument and two keyword arguments.
> 
> The positional argument is "self", that is, the transmit_path instance.
> 
> The keyword arguments are called spb and msgq_limit; spb is set to 
> the value of self._spb, and msgq_limit is set to 2.
> 
> The reason I say this is an initializer and not a constructor is 
> that Python treats the two as different. The constructor that 
> creates the instance is called __new__ not __init__. When __init__ 
> is called, the instance has already been constructed, and is now 
> being initialized. The reason for this is mostly historical, 
> although it is useful.
> 
> (Disclaimer -- so called "old style" or "classic" classes don't have 
> a __new__ method, and you cannot customize the actual creation of 
> the instance, only the initialization.)
> 
> Looking at the ieee802_15_4_mod_pkts initializer:
> 
> > class ieee802_15_4_mod_pkts(gr.hier_block2):
> >     def __init__(self, pad_for_usrp=True, *args, **kwargs):
> 
> As a method, this takes the instance as first argument (called 
> "self"), plus one named argument "pad_for_usrp", an arbitrary number 
> of unnamed positional arguments collected into "args", and an 
> arbitrary number of named keyword arguments collected into "kwargs".
> 
> (Note that args and kwargs are conventions. You could call them 
> anything you like -- the "magic", so to speak, comes from the 
> leading * and ** and not from the names.)
> 
> Given the call:
> 
> ieee802_15_4_mod_pkts(self, spb=self._spb, msgq_limit=2)
> 
> this corresponds to the initializer receiving arguments:
> 
> self = the freshly created ieee802_15_4_mod_pkts instance
> pad_for_usrp = the transmit_path instance doing the calling
> args = an empty tuple (no positional arguments collected)
> kwargs = a dictionary of keyword arguments
>   {'spb': value of _spb of the transmit_path instance,
>    'msgq_limit': 2}
> 
> > What I don't understand is the call to the constructor and the constructor
> > definition.  Since it's using a number of advanced features, I'm having
> > trouble looking it all up in documentation.
> > 
> > What does it mean to call with spb=self._spb?  In the example file, spb is
> > set to 2 and so is self._spb.  Is it a sort of pass by reference like C while
> > also assigning a value? Why the  ** on kwargs then? as if it is a matrix
> 
> No, this is nothing to do with pass by reference, or pass by value 
> either. This often confuses people coming to Python from some other 
> languages, and if it isn't a FAQ it ought to be. You can read one of 
> my posts on this here:
> 
> http://www.mail-archive.com/tutor%40python.org/msg46612.html
> 
> and the Wikipedia article:
> 
> http://en.wikipedia.org/wiki/Evaluation_strategy
> 
> What it means is that the method being called (in this case, 
> ieee802_15_4_mod_pkts.__init__) sees a keyword argument called 
> "spb". This keyword argument has name "spb", and value whatever 
> self._spb has at the time it is called.
> 
> When Python allocates arguments to the named parameters in a method 
> or function, its basic process is roughly something like this:
> 
> (1) for methods, automati
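
The quoted reply is cut off in the archive at this point. The binding it
describes for this particular call can be checked with a small stand-in
class. This is only a sketch: Modulator and Caller are hypothetical names
that mimic the shape of ieee802_15_4_mod_pkts and transmit_path, and no GNU
Radio code is involved.

```python
class Modulator(object):
    # Same parameter list as the initializer discussed in the thread.
    def __init__(self, pad_for_usrp=True, *args, **kwargs):
        self.pad_for_usrp = pad_for_usrp   # first positional argument lands here
        self.args = args                   # any further positional arguments
        self.kwargs = kwargs               # every keyword argument without a named slot


class Caller(object):
    def __init__(self):
        self._spb = 2
        # Same call shape as in cc2420_txtest.py: one positional, two keyword.
        self.packet_transmitter = Modulator(self, spb=self._spb, msgq_limit=2)


caller = Caller()
mod = caller.packet_transmitter

print(mod.pad_for_usrp is caller)   # the Caller instance filled pad_for_usrp
print(mod.args)                     # nothing left over for *args
print(mod.kwargs)                   # spb and msgq_limit collected by **kwargs
```

The implicit self of Modulator.__init__ is always the freshly created
Modulator instance, so the explicitly passed object can only go to the next
parameter in line.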

Re: [Tutor] Don't understand this class/constructor call syntax

2011-07-24 Thread dave
I was dimly aware of the functioning of booleans, but I see now that it
doesn't specify an actual boolean type.  Still, the code confuses me.  Is the
usage of pad_for_usrp consistent with it being treated as a boolean?  Why
would the entire self reference be transmitted then?

Example code again:

class transmit_path(gr.top_block)
[...]
self.packet_transmitter = ieee802_15_4_pkt.ieee802_15_4_mod_pkts(self,
spb=self._spb, msgq_limit=2)

The class from the ieee802_15_4_pkt module:

class ieee802_15_4_mod_pkts(gr.hier_block2):
    """
    IEEE 802.15.4 modulator that is a GNU Radio source.

    Send packets by calling send_pkt
    """
    def __init__(self, pad_for_usrp=True, *args, **kwargs):
        """
        Hierarchical block for the 802_15_4 O-QPSK  modulation.

        Packets to be sent are enqueued by calling send_pkt.
        The output is the complex modulated signal at baseband.

        @param msgq_limit: maximum number of messages in message queue
        @type msgq_limit: int
        @param pad_for_usrp: If true, packets are padded such that they end up
        a multiple of 128 samples

        See 802_15_4_mod for remaining parameters
        """
        try:
            self.msgq_limit = kwargs.pop('msgq_limit')
        except KeyError:
            pass

        gr.hier_block2.__init__(self, "ieee802_15_4_mod_pkts",
                                gr.io_signature(0, 0, 0), # Input
                                gr.io_signature(1, 1, gr.sizeof_gr_complex))  # Output
        self.pad_for_usrp = pad_for_usrp

        # accepts messages from the outside world
        self.pkt_input = gr.message_source(gr.sizeof_char, self.msgq_limit)
        self.ieee802_15_4_mod = ieee802_15_4.ieee802_15_4_mod(self, *args,
                                                              **kwargs)
        self.connect(self.pkt_input, self.ieee802_15_4_mod, self)

    def send_pkt(self, seqNr, addressInfo, payload='', eof=False):
        """
        Send the payload.

        @param seqNr: sequence number of packet
        @type seqNr: byte
        @param addressInfo: address information for packet
        @type addressInfo: string
        @param payload: data to send
        @type payload: string
        """

        if eof:
            msg = gr.message(1) # tell self.pkt_input we're not sending any more packets
        else:
            FCF = make_FCF()

            pkt = make_ieee802_15_4_packet(FCF,
                                           seqNr,
                                           addressInfo,
                                           payload,
                                           self.pad_for_usrp)
            #print "pkt =", packet_utils.string_to_hex_list(pkt), len(pkt)
            msg = gr.message_from_string(pkt)

        #ERROR OCCURS HERE (a few functions in while inserting onto the msg queue)
        self.pkt_input.msgq().insert_tail(msg)
--- End of Forwarded Message ---



Re: [Tutor] Don't understand this class/constructor call syntax

2011-07-25 Thread dave
Is it even possible to replace the implicit self argument of the initializer
by passing something else?  If so, what would be the syntax?

If you want to look at the code, it's all here:

https://www.cgran.org/browser/projects/ucla_zigbee_phy/trunk/src

The cc2420_txtest.py is in ./examples and the corresponding ieee802_15_4*.py
files are in ./python (lib contains C++ code accessed via SWIG).

I can probably puzzle it out with this info eventually, but if you want to
comment further feel free.

Thanks for your help

Dave



On Mon, 25 Jul 2011 10:26:11 +1000, Steven D'Aprano wrote
> dave wrote:
> > I was dimly aware of the functioning of booleans, but I see now that it
> > doesn't specify an actual boolean type.  Still, the code confuses me.  Is
> > the usage of pad_for_usrp consistent with it being treated as a boolean?
> > Why would the entire self reference be transmitted then?
> 
> Parameter passing in Python is fast -- the object (which may be 
> large) is not copied unless you explicitly make a copy. So it is no 
> faster to pass a big, complex object than a lightweight object like 
> True or False.
> 
> (Implementation note: in CPython, the main Python implementation 
> which you almost certainly are using, objects live in the heap and 
> are passed around as pointers.)
> 
> The code you show isn't very illuminating as far as pad_for_usrp 
> goes. All that happens is that it gets stored as an attribute, then 
> later gets passed on again to another function or class:
> 
> > class ieee802_15_4_mod_pkts(gr.hier_block2):
> ...
> > self.pad_for_usrp = pad_for_usrp
> 
> > def send_pkt(self, seqNr, addressInfo, payload='', eof=False):
> ...
> > pkt = make_ieee802_15_4_packet(FCF,
> >                                seqNr,
> >                                addressInfo,
> >                                payload,
> >                                self.pad_for_usrp)
> 
> So it's *consistent* with being used as a bool, or anything else for 
> that matter! I expect that make_ieee802_15_4_packet may be the thing 
> that actually does something useful with pad_for_usrp.
> 
> Another thing to look for is the transmit_path class itself. If it 
> has a __len__, __bool__ or __nonzero__ method, then it has 
> customized the way it appears as a boolean. If it has none of those 
> methods, then it will always be considered true-valued, and I can't 
> imagine why it is being used as pad_for_usrp instead of just passing 
> True.
> 
> But without looking at the rest of the code, I can't really tell for 
> sure.
> 
> -- 
> Steven
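
The truth-value rules mentioned in the quoted reply can be tried out with a
few toy classes. This is only a sketch: Plain, Empty and Switch are
hypothetical names, and the __nonzero__ alias is there because Python 2 uses
that spelling where Python 3 uses __bool__.

```python
class Plain(object):
    """No __len__ or __bool__: instances always count as true."""
    pass


class Empty(object):
    """A __len__ that returns 0 makes the instance count as false."""
    def __len__(self):
        return 0


class Switch(object):
    """__bool__ (Python 3) / __nonzero__ (Python 2) decides directly."""
    def __init__(self, on):
        self.on = on

    def __bool__(self):
        return self.on

    __nonzero__ = __bool__   # Python 2 spelling of the same hook


for obj in (Plain(), Empty(), Switch(True), Switch(False)):
    print(type(obj).__name__, bool(obj))
```

So an object passed where a flag is expected behaves like True unless its
class defines one of these methods.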
> 



Re: [Tutor] Don't understand this class/constructor call syntax

2011-07-28 Thread dave
Yes that is roughly what I meant.  GNU Radio uses a lot of sub-classing--if
this is the correct term.  For example all blocks inherit hier_block2 which
has methods such as connect for connecting two blocks together.  I wondered if
the instance named self wasn't being passed as a replacement for the implicit
self parameter rather than in place of the pad_for_usrp=True parameter.

Steven D'Aprano also replied on this subject and if I understand him, then it
would require a special syntax that is not present in the GNU Radio code.

Dave





On Mon, 25 Jul 2011 08:08:54 +0100, Alan Gauld wrote
> dave wrote:
> > Is it even possible to replace the implicit self argument of the initializer
> > by passing something else?  If so, what would be the syntax.
> 
> I'm not sure this is what you mean but...
> 
> When you call a method on an object like:
> 
> class MyClass:
>     def aMethod(self,spam): pass
> 
> anObject = MyClass()
> anObject.aMethod(42)
> 
> You could replace the last line with:
> 
> MyClass.aMethod(anObject, 42)
> 
> This explicitly specifies the value of self in aMethod()
> 
> So you could in theory pass any object into the method,
> although in most cases it would result in an error.
> 
> Is that what you mean?
> 
> Alan G.
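
The two call forms in the quoted reply can be compared directly. This is a
small sketch in which aMethod returns its arguments, a change made here
purely so the result can be inspected.

```python
class MyClass(object):
    def aMethod(self, spam):
        # Return what was received so both call styles can be compared.
        return (self, spam)


anObject = MyClass()

bound = anObject.aMethod(42)               # self is supplied implicitly
explicit = MyClass.aMethod(anObject, 42)   # self is supplied by hand

print(bound == explicit)
print(explicit[0] is anObject)
```

This only applies to calling a method through the class. Calling the class
itself, as in MyClass(something), always creates a new instance to act as
self, so the first explicit argument goes to the next parameter. In Python 2
the explicit form also insists that the first argument is an instance of
MyClass; Python 3 does not check.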
> 



[Tutor] iPython check if user running script is root user (Linux)

2012-05-31 Thread Dave
Hi. What is the right way to have an iPython script check to see if the
user is currently root?

Here's how I do it in bash scripts:


## CHECK USERNAME PRIVILEGE


if [ $(id -u) != "0" ];then
   echo "This script must be run as root."
   exit 1
fi
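
A Python counterpart of the bash check above might look like the following
sketch. os.geteuid() is only available on Unix-like systems, and the
optional euid parameter is an addition made here so the check can be
exercised without actually being root.

```python
import os
import sys


def is_root(euid=None):
    """Return True when the effective user id is 0 (root). Unix only."""
    if euid is None:
        euid = os.geteuid()
    return euid == 0


def require_root():
    """Exit with a message and a non-zero status unless running as root."""
    if not is_root():
        sys.exit("This script must be run as root.")


# Typical use at the top of a script:
#     require_root()
```

Passing a string to sys.exit() prints it to stderr and exits with status 1,
which matches the "echo ... ; exit 1" pair in the bash version.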


Re: [Tutor] iPython check if user running script is root user (Linux)

2012-05-31 Thread Dave
Thanks. I like the integer-based option.

Since this is my first question to the list, is it appropriate to reply
with a "thanks, that solved it" or is that considered unnecessary?

On Thu, May 31, 2012 at 6:30 PM, Emile van Sebille  wrote:

> On 5/31/2012 3:21 PM Dave said...
>
>> Hi. What is the right way to have an iPython script check to see if the
>> user is currently root?
>>
>
> Googling for "python check if user is root" yields the answer:
>
>
> import os, sys
>
> # if not root...kick out
> if not os.geteuid()==0:
>     sys.exit("\nOnly root can run this script\n")
>
>
> See http://code.activestate.com/recipes/299410-root-access-required-to-run-a-script/
>
> Emile
>
>


[Tutor] special attributes naming confusion

2012-06-06 Thread Dave
I was reading some tutorial material on creating iterators. It shows the
following example implementation of an iterator:

class Reverse:
    """Iterator for looping over a sequence backwards."""
    def __init__(self, data):
        self.data = data
        self.index = len(data)
    def __iter__(self):
        return self
    def next(self):
        if self.index == 0:
            raise StopIteration
        self.index = self.index - 1
        return self.data[self.index]


My question is how was I supposed to know that the function I call using
the name iter() is implemented using the name __iter__()?

Is there a rule that describes when I would implement an attribute name
with leading and trailing double underscores, and then call it without
those underscores? How many names like this exist in Python? Are these
special cases or is there a general rule that leading and trailing double
underscores get dropped when calling functions that were implemented with
these names? I'm trying to understand the big picture as far as how Python
works when it comes to this situation. Thanks.
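
For what it is worth, there is no rule that strips the underscores. Each
built-in function or piece of syntax looks for one specific special method:
iter(x) uses __iter__, len(x) uses __len__, and a for loop calls iter() once
and then asks for the next item repeatedly. The sketch below is the tutorial
class adapted so it runs on Python 3, where the iterator method is spelled
__next__; the next alias is added here to keep the Python 2 spelling working
too.

```python
class Reverse(object):
    """Iterator for looping over a sequence backwards."""
    def __init__(self, data):
        self.data = data
        self.index = len(data)

    def __iter__(self):          # used by iter(obj) and by for loops
        return self

    def __next__(self):          # used by next(obj) in Python 3
        if self.index == 0:
            raise StopIteration
        self.index = self.index - 1
        return self.data[self.index]

    next = __next__              # Python 2 looks for the name "next"


it = iter(Reverse('spam'))       # iter() calls Reverse.__iter__
first = next(it)                 # next() calls Reverse.__next__
rest = list(it)                  # list() keeps asking until StopIteration

print(first, rest)
```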


Re: [Tutor] special attributes naming confusion

2012-06-06 Thread Dave
On Wed, Jun 6, 2012 at 3:30 PM, Prasad, Ramit wrote:

> > My question is how was I supposed to know that the function I call using
> > the name iter() is implemented using the name __iter__()?
> >
> > Is there a rule that describes when I would implement an attribute name
> > with leading and trailing double underscores, and then call it without
> > those underscores? How many names like this exist in Python? Are these
> > special cases or is there a general rule that leading and trailing double
> > underscores get dropped when calling functions that were implemented with
> > these names? I'm trying to understand the big picture as far as how Python
> > works when it comes to this situation. Thanks.
>
> They are listed here
> http://docs.python.org/reference/datamodel.html#specialnames
>
> Ramit
>
>
Thank you. That's a good start. It appears to answer half my question. It
tells me about special names like __new__, __init__, etc. And there is also
mention of __iter__(self) on that page too. But I don't see any discussion
of the convention regarding mappings from those names to the typical names
used to call the functions in code. Unless I'm overlooking it, that page
doesn't explain how to generalize the above example where calling the
function by the name iter() actually calls the implementation named
__iter__(). Are the leading and trailing double underscores simply dropped
always? (It doesn't seem that simple because functions like __init__ are
called behind the scenes.)


Re: [Tutor] special attributes naming confusion

2012-06-06 Thread Dave
On Wed, Jun 6, 2012 at 3:40 PM, Mark Lawrence wrote:

> On 06/06/2012 20:19, Dave wrote:
>
>> I was reading some tutorial material on creating iterators. It shows the
>> following example implementation of an iterator:
>>
>> class Reverse:
>>     """Iterator for looping over a sequence backwards."""
>>     def __init__(self, data):
>>         self.data = data
>>         self.index = len(data)
>>     def __iter__(self):
>>         return self
>>     def next(self):
>>         if self.index == 0:
>>             raise StopIteration
>>         self.index = self.index - 1
>>         return self.data[self.index]
>>
>>
>> My question is how was I supposed to know that the function I call using
>> the name iter() is implemented using the name __iter__()?
>>
>> Is there a rule that describes when I would implement an attribute name
>> with leading and trailing double underscores, and then call it without
>> those underscores? How many names like this exist in Python? Are these
>> special cases or is there a general rule that leading and trailing double
>> underscores get dropped when calling functions that were implemented with
>> these names? I'm trying to understand the big picture as far as how Python
>> works when it comes to this situation. Thanks.
>>
>>
> Try this to start with
> http://docs.python.org/reference/datamodel.html#special-method-names.
> Note this is for Python 2.7.3, there may be differences in Python 3.x.
>
> --
>

Actually, I think I'm getting it now... as I read more of this page I see
that there is no single generalization. These are indeed all special cases.

But the documentation does appear to be incomplete. It leaves out the
mapping to the name or symbol that should be used to call the special
function in some cases. In particular, in the case of __iter__(self), which
is one of the first ones I looked at, it doesn't mention that this is
usually called via iter(). It does mention how it would be called for
mapping containers, however (i.e., iterkeys()).

object.__iter__(self)

This method is called when an iterator is required for a container. This
method should return a new iterator object that can iterate over all the
objects in the container. For mappings, it should iterate over the keys of
the container, and should also be made available as the method iterkeys().

But as I read more, I see that much of the documentation does mention how
these special method names are called. For example:

object.__lt__(self, other)
object.__le__(self, other)
object.__eq__(self, other)
object.__ne__(self, other)
object.__gt__(self, other)
object.__ge__(self, other)

New in version 2.1.

These are the so-called “rich comparison” methods, and are called for
comparison operators in preference to __cmp__() below. The correspondence
between operator symbols and method names is as follows: x<y calls
x.__lt__(y), x<=y calls x.__le__(y), x==y calls x.__eq__(y), x!=y and x<>y
call x.__ne__(y), x>y calls x.__gt__(y), and x>=y calls x.__ge__(y).

I think there is enough info at this page to answer my question as well as
I need it answered right now. Thanks.
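
The operator-to-method correspondence quoted above can be seen with a tiny
class. This is only a sketch; Version is a hypothetical example class and
defines just __lt__ and __eq__.

```python
class Version(object):
    def __init__(self, major, minor):
        self.key = (major, minor)

    def __lt__(self, other):     # invoked by: self < other
        return self.key < other.key

    def __eq__(self, other):     # invoked by: self == other
        return self.key == other.key


old = Version(2, 7)
new = Version(3, 0)

print(old < new)             # operator form
print(old.__lt__(new))       # the call the operator makes
print(new > old)             # no __gt__ here, so Python tries old.__lt__(new)
print(old == Version(2, 7))
```

The comparison methods are normally written once in the class and then used
only through the operators; calling them by their double-underscore names is
possible but rarely needed.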


Re: [Tutor] special attributes naming confusion

2012-06-06 Thread Dave
I'm not sure where this comment belongs, but I want to share my perspective
on the documentation of these special method names. In the following
section there is an inconsistency which could be confusing to someone just
learning Python (e.g., me).

In the sentence on implementing custom mapping objects, the recommended
method names are listed as the short calling names: keys(), values(), items(),
etc.

But, in contrast, in the sentence on implementing sequence types, the
recommended method names are listed as the double underscore internal
implementation names: __add__(), __radd__(), __iadd__(), __mul__(), etc.

Here's the section of the documentation with this inconsistency. I think it
would help to use one or the other of these pairs (calling name vs.
internal implementation name) consistently in this section.

3.4.6. Emulating container types
<http://docs.python.org/reference/datamodel.html#emulating-container-types>

The following methods can be defined to implement container objects.
Containers usually are sequences (such as lists or tuples) or mappings
(like dictionaries), but can represent other containers as well. The first
set of methods is used either to emulate a sequence or to emulate a
mapping; the difference is that for a sequence, the allowable keys should
be the integers k for which 0 <= k < N where N is the length of the
sequence, or slice objects, which define a range of items. (For backwards
compatibility, the method __getslice__() (see below) can also be defined to
handle simple, but not extended slices.) It is also recommended that
mappings provide the methods keys(), values(), items(), has_key(), get(),
clear(), setdefault(), iterkeys(), itervalues(), iteritems(), pop(),
popitem(), copy(), and update() behaving similar to those for Python’s
standard dictionary objects. The UserDict module provides a DictMixin class
to help create those methods from a base set of __getitem__(),
__setitem__(), __delitem__(), and keys(). Mutable sequences should provide
methods append(), count(), index(), extend(), insert(), pop(), remove(),
reverse() and sort(), like Python standard list objects. Finally, sequence
types should implement addition (meaning concatenation) and multiplication
(meaning repetition) by defining the methods __add__(), __radd__(),
__iadd__(), __mul__(), __rmul__() and __imul__() described below; they
should not define __coerce__() or other numerical operators. It is
recommended that both mappings and sequences implement the __contains__()
method to allow efficient use of the in operator; for mappings, in should
be equivalent of has_key(); for sequences, it should search through the
values. It is further recommended that both mappings and sequences
implement the __iter__() method to allow efficient iteration through the
container; for mappings, __iter__() should be the same as iterkeys(); for
sequences, it should iterate through the values.

This is in addition to some missing documentation on how some of the
special method names are meant to be called in code.
(See below.)
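
One possible reading of the mixed naming in that section is that keys(),
values() and friends are ordinary methods that callers invoke by exactly
those names, while the double-underscore names are hooks that operators and
built-ins invoke. The sketch below illustrates that reading only; Presets is
a hypothetical class, not something from the documentation.

```python
class Presets(object):
    """A small read-only mapping-like container."""
    def __init__(self, **items):
        self._items = dict(items)

    def keys(self):                  # called by name: p.keys()
        return list(self._items)

    def __getitem__(self, key):      # called by subscription: p[key]
        return self._items[key]

    def __len__(self):               # called by len(p)
        return len(self._items)

    def __contains__(self, key):     # called by: key in p
        return key in self._items

    def __iter__(self):              # called by iter(p) and for loops
        return iter(self._items)


p = Presets(data='/tmp/data', logs='/tmp/logs')

print(sorted(p.keys()))   # plain method, plain name
print(p['logs'])          # __getitem__
print(len(p))             # __len__
print('data' in p)        # __contains__
print(sorted(p))          # __iter__
```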


On Wed, Jun 6, 2012 at 3:59 PM, Dave  wrote:

>
>
> On Wed, Jun 6, 2012 at 3:40 PM, Mark Lawrence wrote:
>
>> On 06/06/2012 20:19, Dave wrote:
>>
>>> I was reading some tutorial material on creating iterators. It shows the
>>> following example implementation of an iterator:
>>>
>>> class Reverse:
>>> """Iterator for looping over a sequence backwards."""
>>> def __init__(self, data):
>>> self.data = data
>>> self.index = len(data)
>>> def __iter__(self):
>>> return self
>>> def next(self):
>>> if self.index == 0:
>>>   

[Tutor] Global presets ?

2004-12-04 Thread Dave S
Hi there,
I have some common data directories, like
/home/dave/mygg/gg1.3/logs
/home/dave/mygg/gg1.3/data
/home/dave/mygg/gg1.3/datacore
/home/dave/mygg/gg1.3/arch_data
which increasing numbers of scripts are accessing. At the beginning of 
each script
I end up putting in declarations like

arch_data_dir='/home/dave/mygg/gg1.3/arch_data'
data_dir='/home/dave/mygg/gg1.3/data'

over & over. This is OK until I want to move a directory.
Somewhere I read about importing a script to define common globals for 
all the scripts that import it.

I tried this, and failed - the variable was only valid for the module, 
to be expected really :)

Can anyone make a suggestion on how to set up common global presets?
Cheers
Dave



Re: [Tutor] Global presets ?

2004-12-04 Thread Dave S
Thanks Guys,
They are both good ways of getting round my problem, I appreciate your 
input & will have a play.

Cheers
Dave
:-) :-) :-) :-)


Re: [Tutor] Global presets ?

2004-12-04 Thread Dave S
Alan Gauld wrote:

> have you considered making the root directory an environment variable?
> That way you can read the value (os.getenv) at the start of the
> script.
> And if you ever need to move the structure you can simply change the
> environment value. It also means different users can use their own
> structures by defining their own environment value...
>
> # File myvars.py
> value1 = 42
> value2 = 'spam'

Got ya so far ..

> #
> # File: prog1.py
> import myvars
> localvar = myvars.value1
> myvars.value2 = 'Alan'

Never thought of setting 'myvars.value2 = 'Alan''  I guess this would 
just set the variable in the myvars namespace since it could not change 
myvars.py itself.

> ##
> #  File prog2.py
> import myvars
> newvar = myvars.value2

With you ...

> print myvars.value1 - 27

Have I misunderstood, should this not be 42 ? Typo or me not understanding ?

> ##
> Does that help?
> Alan G
> Author of the Learn to Program web tutor
> http://www.freenetpages.co.uk/hp/alan.gauld
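
The behaviour being discussed can be reproduced in a single file. This is
only a sketch: the myvars module is built in memory with types.ModuleType as
a stand-in for a real myvars.py, and prog1 and prog2 are functions here
rather than separate scripts.

```python
import sys
import types

# Stand-in for a real myvars.py file, registered so "import myvars" finds it.
myvars = types.ModuleType('myvars')
myvars.value1 = 42
myvars.value2 = 'spam'
sys.modules['myvars'] = myvars


def prog1():
    import myvars
    # Rebinds the name inside the shared module object, not in any file.
    myvars.value2 = 'Alan'


def prog2():
    import myvars
    # Sees prog1's change because both imports return the same module object.
    return myvars.value2, myvars.value1 - 27


prog1()
result = prog2()
print(result)
```

The expression myvars.value1 - 27 is a subtraction, so it gives 15 rather
than 42. The sharing shown here holds within one running Python process;
two scripts started separately each import their own fresh copy of the
module, so a change made by one is not seen by the other.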

 



[Tutor] Accuracy of time.sleep()

2004-12-04 Thread Dave S
OK I may be pushing it,  ;-)
I need a script to sleep from any point to 8:05AM when it needs to 
re-start.

So I calculate the number of seconds with the following 

def secs_till_805():
   # Returns the number of seconds till 8:05AM
  
   secs_5min=5*60
   secs_24hr=24*60*60
   secs_8hr=(8*60*60)+secs_5min
   secs_8hr_24hr=secs_24hr-secs_8hr
  
   hours=int(strftime('%H'))
   mins=int(strftime('%M'))
   secs=int(strftime('%S'))

   sec_left=secs_24hr-((hours*60*60)+(mins*60)+secs)
  
   # If we are before 8:05, then ...
   if sec_left>secs_8hr_24hr:
       return sec_left-secs_8hr_24hr
  
   # If we are after 8:05, then ...
   return sec_left+secs_8hr


Then I ...
sleep(secs_till_805())
I expected the script to re-start 2-3 seconds after 8:05, python 
reloading after a long sleep etc, what I get is the script restarting at 
08:04.55, earlier ???

OK this is not a world stopping problem, more of a curiosity.
Any suggestions
Dave
 


Re: [Tutor] Accuracy of time.sleep()

2004-12-04 Thread Dave S

I expected the script to re-start 2-3 seconds after 8:05, python 
reloading after a long sleep etc, what I get is the script restarting 
at 08:04.55, earlier ???

OK this is not a world stopping problem, more of a curiosity.
Any suggestions
Dave
  
Thanks for your input guys, I have used cron (fcron for me) in the past 
(I'm a Gentoo Linux guy :-) ). I was just trying to keep it all pure 
Python. As I said, it's more of a curiosity.

It must be cumulative error over tens of thousands of seconds. It's a 
bodge (& cron or at are better) but I suppose I could calculate the seconds 
to 8:05, sleep(seconds*0.95), recalculate the seconds to 8:05, then 
sleep(seconds), which should reduce the error to almost zip.

Dave


Re: [Tutor] Accuracy of time.sleep()

2004-12-04 Thread Dave S
Tim Peters wrote:
First, thank you for such a brilliant answer :-)
[Dave S <[EMAIL PROTECTED]>]
 

OK I may be pushing it,  ;-)
   

Yup .
 

I need a script to sleep from any point to 8:05AM when it needs to
re-start.
So I calculate the number of seconds with the following 
def secs_till_805():
  # Returns the number of seconds till 8:05AM
  secs_5min=5*60
  secs_24hr=24*60*60
  secs_8hr=(8*60*60)+secs_5min
  secs_8hr_24hr=secs_24hr-secs_8hr
  hours=int(strftime('%H'))
  mins=int(strftime('%M'))
  secs=int(strftime('%S'))
   

Ouch.  Never try to pick apart the current time by computing it more
than once.  For example, if the time at the start of that block is
just a fraction of a second before 9AM, it's quite possible you'll end
up with hours==8 and mins==secs==0 (because the time is 8:59:59 at the
time you do the "%H" business, but it's 9:00:00 by the time you
get to "%M").  That would throw you off by an hour.  The same kind of
thing can happen a little before the (any) minute changes too.
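
One way to read the clock only once, sketched for Python 3: take a single time.localtime() reading and pull the hour, minute and second out of that one struct. The helper name split_now and its optional argument are assumptions for illustration.

```python
import time


def split_now(t=None):
    """Break ONE clock reading into (hour, minute, second)."""
    if t is None:
        t = time.time()        # the only place the clock is read
    tm = time.localtime(t)     # all three fields come from the same instant
    return tm.tm_hour, tm.tm_min, tm.tm_sec


hours, mins, secs = split_now()
```

Because all three values come from the same reading, they can never straddle a minute or hour boundary.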
 

This is a possibility that had not entered my mind, but also very true. 
Thanks for saving me from that particular black hole.

It's always that 1 in a thousand possibility that sends things south at 
the worst possible moment !

  sec_left=secs_24hr-((hours*60*60)+(mins*60)+secs)
  # If we are before 8:05, then ...
  if sec_left>secs_8hr_24hr:
      return sec_left-secs_8hr_24hr
  # If we are after 8:05, then ...
  return sec_left+secs_8hr
   

Here's a different way, computing current time only once, and using
the datetime module to do all the fiddly work:
def seconds_until(h, m=0, s=0):
   from datetime import datetime, time, timedelta
   target_time = time(h, m, s)
   now = datetime.now()
   target = datetime.combine(now.date(), target_time)
   if target < now:
       target += timedelta(days=1)
   diff = target - now
   return diff.seconds + diff.microseconds / 1e6
This returns seconds as a float, which is good (Python's time.sleep()
can make sense of floats, and sleep for fractional seconds).
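
A variant sketch of the same function for Python 3. The optional now argument is an assumption added here so the arithmetic can be checked against a fixed time, and timedelta.total_seconds() stands in for adding up the seconds and microseconds by hand.

```python
from datetime import datetime, time, timedelta


def seconds_until(h, m=0, s=0, now=None):
    """Seconds from 'now' until the next h:m:s on the local wall clock."""
    if now is None:
        now = datetime.now()                   # read the clock exactly once
    target = datetime.combine(now.date(), time(h, m, s))
    if target < now:
        target += timedelta(days=1)            # already past today, use tomorrow
    return (target - now).total_seconds()


print(seconds_until(8, 5, now=datetime(2004, 12, 4, 8, 0, 0)))  # prints: 300.0
```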
 

OK I'm pydoc'ing & looking at datetime, a module I have not explored 
before. This is stretching me a bit but it's a good way to learn.

Then I ...
sleep(secs_till_805())
   

With the above, you'd do
   time.sleep(seconds_until(8, 5))
instead.
 

I expected the script to re-start 2-3 seconds after 8:05, python
reloading after a long sleep etc, what I get is the script restarting at
08:04.55, earlier ???
   

You'll probably never know why for sure.  Python calls platform C
library gimmicks to sleep, which in turn invoke operating system
facilities.  Understanding the whole story would require that you
understand everything all of those do.
 

If only I had the time ... (no pun intended)
[later]
 

It must be cumulative error over 10s of thousands of seconds.
   

Maybe.
 

It's a bodge (& cron or at are better) but I suppose I could calculate seconds
to 8:05 sleep(seconds*0.95), re calculate secs to 8:05 sleep(seconds)
which should reduce the error to almost zip.
   

That's also a good idea in order to avoid surprises due to crossing
daylight time boundaries (assuming you mean 8:05 according to the
local wall clock).  Here's a function building on the above:
def sleep_until(h, m=0, s=0):
   from time import sleep
   while True:
       delay = seconds_until(h, m, s)
       if delay < 10.0:
           sleep(delay)
           return
       else:
           sleep(delay / 2)
 

That's neat, and more elegant than my ham-fisted attempt, I err might 
borrow it for my code, on a temporary basis you understand ;-)

sleep_secs=secs_till_805()
log('II','ftsed','Sleeping for '+str(sleep_secs)+' Seconds')
# To compensate for the cumulative error over 86,000 secs !
sleep(sleep_secs*0.95)
sleep(secs_till_805())
That is, it cuts the sleep time in half repeatedly, until less than 10
seconds remain.  It can sleep for hours at a time, but as the target
time approaches it wakes up more frequently.  This should keep the
program loaded in memory as the target time gets near.
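
To see what the halving does, here is a sketch (Python 3) that lists the sleep lengths the loop would request. It assumes, for illustration only, that every sleep lasts exactly as long as asked, which is the very thing the real loop does not rely on. The name sleep_schedule is made up for this example.

```python
def sleep_schedule(total, threshold=10.0):
    """List the successive sleep lengths of the halving loop for a given delay."""
    remaining = float(total)
    naps = []
    while remaining >= threshold:
        naps.append(remaining / 2)   # sleep half of what is left
        remaining -= remaining / 2
    naps.append(remaining)           # final short sleep, under the threshold
    return naps


print(sleep_schedule(80))  # prints: [40.0, 20.0, 10.0, 5.0, 5.0]
```

In the real function the delay is recomputed from the clock after every nap, so an early or late wake-up is corrected on the next pass rather than accumulating.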
 

Cheers
Dave


Re: [Tutor] Global presets ?

2004-12-04 Thread Dave S
Brian van den Broek wrote:
Hi Dave, Kent, and all,
I have a caution about the
 from Config import *
idiom that Kent didn't mention.
It can lead to namespace pollution, in that if you have a module 'foo' 
with a name 'bar' and you are writing a script which says
 from foo import *
you have to be very careful that your script doesn't also assign to 
the name 'bar', else you may end up thinking you have two different 
things available when you don't. ('bar' will either point to your 
script's bar or to foo.bar, depending on whether you imported 
foo before or after your script's assignment to bar.)
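
The clash can be reproduced in a few lines. This is a sketch for Python 3; the module 'foo' is built in memory with types.ModuleType, and each "script" is run with exec() in its own namespace, purely so the example is self-contained.

```python
import sys
import types

# Hypothetical stand-in for a module 'foo' that defines the name 'bar'.
foo = types.ModuleType("foo")
foo.bar = "foo's bar"
sys.modules["foo"] = foo

# Script 1: assigns bar first, then does the star import.
ns1 = {}
exec("bar = 'my bar'\nfrom foo import *", ns1)

# Script 2: does the star import first, then assigns bar.
ns2 = {}
exec("from foo import *\nbar = 'my bar'", ns2)

print(ns1["bar"])  # prints: foo's bar
print(ns2["bar"])  # prints: my bar
```

Whichever statement runs last wins, and nothing warns you that the earlier binding was replaced.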

The first time this bites you, it can eat up hours of your life. (But 
I'm not bitter;-)

I avoid this by using the
 import examplemodule as em
That imports everything so that you access it by
 em.some_name
rather than
 examplemodule.some_name
I find that really handy for the handful of utility modules I import 
into most of my scripts. Then, I just have to be sure to avoid a small 
set of names -- 'em' in this case. And my python files have nice 
descriptive names, but I only have to type them once.

Best,
Brian vdB
Thanks for pointing that out. I'm a bit of a Python coward and opted for
from config import data_dir,HTML_addr
Mainly so I can see where these variables come from. I have never seen 
the 'as' operator on an import before, so much to learn (and remember) ;-)

Dave


Re: [Tutor] Accuracy of time.sleep()

2004-12-05 Thread Dave S
Jacob S. wrote:
You know, since time.sleep() builds up errors, this is what I do to keep it
purely pythonic... (not tested)
from time import gmtime
alarmhr = 8
alarmmin = 5
alarmsec = 0
while 1:
   t = gmtime()
   hour = t[3]
   min = t[4]
   sec = t[5]
   if (alarmhr,alarmmin,alarmsec) == (hour,min,sec):
       print "It is 8:05 AM. Please do whatever you are supposed to at this time."
       raw_input()
       break
 

Yep this is an option that makes sense to me, getting time once & 
breaking it down with []'s to avoid the trap I almost fell into. I know 
cron is probably the way to go but it's kind of nice to keep it all 
Python if you know what I mean ;-)

Dave


[Tutor] Python structure advice ?

2004-12-15 Thread Dave S
I'm sorry to bang on about Python structure, but I do struggle with it, 
having in the past got into very bad habits with loads of BASIC where 
everything was global, and Forth, and hand coded 8031, 8051, 6502. I 
can't get my head round how you guys handle a modern structured language 
:-) 

(PS before anyone flames me - I think Python is great and am determined 
to learn it ;-) )

I have ended up with my application in several separate directories.
I have 'live_datad', a daemon that extracts web data at preset times and 
archives it; this will be run as a thread, and possibly using a queue 
... (still digesting info from query about IPCing)

I have a 'data_core' which accepts data from either live_datad real time 
or the archive for testing, it builds up a large multi dimensional array 
with various pointers into the array.

I have a statistical module 'data_stats' which analyses the array, 
pulling various stats.

And finally I have an analytical module 'data_predict' which using the 
output from 'data_stats' & data directly from the 'data_core' outputs 
statistical predictions of future data.

I have written my 'live_datad', I have written my 'data_core' & have a 
fairly good idea how to write the rest.

My problem is that pretty much all the modules need to fix where they 
are when they exit and pick up from that point later on, i.e. more data 
comes from live_datad, it is passed to 'data_core' which updates the 
matrix, then 'data_stats' then 'data_predict', all called from the main 
script.  This is OK till the main script realizes that more data is 
available from 'live_datad', passes it to 'data_core' which must remember 
where it was and move on, and the same for the rest of the modules. To 
make the problem more acute the modules may not be called in exactly the 
same order depending on what I am trying to achieve.

The 'remembering where it was' seems a continuous stumbling block for 
me. I have thought of coding each module as a class but this seems like a 
cheat. I could declare copious globals, but this seems messy. I could define 
each module as a thread & get them talking via queues; I have given this 
serious thought but heeded the warnings in previous posts. I have thought 
about returning a list of saved 'pointers' which would be re-submitted 
when the function is called. I don't know which way to turn.

With my code now running to a few hundred lines (Don't laugh this is BIG 
for me :-D ) I am going to have to make a structure decision and any 
suggestions would be appreciated.

How would you approach it ?
Dave



Re: [Tutor] Python structure advice ?

2004-12-16 Thread Dave S

Having written this email, it has put my thoughts in order, though it 
seems a bit cheaty, wouldn't defining all modules that have to remember 
their internal state as classes be the best bet ?

Dave


Re: [Tutor] Python structure advice ?

2004-12-17 Thread Dave S
Kent Johnson wrote:
Dave S wrote:
Dave S wrote:
The 'remembering where it was' seems a continuous stumbling block 
for me. I have thought of coding each module as a class but this 
seems like a cheat. I could declare copious globals, but this seems 
messy. I could define each module as a thread & get them talking via 
queues; I have given this serious thought but heeded the warnings in 
previous posts. I have thought about returning a list of saved 
'pointers' which would be re-submitted when the function is called. 
I don't know which way to turn.

Having written this email, it has put my thoughts in order, though it 
seems a bit cheaty, wouldn't defining all modules that have to 
remember their internal state as classes be the best bet ?

Dave

Why do you say this is 'cheaty'? A class is basically a collection of 
data (state) and functions to operate on that state.
Sorry for the delay, real world work got in the way ...
Well I understand classes to be used when multiple instances are 
required, I will only need one instance and as such it seemed a bit of a 
cheat, The trouble is I now pretty well understand the tools, but don't 
know how you guys use them in the real world.

You might be interested in this essay:
http://www.pycs.net/users/323/stories/15.html

I found this particularly useful,
It might well make sense to organize your program as a collection of 
cooperating classes, or maybe a collection of classes with a top-level 
function that stitches them all together.

Yes, this is the way I see things progressing, from 20,000ft this makes 
a lot of sense.

You might also want to learn about iterator classes and generator 
functions, they are a technique for returning a bit of data at a time 
while maintaining state. You might be able to structure your input 
stage as an iterator or generator.
http://docs.python.org/tut/node11.html#SECTION001190
http://docs.python.org/lib/typeiter.html
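
As a rough illustration of a generator keeping state between values (Python 3): the running total and count live inside the generator and survive from one yield to the next. The function running_mean is a made-up example, not part of the project discussed in this thread.

```python
def running_mean(samples):
    """Yield the mean of everything seen so far, one value per input sample."""
    total = 0.0
    count = 0
    for x in samples:
        total += x          # state is kept in local variables...
        count += 1
        yield total / count  # ...and preserved while the generator is paused


print(list(running_mean([2, 4, 6])))  # prints: [2.0, 3.0, 4.0]
```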
I remember iterators from 'Learning Python'. I was concerned about 
several modules all 'having an iterator' to the next, debugging would be 
scary ! I think I will go the class route.

Kent


Re: [Tutor] Python structure advice ?

2004-12-17 Thread Dave S
Sorry for the delay, real world work took me away ...
everything was global, how you guys handle a modern structured
language
   

Don't worry this is one of the hardest bad habits to break.
You are not alone. The easiest way is to just pass the data
from function to function in the function parameters. It's not
at all unusual for functions to have lots of parameters, "global"
programmers tend to panic when they have more than a couple,
 

yep !
but it's not at all bad to have 5 or 6 - more than that gets
unwieldy I admit and is usually time to start thinking about
classes and objects.
 

I have ended up with my application in several separate directories.
   

Separate modules is good. Separate directories for anything
other than big programs (say 20 or more files?) is more hassle
than it's worth. The files are better kept in a single directory
IMHO. The exception being modules designed for reuse...
It just makes life simpler!
 

I've tried to be hyper organized and added my dirs in
/usr/lib/python2.3/site-packages/mypath.pth
/home/dave/mygg/gg1.3/live_datad
/home/dave/mygg/gg1.3/logger
/home/dave/mygg/gg1.3/utils
/home/dave/mygg/gg1.3/datacore
/home/dave/mygg/gg1.3
/home/dave/mygg/gg1.3/configs
This works OK but I sometimes have to search around a bit to find where 
the modules are.

Probably part of the problem is I tend to write lots of small modules, 
debug them & then import them into one controlling script. It works OK 
but I start to drown in files, e.g. my live_datad contains ...

exact_sleep.py   garbage_collect.py   gg ftsed.e3p  html_strip.py   
live_datad.py  valid_day.pyc
exact_sleep.pyc  garbage_collect.pyc  gg ftsed.e3s  html_strip.pyc  
valid_day.py

When I get more experienced I will try & write fewer, bigger modules :-)
 

My problem is that pretty much all the modules need to fix where
   

they
 

are when they exit and pick up from that point later on,
   

There are two "classic" approaches to this kind of problem:
1) batch oriented - each step of the process produces its own
output file or data structure and this gets picked up by the
next stage. This usually involves processing data in chunks
- writing the first dump after every 10th set of input say.
This is a very efficient way of processing large chunks of
data and avoids any problems of synchronisation since the
output chunks form the self contained input to the next step.
And the input stage can run ahead of the processing or the
processing ahead of the input. This is classic mainframe
strategy, ideal for big volumes. BUT it introduces delays
in the end to end process time, it's not instant.
 

I see your point, like a static chain, one calling the next & passing 
data, the problem being that the links of the chain will need to 
remember their previous state when called again, so their output is a 
function of previous data + fresh data. I guess their state could be 
written to a file, then re-read.

2) Real time serial processing, typically constructs a
processing chain in a single process. Has a separate thread
reading the input data 

Got that working live_datad ...
and kicks off a separate processing
thread (or process) for each bit of data received. Each
thread then processes the data to completion and writes
the output.
OK
A third process or thread then assembles the
outputs into a single report.
 

Interesting ...
This produces results quickly but can overload the computer
if data starts to arrive so fast that the threads start to
back up on each other. Also error handling is harder since
with the batch job data errors can be fixed at the
intermediate files but with this an error anywhere means
that whole data processing chain will be broken with no way
to fix it other than resubmitting the initial data.
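
A much simplified sketch of the threaded chain in Python 3, using the standard queue and threading modules. It uses one reader thread and a single worker thread rather than one thread per item, and the names run_pipeline, source and work are assumptions for illustration.

```python
import queue
import threading


def run_pipeline(source, work):
    """Read items on one thread, process them on another, collect the results."""
    inbox = queue.Queue()
    results = []

    def reader():
        for item in source:
            inbox.put(item)
        inbox.put(None)              # sentinel: no more input

    def worker():
        while True:
            item = inbox.get()
            if item is None:         # sentinel seen, stop processing
                break
            results.append(work(item))

    threads = [threading.Thread(target=reader), threading.Thread(target=worker)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


print(run_pipeline([1, 2, 3], lambda x: x * x))  # prints: [1, 4, 9]
```

An unbounded Queue is what lets the input run ahead; it is also where the backlog builds up if data arrives faster than it can be processed.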
 

An interesting idea, I had not thought of this approach as an option 
even with its stated drawbacks. It's given me an idea for some scripting 
I have to do later on ...

With my code now running to a few hundred lines
(Don't laugh this is BIG for me :-D )
   

It's big for me in Python, I've only written one program with
more than a thousand lines of Python whereas I've written
many C/C++ programs in excess of 10,000 lines 

Boy am I glad I chose to learn Python rather than C++, I'd probably still 
be at 'hello world' ;-)

and worked
on several of more than a million lines. But few if any
Python programs get to those sizes.
HTH,
Alan G
Author of the Learn to Program web tutor
http://www.freenetpages.co.uk/hp/alan.gauld

 



Re: [Tutor] Python structure advice ?

2004-12-17 Thread Dave S
Jeff Shannon wrote:
Dave S wrote:
Kent Johnson wrote:
Why do you say this is 'cheaty'? A class is basically a collection 
of data (state) and functions to operate on that state.

Sorry for the delay, real world work got in the way ...
Well I understand classes to be used when multiple instances are 
required, I will only need one instance and as such it seemed a bit 
of a cheat, The trouble is I now pretty well understand the tools, 
but don't know how you guys use them in the real world.

For what it's worth, it seems to me to be perfectly normal to have 
classes that are only ever intended to have a single instance.  For 
example, you're never likely to need more than one HTML parser, and 
yet htmllib.HTMLParser is a class...
Well if it's good enough for a Python lib ...
As Kent said, the main point of a class is that you have a collection 
of data and operations on that data bundled together.  Whether you 
have one set of data to operate on, or many such sets, is mostly 
irrelevant (though classes are even more valuable when there *are* 
many sets of data).  Defining a class isn't so much a statement that 
"I want lots of things like this", as it is a declaration of 
modularity -- "This stuff all belongs together as a unit".
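
A minimal sketch (Python 3) of a class that is only ever instantiated once but still earns its keep by remembering where it got to. The class name DataCore and its methods are hypothetical, loosely modelled on the 'data_core' module described earlier in the thread.

```python
class DataCore(object):
    """Holds the accumulated rows and remembers how far they have been read."""

    def __init__(self):
        self.rows = []
        self.next_index = 0          # where the next stage should resume

    def add(self, new_rows):
        """Append freshly arrived data."""
        self.rows.extend(new_rows)

    def unprocessed(self):
        """Return only the rows added since the last call, then move the marker."""
        fresh = self.rows[self.next_index:]
        self.next_index = len(self.rows)
        return fresh


core = DataCore()
core.add([1, 2])
print(core.unprocessed())  # prints: [1, 2]
core.add([3])
print(core.unprocessed())  # prints: [3]
```

The state that would otherwise be a global or a returned list of 'pointers' simply lives on the instance.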

OK I'm reformed ('L' plate programmer), it's going to be classes :-)
Jeff Shannon
Technician/Programmer
Credit International


Re: [Tutor] Python structure advice ?

2004-12-17 Thread Dave S
Alan Gauld wrote:
1) batch oriented - each step of the process produces its own
output file or data structure and this gets picked up by the
next stage. This usually involves processing data in chunks
- writing the first dump after every 10th set of input say.
 

I see your point, like a static chain, one calling the next &
   

passing
 

data, the problem being that the links of the chain will need to
remember their previous state when called again, so their output is
   

a
 

function of previous data + fresh data. I guess their state could be
written to a file, then re-read.
   

Yes. Just to expand: the typical processing involves three files:
1) the input which is the output of the preceding stage
2) the output which will form input to the next stage
3) the job log. This will contain references to any input data
items that failed to process - typically these will be manually
inspected, corrected and a new file created and submitted at the
end of the batch run.
BUT 3) will also contain the sequence number of the last file and/or
last data item processed so that when the next cycle runs it knows
where to start. It is this belt and braces approach to data
processing and error recovery that makes mainframes so reliable,
not just the hardware, but the whole culture there is geared to
handling failure and being able to *recover* not just report on it.
After all it's the mainframes where the really mission critical
software of any large enterprise runs!
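
The job-log idea can be sketched in a few lines of Python 3. This assumes a JSON file as the log and ValueError as the only failure worth recording; the function run_batch and the keys last_seq and failed are made up for illustration.

```python
import json
import os
import tempfile


def run_batch(items, log_path, process):
    """Process items after the last logged sequence number; log any failures."""
    state = {"last_seq": -1, "failed": []}
    if os.path.exists(log_path):
        with open(log_path) as fh:
            state = json.load(fh)            # pick up where the last run stopped

    outputs = []
    for seq, item in enumerate(items):
        if seq <= state["last_seq"]:
            continue                         # already handled in an earlier run
        try:
            outputs.append(process(item))
        except ValueError:
            state["failed"].append(seq)      # keep a reference for manual repair
        state["last_seq"] = seq

    with open(log_path, "w") as fh:
        json.dump(state, fh)                 # record the restart point
    return outputs


log_file = os.path.join(tempfile.mkdtemp(), "job.log")
print(run_batch(["1", "x", "3"], log_file, int))       # prints: [1, 3]
print(run_batch(["1", "x", "3", "4"], log_file, int))  # prints: [4]
```

The second run skips everything already covered by the log and only processes the new item, while item 1 stays listed as failed for later inspection.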
As an ex Unix head I learned an awful lot about reliable computing
from the 18 months I spent working on a mainframe project. These
guys mostly live in a highly specialised microcosm of their own
but they have learned a lot of powerful tricks over the last 40
years that the rest of us ignore at our peril. I strongly
recommend that anyone who gets the chance of *a short* contract
in mainframe land, with training, to grab the opportunity with
both hands!
< Steps off soapbox now :-) >
Alan G
Author of the Learn to Program web tutor
http://www.freenetpages.co.uk/hp/alan.gauld

 

You get on that soapbox whenever you want :-) , it's good to hear a range 
of views !

Dave


Re: [Tutor] Python structure advice ?

2004-12-17 Thread Dave S
Kent Johnson wrote:
Dave S wrote:
Separate modules is good. Separate directories for anything
other than big programs (say 20 or more files?) is more hassle
than it's worth. The files are better kept in a single directory
IMHO. The exception being modules designed for reuse...
It just makes life simpler!
 

I've tried to be hyper organized and added my dirs in
/usr/lib/python2.3/site-packages/mypath.pth
/home/dave/mygg/gg1.3/live_datad
/home/dave/mygg/gg1.3/logger
/home/dave/mygg/gg1.3/utils
/home/dave/mygg/gg1.3/datacore
/home/dave/mygg/gg1.3
/home/dave/mygg/gg1.3/configs
This works OK but I sometimes have to search around a bit to find 
where the modules are.

Probably part of the problem is I tend to write lots of small 
modules, debug them & then import them into one controlling script. 
It works OK but I start to drown in files, e.g. my live_datad contains ...

exact_sleep.py   garbage_collect.py   gg ftsed.e3p  html_strip.py   
live_datad.py  valid_day.pyc
exact_sleep.pyc  garbage_collect.pyc  gg ftsed.e3s  html_strip.pyc  
valid_day.py

When I get more experienced I will try & write fewer, bigger modules :-)

It's just a guess from the filenames, but it looks like your 
live_datad package (directory) contains everything needed by 
live_datad.py. 
Spot on
I would like to suggest a different organization.
I tend to organize packages around a single functional area, and by 
looking at the dependencies of the modules in the package on other 
packages.

For example, in my current project some of the packages are:
- common.util - this is a catchall for modules that are not specific 
to this application, and don't depend on any other packages
- common.db - low-level database access modules
- cb.data - application-specific database access - the data objects 
and data access objects that the application works with
- cb.import - modules that import legacy data into the application
- cb.writer - modules that generate files
- cb.gui - GUI components
- cb.app - application-level drivers and helpers

I have been getting in a muddle, html_strip.py, strips HTML, mines for 
data & when it finds specific patterns returns a dictionary containing them.

However I also use one of its functions in a utility convert_data.py 
reading in archived semi-processed HTML files.  This cross dependence 
has occurred several times and is getting messy, yours is an interesting 
approach, it's started me thinking...

Anyway, the point is, if you organize your modules according to what 
they do, rather than by who uses them, you might make a structure that 
is less chaotic.

HTH
Kent


Re: [Tutor] Expanding a Python script to include a zcat and awk pre-process

2010-01-09 Thread Dave Angel

galaxywatc...@gmail.com wrote:
After many more hours of reading and testing, I am still struggling to 
finish this simple script, which bear in mind, I already got my 
desired results by preprocessing with an awk one-liner.


I am opening a zipped file properly, so I did make some progress, but 
simply assigning num1 and num2 to the first 2 columns of the file 
remains elusive. Num3 here gets assigned, not to the 3rd column, but 
the rest of the entire file. I feel like I am missing a simple strip() 
or some other incantation that prevents the entire file from getting 
blobbed into num3. Any help is appreciated in advance.


#!/usr/bin/env python

import string
import re
import zipfile
highflag = flagcount = sum = sumtotal = 0
f = file("test.zip")
z = zipfile.ZipFile(f)
for f in z.namelist():
    ranges = z.read(f)
This reads the whole file into ranges.  In your earlier incantation, you 
looped over the file, one line at a time.  So to do the equivalent, you 
want to do a split here, and add one more nesting of loops.

   lines = z.read(f).split("\n")    # build a list of text lines
   for ranges in lines:             # here, ranges is a single line

and of course, indent the remainder.

    ranges = ranges.strip()
    num1, num2, num3 = re.split('\W+', ranges, 2)  ## This line is the root of the problem.

    sum = int(num2) - int(num1)
    if sum > 1000:
        flag1 = " "
        flagcount += 1
    else:
        flag1 = ""
    if sum > highflag:
        highflag = sum
    print str(num2) + " - " + str(num1) + " = " + str(sum) + flag1
    sumtotal = sumtotal + sum

print "Total ranges = ", sumtotal
print "Total ranges over 10 million: ", flagcount
print "Largest range: ", highflag

==
$ zcat test.zip
134873600, 134873855, "32787 Protex Technologies, Inc."
135338240, 135338495, 40597
135338496, 135338751, 40993
201720832, 201721087, "12838 HFF Infrastructure & Operations"
202739456, 202739711, "1623 Beseau Regional de la Region Languedoc Roussillon"
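
Putting the line-by-line advice together, here is a sketch in Python 3 (the thread's code is Python 2). It builds a tiny archive in memory so it can run on its own; the archive member name ranges.txt and the helper first_two_columns are assumptions, and splitting on the first two commas is one possible way to keep the quoted third column from interfering.

```python
import io
import zipfile


def first_two_columns(zip_bytes):
    """Return (num1, num2) for every non-empty line of every file in the zip."""
    spans = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as z:
        for name in z.namelist():
            text = z.read(name).decode("utf-8")     # whole member as one string
            for line in text.splitlines():          # now work one line at a time
                line = line.strip()
                if not line:
                    continue
                num1, num2, rest = line.split(",", 2)   # rest may contain commas
                spans.append((int(num1), int(num2)))
    return spans


# Build a two-line sample archive in memory, using rows from the data above.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr(
        "ranges.txt",
        '134873600, 134873855, "32787 Protex Technologies, Inc."\n'
        "135338240, 135338495, 40597\n",
    )

spans = first_two_columns(buf.getvalue())
print([hi - lo for lo, hi in spans])  # prints: [255, 255]
```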






___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] what is the equivalent function to strtok() in c++

2010-01-10 Thread Dave Angel

sudhir prasad wrote:

hi,
what is the equivalent function to strtok() in c++,
what i need to do is to divide a line into different strings and store them
in different lists,and write them in to another file

  
If your tokens are separated by whitespace, you can simply use a single 
call to split().  It will turn a single string into a list of tokens.


line = "Now   is the time"
print line.split()

will display the list:
['Now', 'is', 'the', 'time']
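
If the tokens are separated by a set of delimiter characters, as with strtok(), one option is re.split with a character class. This is a sketch for Python 3; the function name tokenize and the default delimiter set are assumptions for illustration.

```python
import re


def tokenize(line, delimiters=" ,;\t"):
    """Split line on runs of any delimiter character, dropping empty tokens."""
    pattern = "[" + re.escape(delimiters) + "]+"
    return [tok for tok in re.split(pattern, line) if tok]


print(tokenize("a, b;;c\td"))  # prints: ['a', 'b', 'c', 'd']
```

Like strtok(), a run of adjacent delimiters counts as a single separator.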

HTH
DaveA




Re: [Tutor] Keeping a list of attributes of a certain type

2010-01-14 Thread Dave Angel
(You top-posted, which puts your two comments out of order.  Now the 
solution comes before the problem statement)


Guilherme P. de Freitas wrote:

Ok, I got something that seems to work for me. Any comments are welcome.


class Member(object):
    def __init__(self):
        pass


class Body(object):
    def __init__(self):
        self.members = []

    def __setattr__(self, obj, value):
        if isinstance(value, Member):
            self.members.append(obj)
            object.__setattr__(self, obj, value)
        else:
            object.__setattr__(self, obj, value)

    def __delattr__(self, obj):
        if isinstance(getattr(self, obj), Member):
            self.members.remove(obj)
            object.__delattr__(self, obj)
        else:
            object.__delattr__(self, obj)



john = Body()
john.arm = Member()
print(john.members)
del john.arm
print(john.members)


On Wed, Jan 13, 2010 at 6:24 PM, Guilherme P. de Freitas
 wrote:
  

Hi everybody,

Here is my problem. I have two classes, 'Body' and 'Member', and some
attributes of 'Body' can be of type 'Member', but some may not. The
precise attributes that 'Body' has depend from instance to instance,
and they can be added or deleted. I need any instance of 'Body' to
keep an up-to-date list of all its attributes that belong to the class
'Member'. How do I do this?

Best,

Guilherme

--
Guilherme P. de Freitas
http://www.gpfreitas.com


If this is a class assignment, you've probably got the desired answer.  
But if it's a real-world problem, there are tradeoffs, and until those 
are known, the simplest solution is usually the best.  Usually, what's 
desired is that the object behaves "as if" it has an up-to-date list of...


If order of the list doesn't matter, I'd consider simply writing a 
single method, called 'members' which does a list comprehension of the 
object's attributes, calculating the list when needed.  Then use a 
decorator to make this method look like a read-only data attribute  
(untested):


class Body (object):
    @property
    def members(self):
        return [obj for ... if ...]
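
One possible way to fill in the property idea, as a minimal sketch. It assumes every Member is stored directly in the instance's own __dict__; the filter used here is a guess at the elided parts, not necessarily what was intended:

```python
class Member(object):
    pass


class Body(object):
    @property
    def members(self):
        # Recompute the list on every access, so it can never go stale.
        # vars(self) is the instance __dict__: attribute name -> value.
        return [name for name, value in vars(self).items()
                if isinstance(value, Member)]


john = Body()
john.arm = Member()
john.age = 42            # not a Member, so it is ignored
print(john.members)      # ['arm']
del john.arm
print(john.members)      # []
```

Nothing has to be kept in sync by __setattr__ or __delattr__; the cost is a scan of the attributes on each access.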

DaveA



___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Searching in a file

2010-01-15 Thread Dave Angel

Paul Melvin wrote:

Hi,

Thanks very much to all your suggestions, I am looking into the suggestions
of Hugo and Alan.

The file is not very big, only 700KB (~2 lines), which I think should be
fine to be loaded into memory?

I have two further questions though please, the lines are like this:


Revenge
(2011)



5 days 


65 minutes 

Etc with a chunk (between each NEW) being about 60 lines, I need to extract
info from these lines, e.g. /browse/post/5354361/ and Revenge (2011) to pass
back to the output, is re the best option to get all these various bits,
maybe a generic function that I pass the search strings too?

And if I use the split suggestion of Alan's I assume the last one would be
the rest of the file, would the next() option just let me search for the
next /browse/post/5354361/ etc after the NEW? (maybe putting this info into
a list)

  
One way to handle "the rest of the file" is to add a marker at the end 
of the data.  So if you read the whole thing with readlines(), you can 
append another "NEW" so that all matches are between one NEW and the next.
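
The end-marker idea can be sketched roughly like this. The function name and the sample lines are made up for illustration, and the plain substring test for the marker is an assumption about the data:

```python
def split_into_chunks(lines, marker="NEW"):
    """Return the groups of lines found between consecutive markers."""
    # Appending one extra marker means the last chunk is also closed,
    # so there is no special case for "the rest of the file".
    lines = list(lines) + [marker]
    chunks = []
    current = None              # None until the first marker is seen
    for line in lines:
        if marker in line:
            if current is not None:
                chunks.append(current)
            current = []
        elif current is not None:
            current.append(line)
    return chunks


sample = ["header", "NEW", "Revenge", "(2011)", "NEW", "5 days", "65 minutes"]
for chunk in split_into_chunks(sample):
    print(chunk)
```

Each chunk can then be searched on its own for the pieces that are needed.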

Thanks again

paul

  
If this file is valid html, or xml, then perhaps you should use one of 
the html or xml parsing tools, rather than anything so esoteric as 
regex.  In any case, it now appears that NEW won't necessarily be 
unique, so you might want to start with  'alt="NEW"'  or something like 
that.  A key question becomes whether this data was automatically 
generated, or whether it might have variations from one sample to the 
next.  (for example,  alt ="NEW"  with different spacing.  or  
ALT="NEW")  And whether it's definitely valid html, or just close.


DaveA



Re: [Tutor] Replacing the string in a file

2010-01-22 Thread Dave Angel



vanam wrote:

Thanks for your mail.

As you have suggested i have  changed the mode to 'rw' but it is
throwing up an error as below

***
IOError: [Errno 22] invalid mode ('rw') or filename: 'data.txt'
***
I am using python 2.6.4.

But Script is managed to pass with 'a+' mode/r+ mode.

log = open('data.txt','r+/a+')
for x in log:
 x = x.replace('Python','PYTHON')
 print x,
log.close()

It had properly written and replaced Python to PYTHON.

Thanks for your suggestion.

  

  
That won't work.  Better test it some more.  Without some form of 
write() call, you're not changing the file.



There are several workable suggestions in this thread, and I think 
fileinput is the easiest one.
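
To make the point about write() concrete, here is a minimal sketch that reads the text, replaces it, and writes it back explicitly. The helper name is hypothetical, and it assumes the file is small enough to hold in memory:

```python
def replace_in_file(path, old, new):
    """Rewrite the file at path with every occurrence of old replaced by new."""
    with open(path, "r") as f:
        text = f.read()
    # Nothing on disk changes until something is actually written.
    with open(path, "w") as f:
        f.write(text.replace(old, new))
```

For the case in this thread the call would be replace_in_file('data.txt', 'Python', 'PYTHON').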


DaveA



Re: [Tutor] [File Input Module]Replacing string in a file

2010-01-28 Thread Dave Angel

vanam wrote:

Hi all,

As it was suggested before in the mailing list about the query
regarding replacing string in the file, i have used the module File
input for replacing the string in the file.

For understanding and execution purpose, i have just included Python
as a string in the file and want it to be replaced to PYTHON.

Below are my queries and code: (Correct me if my understanding is wrong???)

1))

import fileinput
x = fileinput.input('data.txt',inplace=0)
for line in x:
    line = line.replace('Python','PYTHON')
    print line,
x.close()

The above piece of code will not create any backup file but  it will
replace PYTHON (Print on the console) but not physically write to the
file.

2)))

import fileinput
x = fileinput.input('data.txt',inplace=1)
for line in x:
    line = line.replace('Python','PYTHON')
    print line,
x.close()

The above piece of code will create backup file but hidden (in the
form of bak file) and it will physically write to the file -- I have
verified the contents of data.txt after the file operation and it had
written successfully.But why it is not printing line i.e. string in
the file on the console.

  
When you use the inplace=true option, it will redirect standard output 
to the file.  So print is going there, and *not* to the console.  I 
don't know whether close() restores the original console or not.

3)))

import fileinput
x = fileinput.input('data.txt',inplace=1)
for line in x:
    line = line.replace('Python','PYTHON')
x.close()

The above piece of code after execution is wiping out the full
contents. But addition of print line, is exactly replacing the string,
what exactly addition of print is making difference???

  

See above.  Since you print nothing to sys.stdout, the output file is empty.

4)))

import fileinput
x = fileinput.input('data.txt',inplace=1,backup='content.txt')
for line in x:
    line = line.replace('Python','PYTHON')
    print line,
x.close()

The above piece is creating a backup file by name data.txtcontent.txt
(I am not sure whether created file name is correct or not?) and to
the back up file it had added previous content i.e., Python and it had
replaced the contents in data.txt to PYTHON

5)))

Suppose if data.txt has string Python written in Font size 72 and when
i display the string on the console ie. by below piece of code

import fileinput
x = fileinput.input('data.txt',inplace=0)
for line in x:
  print line,
x.close()

It wouldnt print with the same Font size on the console (This wont
prove anything wrong as the same font could be backed with a different
file name)

  
Text files have no concept of fonts or color.  Sometimes there are extra 
annotations in a file (eg. escape sequences) which can be interpreted by 
particular software as commands to change font, or change color, or even 
to reposition.  Examples of this would be html, postscript, rich-text, 
and ANSI escape sequences.


But those escape sequences will only be meaningful to a program that 
understands them.  So if you print html files out to the console, you'll 
see lots of angle brackets and such, rather than seeing the pretty 
display intended to show in a browser.  If you print to a console, it's 
up to that console to process some escape sequences (eg. ANSI) or not.
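
Going back to the inplace behaviour discussed under 2))) and 3))) above, a minimal sketch of the pattern follows. It is written in Python 3 syntax rather than the Python 2 used in this thread, and the function name and the '.bak' suffix are just illustrative choices:

```python
import fileinput


def upcase_python(path):
    """Replace 'Python' with 'PYTHON' in the file at path, in place."""
    # With inplace=True, fileinput redirects sys.stdout to the file,
    # so every line that should survive has to be printed.
    with fileinput.input(path, inplace=True, backup=".bak") as lines:
        for line in lines:
            # end="" because each line already carries its own newline
            print(line.replace("Python", "PYTHON"), end="")
```

Leaving out the print call would produce an empty file, which matches the result reported for 3))).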

Do let me know if my understanding is correct.
  



Re: [Tutor] hash value input

2010-01-30 Thread Dave Angel

spir wrote:

On Fri, 29 Jan 2010 08:23:37 -0800
Emile van Sebille  wrote:

  

So, how does python do this?
 
  

Start here...

http://effbot.org/zone/python-hash.htm



Great, thank you!
From the above pointed page:

For ordinary integers, the hash value is simply the integer itself (unless 
it’s -1).

class int:
    def __hash__(self):
        value = self
        if value == -1:
            value = -2
        return value

I'm surprised of this, for this should create as many indexes (in the underlying array 
actually holding the values) as there are integer keys. With possibly huge holes in the 
array. Actually, there will certainly be a predefined number of indexes N, and the 
integers be further "modulo-ed" N. Or what?
I would love to know how to sensibly chose the number of indexes. Pointers 
welcome (my searches did not bring any clues on the topic).

  

Emile



Denis


la vita e estrany

http://spir.wikidot.com/

  
I haven't seen the sources, so I'm making an educated guess based on 
things I have seen. The index size grows as the size of the dictionary 
grows, and the formula is not linear. Neither are the sizes obvious 
powers of two or suchlike. I doubt if you have any control over it, 
however. The hash() function returns an int (32 bits on 32bit python), 
which is then converted to the bucket number, probably by a simple 
modulo function.


In the case of integers, it's the modulo which distributes the integers 
among the buckets. If all the integer keys are consecutive, then modulo 
distributes them perfectly. If they're random, then it'll usually work 
pretty well, but you could hit a pattern which puts lots of values in 
one bucket, and not many in the others. If the index size is 22, and all 
your numbers are multiple of 22, then it might degenerate to effectively 
one bucket.
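
A toy model of that effect, assuming a plain "hash modulo number of buckets" scheme. This only illustrates the guess above; it is not a description of how the real dictionary is implemented:

```python
from collections import Counter


def bucket_counts(keys, n_buckets):
    """Count how many keys land in each bucket under a plain modulo."""
    return Counter(hash(key) % n_buckets for key in keys)


n_buckets = 22   # the index size used in the example above

consecutive = bucket_counts(range(220), n_buckets)
multiples = bucket_counts(range(0, 220 * 22, 22), n_buckets)

print(len(consecutive), max(consecutive.values()))   # 22 buckets, 10 keys each
print(len(multiples), max(multiples.values()))       # 1 bucket holding all 220
```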


BTW, the referenced article does have a contradiction. For a long int 
whose value is between 16 and 31 bits, the described approach will not 
generate the same hash as the int of the same value. So that 15 bit 
shift algorithm must have some other subtlety to it, perhaps only 
starting with bit 31 or so.


DaveA




Re: [Tutor] parse text file

2010-02-02 Thread Dave Angel

Norman Khine wrote:

thanks denis,

On Tue, Feb 2, 2010 at 9:30 AM, spir  wrote:
  

On Mon, 1 Feb 2010 16:30:02 +0100
Norman Khine  wrote:



On Mon, Feb 1, 2010 at 1:19 PM, Kent Johnson  wrote:
  

On Mon, Feb 1, 2010 at 6:29 AM, Norman Khine  wrote:



thanks, what about the whitespace problem?
  

\s* will match any amount of whitespace including newlines.


thank you, this worked well.

here is the code:

###
import re
file = open('producers_google_map_code.txt', 'r')
data = repr( file.read().decode('utf-8') )

block = re.compile(r"""openInfoWindowHtml\(.*?\\ticon: myIcon\\n""")
b = block.findall(data)
block_list = []
for html in b:
  namespace = {}
  t = re.compile(r"""(.*)<\/strong>""")
  title = t.findall(html)
  for item in title:
    namespace['title'] = item
  u = re.compile(r"""a href=\"\/(.*)\">En savoir plus""")
  url = u.findall(html)
  for item in url:
    namespace['url'] = item
  g = re.compile(r"""GLatLng\((\-?\d+\.\d*)\,\\n\s*(\-?\d+\.\d*)\)""")
  lat = g.findall(html)
  for item in lat:
    namespace['LatLng'] = item
  block_list.append(namespace)

###

can this be made better?
  

The 3 regex patterns are constants: they can be put out of the loop.

You may also rename b to blocks, and find a more accurate name for 
block_list; eg block_records, where record = set of (named) fields.

A short desc and/or example of the overall and partial data formats can greatly 
help later review, since regex patterns alone are hard to decode.



here are the changes:

import re
file = open('producers_google_map_code.txt', 'r')
data = repr( file.read().decode('utf-8') )

get_record = re.compile(r"""openInfoWindowHtml\(.*?\\ticon: myIcon\\n""")
get_title = re.compile(r"""(.*)<\/strong>""")
get_url = re.compile(r"""a href=\"\/(.*)\">En savoir plus""")
get_latlng = re.compile(r"""GLatLng\((\-?\d+\.\d*)\,\\n\s*(\-?\d+\.\d*)\)""")

records = get_record.findall(data)
block_record = []
for record in records:
    namespace = {}
    titles = get_title.findall(record)
    for title in titles:
        namespace['title'] = title
    urls = get_url.findall(record)
    for url in urls:
        namespace['url'] = url
    latlngs = get_latlng.findall(record)
    for latlng in latlngs:
        namespace['latlng'] = latlng
    block_record.append(namespace)

print block_record
  

The def of "namespace" would be clearer imo in a single line:
   namespace = {title:t, url:url, lat:g}



i am not sure how this will fit into the code!

  

This also reveals a kind of name confusion, doesn't it?


Denis




Your variable 'file' is hiding a built-in name for the file type.  No 
harm in this example, but it's a bad habit to get into.


What did you intend to happen if the number of titles, urls, and latlngs 
are not each exactly one?  As you have it now, if there's more than one, 
you spend time adding them all to the dictionary, but only the last one 
survives.  And if there aren't any, you don't make an entry in the 
dictionary.


If that's the exact behavior you want, then you could replace the loop 
with an if statement:   (untested)


if titles:
    namespace['title'] = titles[-1]


On the other hand, if you want a None in your dictionary for missing 
information, then something like:  (untested)


for record in records:
    titles = get_title.findall(record)
    title = titles[-1] if titles else None
    urls = get_url.findall(record)
    url = urls[-1] if urls else None
    latlngs = get_latlng.findall(record)
    latlng = latlngs[-1] if latlngs else None
    block_record.append( {'title':title, 'url':url, 'latlng':latlng} )
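
A self-contained sketch of the same "last match or None" idea, small enough to run on its own. The patterns, the sample records and the helper last_or_none are all made up for illustration and are much simpler than the real data:

```python
import re

# Hypothetical patterns and data, much simpler than the real ones.
get_title = re.compile(r"<strong>(.*?)</strong>")
get_url = re.compile(r'href="/(.*?)"')

records = [
    '<strong>Farm A</strong> <a href="/farm-a">En savoir plus</a>',
    '<strong>Farm B</strong> no link here',
]


def last_or_none(matches):
    """Return the last match, or None if the list is empty."""
    return matches[-1] if matches else None


block_record = []
for record in records:
    block_record.append({
        'title': last_or_none(get_title.findall(record)),
        'url': last_or_none(get_url.findall(record)),
    })

print(block_record)
```

Every dictionary ends up with the same keys, so later code can test for None instead of checking whether a key exists.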


DaveA


Re: [Tutor] Question about importing

2010-02-02 Thread Dave Angel

Eike Welk wrote:

On Tuesday February 2 2010 20:28:03 Grigor Kolev wrote:
  

Can I use something like this
#--
import sys
sys.path.append("/home/user/other")
import module
#-




Yes I think so. I just tried something similar:
--


IPython 0.10 -- An enhanced Interactive Python.

<--- snip >

In [1]: import sys

In [2]: 
sys.path.append("/home/eike/codedir/freeode/trunk/freeode_py/freeode/")


<--- snip >
<--- The next line is a special command of IPython: >

In [8]: !ls /home/eike/codedir/freeode/trunk/freeode_py/freeode/
ast.py   pygenerator.pyctest_1_interpreter.pyc   
test_pygenerator.pyc
ast.pyc  simlcompiler.pytest_2_interpreter.py  
test_simlcompiler.py
__init__.py  simlcompiler.pyc   test_2_interpreter.pyc 

<--- snip >



In [9]: import simlcompiler
---
ImportError   Traceback (most recent call last)

/home/eike/ in ()

/home/eike/codedir/freeode/trunk/freeode_py/freeode/simlcompiler.py in 
()

 36 import stat
 37 from subprocess import Popen #, PIPE, STDOUT
---> 38 import pyparsing
 39 import freeode.simlparser as simlparser
 40 import freeode.interpreter as interpreter

ImportError: No module named pyparsing


--
Well... the import fails, but it finds the module and starts to import it. 



HTH,
Eike.



  
I have no idea what freeode looks like, but I have a guess, based on your 
error messages.


I'd guess that you want to append without the freeode directory:


sys.path.append("/home/eike/codedir/freeode/trunk/freeode_py/")

and import with it.  That's because freeode is a package name, not a 
directory name (I can tell because __init__.py is present)

 import freeode.simlcompiler

See if that works any better.
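
The difference between a directory on sys.path and a package name can be tried out with a throwaway package. This is only a sketch; demo_pkg and helper are hypothetical names that have nothing to do with freeode:

```python
import os
import sys
import tempfile

# Build a throwaway package on disk:  <tmp>/demo_pkg/__init__.py
#                                     <tmp>/demo_pkg/helper.py
parent_dir = tempfile.mkdtemp()
package_dir = os.path.join(parent_dir, "demo_pkg")
os.mkdir(package_dir)
open(os.path.join(package_dir, "__init__.py"), "w").close()
with open(os.path.join(package_dir, "helper.py"), "w") as f:
    f.write("def greet():\n    return 'hello from demo_pkg.helper'\n")

# Append the directory that CONTAINS the package, not the package itself.
sys.path.append(parent_dir)

import demo_pkg.helper

print(demo_pkg.helper.greet())
```

With the package directory itself on sys.path, a plain "import helper" would find the file, but any "import demo_pkg.something" inside it would fail.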

DaveA



Re: [Tutor] SYS Long File Names?

2010-02-07 Thread Dave Angel

FT wrote:

Hi!

I was looking at the sys.argv(1) file name and it is the short 8 char
name. How do you place it into the long file name format? I was reading
music files and comparing the name to the directory listing and always comes
back as not found because the name was shortened.

So, how do you get it to read long file names?

Bruce


  
You need square brackets, not parentheses on the sys.argv.  But I'm 
guessing that's a typo in your message.


Some more information, please.  What version of Python, and what OS ?  
And how are you running this script?  If you're typing the script name 
at a DOS box, then the string you're seeing in sys.argv[1] is the one 
you typed on the command line.


If you start it some other way, please tell us how.

DaveA


Re: [Tutor] NameError: global name 'celsius' is not defined (actually, solved)

2010-02-10 Thread Dave Angel

David wrote:
Hello 
Wesley,


thanks for your reply. I was surprised about the limited information 
too. Sadly (?), I can't reproduce the error any more...


David



On 10/02/10 11:13, wesley chun wrote:
I just wrote this message, but after restarting ipython all worked 
fine.
How is it to be explained that I first had a namespace error which, 
after a
restart (and not merely a new "run Sande_celsius-main.py"), went 
away? I

mean, surely the namespace should not be impacted by ipython at all!?
 :
# file: Sande_celsius-main.py
from Sande_my_module import c_to_f
celsius = float(raw_input("Enter a temperature in Celsius: "))
fahrenheit = c_to_f(celsius)
print "That's ", fahrenheit, " degrees Fahrenheit"

# this is the file Sande_my_module.py
# we're going to use it in another program
def c_to_f(celsius):
    fahrenheit = celsius * 9.0 / 5 + 32
    return fahrenheit

When I run Sande_celsius-main.py, I get the following error:

NameError: global name 'celsius' is not defined
WARNING: Failure executing file:



Python interpreters including the standard one or IPython should tell
you a lot more than that. how are you executing this code? would it be
possible to do so from the command-line? you should get a more verbose
error message that you can post here.

best regards,
-- wesley


Your response to Wesley should have been here, instead of at the top.  
Please don't top-post on this forum.


I don't use iPython, so this is just a guess.  But perhaps the problem 
is that once you've imported the code, then change it, it's trying to 
run the old code instead of the changed code.
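
The general effect can be shown in plain Python 3, as a sketch. Whether this is what actually happened inside IPython is still only a guess, and stale_demo is a hypothetical module name written to a temporary directory:

```python
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True          # keep cached .pyc files out of the picture
module_dir = tempfile.mkdtemp()
module_path = os.path.join(module_dir, "stale_demo.py")
sys.path.append(module_dir)

with open(module_path, "w") as f:
    f.write("def c_to_f(celsius):\n    return celsius * 9.0 / 5\n")    # bug: no + 32

import stale_demo
before = stale_demo.c_to_f(100)

with open(module_path, "w") as f:
    f.write("def c_to_f(celsius):\n    return celsius * 9.0 / 5 + 32\n")  # fixed

still_stale = stale_demo.c_to_f(100)    # the session still holds the old code
importlib.reload(stale_demo)            # explicitly re-read the source file
after = stale_demo.c_to_f(100)

print(before, still_stale, after)       # 180.0 180.0 212.0
```

An imported module stays in memory until it is reloaded or the interpreter is restarted, which would fit the observation that a restart made the error go away.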


Try an experiment in your environment.  Deliberately add an error to 
Sande_celsius-main.py, and run it.  Then correct it, and run it again, 
to see if it notices the fix.


The changes I'd try in this experiment are to first change the name on 
the celsius= line to celsius2=
and after running and getting the error, change the following line to 
call celsius2().  If it gets an error, notice what symbol it complains 
about.


HTH
DaveA






Re: [Tutor] A Stuborn Tab Problem in IDLE

2010-02-14 Thread Dave Angel

Wayne Watson wrote:
I got to 
the dos command line facility and got to the file. I executed the 
program, and it failed with a syntax error. I can't copy it out of the 
window to paste here, but here's the code surrounding the problem: 
(arrow ==> points at the problem.
The console code shows [ missing. I SEE the syntax error. It's two 
lines above the line with the arrow. The code now works. Thanks very 
much. Console wins again!


 (I suspect you are not into matplotlib, but the plot requires a list 
for x and y in plot(x,y). xy[0,0] turns out to be a float64, which the 
syntax rejects. I put [] around it, and it works. Is there a better way?


ax1.plot([xy[0,0]],[xy[0,1]],'gs')
if npts == 90: # exactly 90 frames
ax1.plot([xy[npts-1,0]], xy[npts-1,1]],'rs') # mark it is 
a last frame

else:
ax1.plot([xy[npts-1,0]], ==>[xy[npts-1,1]],'ys') # mark 
90th frame in path

last_pt = len(xy[:,0])
ax1.plot([xy[npts-1,0]],[xy[npts-1,1]],'rs')

On 2/14/2010 6:18 PM, Wayne Watson wrote:
Well, command line was easy to get to. It's on the menu for python, 
but it gives me >>>.  How do I get to the folder with the py file?  
Can I switch to a c:\  type operation?


Back to exploring.

On 2/14/2010 5:05 PM, Alan Gauld wrote:


"Wayne Watson"  wrote
When I use F5 to execute a py program in IDLE, Win7, I get a tab 
error on an indented else. 


What happens if you execute from a command line? Do you get the same 
error?

If so look at the lines before.
If not try closing and restarting IDLE

HTH,

Alan G

Once you've discovered the DOS box, you should also discover QuickEdit 
mode.  In the DOS box, right click on the title bar, and choose 
"Properties".  First tab is Options.  Enable Quick-Edit mode, and press 
OK.  Now, you can drag a rectangle on the DOS box, and use right click 
to paste it to the clipboard. Practice a bit and you'll find it easy.  
An essential tool.


DaveA



Re: [Tutor] Getting caller name without the help of "sys._getframe(1).f_code.co_name" ?

2010-02-15 Thread Dave Angel

patrice laporte wrote:

2010/2/14 Luke Paireepinart 

  

I see why you would want the error messages but why is the default error
message not enough, that is why I am curious, and typically introspection on
objects is not necessary (for example, people often want to convert a string
into a variable name to store a value (say they have the string "foobar1"
and they want to store the value "f" in the variable "foobar1", how do they
change foobar1 to reference a string?  well you can just use exec but the
core issue is that there's really no reason to do it in the first place,
they can just use a dictionary and store dict['foobar1'] = 'f'  and it is
functionally equivalent (without the danger in the code)).  I get the
feeling that your issue is the same sort of thing, where an easier solution
exists but for whatever reason you don't see it.  I don't know if this is
true or not.  Here's my take on this:



class x(object):
    def __init__(self, fname):
        self.temp = open(fname).read()

a = x('foobar')
  

Traceback (most recent call last):
  File "", line 1, in# this is the
module it's in
a = x('foobar')#
this is the line where I tried to initialize it
  File "", line 3, in __init__  # which called
this function, which is the one that has the error
self.temp = open(fname).read()  #and the error
occurred while trying to perform this operation
IOError: [Errno 2] No such file or directory: 'foobar'  #and the error was
that the file 'foobar' could not be found.




Hi and thank to everybody...

First of all, I consider my first question is now answered : I wanted to get
rid of that sys._getframe call, I got an explanation (thanks to Kent).

The rest of the discussion is not about Python, it's more about the way of
thinking how to help the user having a good feeling with your app.

I try to clarify my need and share you my anxiety. Of course, a lot of thing
are available with Python, from a coder point of view. But what I want to do
is to think about the user, and give him a way to understand that what he
did was wrong.

Traceback give me all I need, but my opinion is that it's not acceptable to
give it back to the user without a minimum of "decorating". I didn't yet
look at the logging module, and maybe it can help me to make that
decorating.

And the user must be a priority (it's still my conviction here)

My own experience is that there is too much coder that forget the app they
work on is aim to be used by "real human", not by C/C++/Python/put what ever
you want here/ guru : if your app popups to the user a message that is just
what the traceback gave, it's not a good thing : How can it be reasonable to
imagine the user will read that kinda message ? :

*Traceback (most recent call last):
  File "", line 1, in 
a = x('foobar')
  File "", line 3, in __init__
self.temp = open(fname).read()
IOError: [Errno 2] No such file or directory: 'foobar'
*

Of course the origin of his problem is in the message : "*No such file or
directory: 'foobar'*", but a customer will never read that @ù^$#é uggly
message, there is too much extraterrestrial words in it.

Traceback doesn't give more than that, it doesn't say, as an example : we
(the name of app) was trying to open the file "foobar" in order to do
something with it (put here what it was supposed to do with the file) : app
failed to open it because "foobar" doesn't exist.

According to me, traceback is what we need during "coding" phases, but it's
not something to give to the user.

This problem has to be solved by thinking the app in the way I'm trying to
explain  (but not only in that way) : think about the user. This is not
something I expect Python to do for me, I'm just looking for everything
Python can provide me to make me think about the user.

I'm new to Python, and I make a lot of exploration to understand and answer
myself to my question. Python library is huge, and I don't have as much
time as I wanted to deal with it. But I'm convinced the solutions are here, I
don't try to re-invent the wheel...


Thanks to you all.

  
This makes lots of sense.  If the message doesn't make sense to the 
user, there's no point.  But why then is your thread titled "Getting 
caller name" ?  Why does the user care about the caller (function) 
name?  When you started the thread, it seemed clear that your user was a 
programmer, presumably who was adding code to your system and who wanted 
to see context error messages in coding terms.


If you have a duality of users, consider using a "DEBUG" variable, that 
changes the amount of detail you display upon an error.
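
A minimal sketch of such a switch, in Python 3 syntax. The names DEBUG and describe_error and the wording of the message are hypothetical; the point is only that one flag selects between the full traceback and a plain sentence:

```python
import traceback

DEBUG = False   # hypothetical switch: True for developers, False for end users


def describe_error(exc, what_we_tried, debug=DEBUG):
    """Build the text shown to whoever is sitting in front of the app."""
    if debug:
        # Developers get the full traceback.
        return "".join(traceback.format_exception(type(exc), exc, exc.__traceback__))
    # End users get one sentence in plain words.
    return "Sorry, we could not %s: %s" % (what_we_tried, exc)


try:
    open("foobar").read()
except IOError as exc:
    print(describe_error(exc, "open the file you asked for"))
```

The traceback can still be sent to a log file in both modes, so nothing is lost for the developer.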


DaveA



Re: [Tutor] Problem with "input" in Python 3

2010-02-15 Thread Dave Angel

Peter Anderson wrote:

Hi!

I am trying to teach myself how to program in Python using Zelle's 
"Python Programming: An Introduction to Computer Science" (a very good 
text). At the same time I have decided to start with Python 3 (3.1.1). 
That means that I have to convert Zelle's example code to Python 3 
(which generally I cope with).


I'm hoping that somebody can help with what's probably a very simple 
problem. There is a quadratic equation example involving multiple user 
inputs from the one "input" statement. The code works fine with Python 
2.5 but when I convert it to Python 3 I get error messages. The code 
looks like:


05 import math
06
07 def main():
08     print("This program finds the real solutions to a quadratic\n")
09
10     a, b, c = input("Please enter the coefficients (a, b, c): ")
11
12     '''
13     a = int(input("Please enter the first coefficient: "))
14     b = int(input("Please enter the second coefficient: "))
15     c = int(input("Please enter the third coefficient: "))
16     '''
17
18     discrim = b * b - 4 * a * c
19     ...

25 main()

Lines 13 to 15 are my Python 3 working solution but line 10 does not 
work in Python 3. When it runs it produces:


Please enter the coefficients (a, b, c): 1,2,3
Traceback (most recent call last):
File "C:\Program Files\Wing IDE 101 
3.2\src\debug\tserver\_sandbox.py", line 25, in 
File "C:\Program Files\Wing IDE 101 
3.2\src\debug\tserver\_sandbox.py", line 10, in main

builtins.ValueError: too many values to unpack
>>>

Clearly the problem lies in the input statement. If I comment out line 
10 and remove the comments at lines 12 and 16 then the program runs 
perfectly. However, I feel this is a clumsy solution.


Could somebody please guide me on the correct use of "input" for 
multiple values.


Regards,
Peter
The input() function in Python3 produces a string, and does not evaluate 
it into integers, or into a tuple, or whatever.  See for yourself by trying


  print ( repr(input("prompt ")) )

on both systems.


You can subvert Python3's improvement by adding an eval to the return value.
  a, b, c = eval(input("Enter exactly three numbers, separated by commas"))

is roughly equivalent to Python 2.x  input expression.  (Python 3's 
input is equivalent to Python 2.x  raw_input)
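
If evaluating whatever the user types is more than is wanted, the string can be split and converted by hand. A sketch, with parse_coefficients as a hypothetical helper and float chosen as the conversion:

```python
def parse_coefficients(text, count=3):
    """Turn a string such as '1, 2, 3' into a tuple of floats."""
    parts = text.split(",")
    if len(parts) != count:
        raise ValueError("expected %d comma-separated numbers, got %d"
                         % (count, len(parts)))
    # float() accepts surrounding whitespace, so ' 2' is fine.
    return tuple(float(part) for part in parts)


a, b, c = parse_coefficients("1, 2, 3")
print(a, b, c)     # 1.0 2.0 3.0
```

In the program itself the argument would be the result of input("Please enter the coefficients (a, b, c): ").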


DaveA


Re: [Tutor] Input() is not working as expected in Python 3.1

2010-02-15 Thread Dave Angel

Yaraslau Shanhin wrote:

Hello All,

I am working with Python tutorial in wiki and one of the exercises is as
follows:

Ask the user for a string, and then for a number. Print out that string,
that many times. (For example, if the string is hello and the number is 3 you
should print out hellohellohello.)

Solution for this exercise is:

text = str(raw_input("Type in some text: "))
number = int(raw_input("How many times should it be printed? "))print
(text * number)



Since in Python raw_input() function was renamed to input() according
to PEP 3111  I have
respectively updated this code to:


text = str(input("Type in some text: "))
number = int(input("How many times should it be printed? "))print
(text * number)



However when I try to execute this code in Python 3.1 interpreter
error message is generated:


Type in some text: some
How many times should it be printed? 3
Traceback (most recent call last):
  File "test4.py", line 2, in 
number = int(input("How many times should it be printed? "))
ValueError: invalid literal for int() with base 10: 'How many times
should it be printed? 3'


Can you please advise me how to resolve this issue?

  
When I correct for your missing newline, it works for me.  I don't know 
of any version of Python which would copy the prompt string into the 
result value of input or raw_input() function.


Try pasting the exact console session, rather than paraphrasing it.

DaveA



Re: [Tutor] fast sampling with replacement

2010-02-21 Thread Dave Angel



Luke Paireepinart wrote:

Can you explain what your function is doing and also post some test code to
profile it?

On Sat, Feb 20, 2010 at 10:22 AM, Andrew Fithian  wrote:

  

Hi tutor,

I'm have a statistical bootstrapping script that is bottlenecking on a
python function sample_with_replacement(). I wrote this function myself
because I couldn't find a similar function in python's random library. This
is the fastest version of the function I could come up with (I used
cProfile.run() to time every version I wrote) but it's not fast enough, can
you help me speed it up even more?

import random
def sample_with_replacement(list):
    l = len(list) # the sample needs to be as long as list
    r = xrange(l)
    _random = random.random
    # using list[int(_random()*l)] is faster than random.choice(list)
    return [list[int(_random()*l)] for i in r]

FWIW, my bootstrapping script is spending roughly half of the run time in
sample_with_replacement() much more than any other function or method.
Thanks in advance for any advice you can give me.

-Drew




list and l are poor names for locals.  The former because it's a 
built-in type, and the latter because it looks too much like a 1.


You don't say how big these lists are, but I'll assume they're large 
enough that the extra time spent creating the 'l' and 'r' variables is 
irrelevant.


I suspect you could gain some speed by using random.randrange instead of 
multiplying random.random by the length.


And depending on how the caller is using the data, you might gain some 
by returning a generator expression instead of a list.  Certainly you 
could reduce the memory footprint.


I wonder why you assume the output list has to be the same size as the 
input list.  Since you're sampling with replacement, you're not using 
the whole list anyway.  So I'd have defined the function to take a 
second argument, the length of desired array.  And if you could accept a 
generator instead of a list, you don't care how long it is, so let it be 
infinite.


(untested)
def sample(mylist):
    mylistlen = len(mylist)
    randrange = random.randrange
    while True:
        yield mylist[ randrange(0, mylistlen)]
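
Since the generator never ends, the caller decides how many items to take. A sketch of one way to do that with itertools.islice; the sample data is made up, and no timing claim is implied:

```python
import random
from itertools import islice


def sample(mylist):
    """Yield random elements of mylist forever (sampling with replacement)."""
    mylistlen = len(mylist)
    randrange = random.randrange
    while True:
        yield mylist[randrange(mylistlen)]


data = ['a', 'b', 'c', 'd']
# islice stops the infinite generator after the requested number of items.
resample = list(islice(sample(data), len(data)))
print(resample)
```

Whether this is actually faster than the list comprehension would need to be measured with the real data.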

DaveA



Re: [Tutor] Verifying My Troublesome Linkage Claim between Python and Win7

2010-02-23 Thread Dave Angel


Wayne Watson wrote:
A few days ago I posted a message titled ""Two" Card Monty. The 
problem I mentioned looks legitimate, and remains puzzling. I've 
probed this in a newsgroup, and no one has an explanation that fits.


My claim is that if one creates a program in a folder that reads a 
file in the folder it and then copies it to another folder, it will 
read  the data file in the first folder, and not a changed file in the 
new folder. I'd appreciate it if some w7 users could try the program 
below, and let me know what they find.  I'm using IDLE in Win7 with Py 
2.5.


My experience is that if one checks the properties of the copied file, 
it will point to the original py file and execute it and not the copy. 
If win7 is the culprit, I would think this is a somewhat  serious 
problem. It may be the sample program is not representative of the 
larger program that has me stuck. If necessary I can provide it. It 
uses common modules. (Could this be something like the namespace usage 
of variables that share a common value?)


# Test program. Examine strange link in Python under Win7
# when copying py file to another folder.
# Call the program vefifywin7.py
# To verify my situation use IDLE, save and run this program there.
# Put this program into a folder along with a data file
# called verify.txt. Create a single text line with a few characters in it

# Run this program and note the output
# Copy the program and txt file to another folder
# Change the contents of the txt file
# Run it again, and see if the output is the same as in the other folder
track_file = open("verify.txt")
aline = track_file.readline();
print aline
track_file.close()

I find your English is very confusing.  Instead of using so many 
pronouns with confusing antecedents, try being explicit.


>My claim is that if one creates a program in a folder that reads a 
file in the folder


Why not say that you created a program and a data file in the same 
folder, and had the program read the data file?


>...in the folder it and then copies it to another folder

That first 'it' makes no sense, and the second 'it' probably is meant to 
be "them".  And who is it that does this copying?  And using what method?


> ... it will read  the data file in the first folder

Who will read the data file?  The first program, the second, or maybe 
the operator?


About now, I have to give up.  I'm guessing that the last four lines of 
your message were intended to be the entire program, and that that same 
program is stored in two different folders, along with data files having 
the same name but different first lines.  When you run one of these 
programs it prints the wrong version of the line.


You have lots of variables here: Python version, program contents, Idle, 
Windows version.  Windows 7 doesn't do any mysterious "linking," so I'd 
stop making that your working hypothesis.  Your problem is most likely 
the value of current directory ( os.getcwd() ).  And that's set 
according to at least three different rules, depending on what program 
launches Python.  If you insist on using Idle to launch it, then you'll 
have to convince someone who uses Idle to tell you its quirks.   Most 
likely it has a separate menu for the starting directory than for the 
script name & location.  But if you're willing to use the command line, 
then I could probably help, once you get a clear statement of the 
problem.  By default, CMD.EXE uses the current directory as part of its 
prompt, and that's the current directory Python will start in.


But the first things to do are probably to print out the value of  
os.getcwd(), and to add a slightly different print in each version of 
the program so you know which one is running.


Incidentally, I'd avoid ever opening a data file in "the current 
directory."  If I felt it important to use the current directory as an 
implied parameter to the program, I'd save it in a string, and build the 
full path to the desired file using  os.path.join() or equivalent.
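
A minimal sketch of both suggestions, assuming Python 3. The helper name open_data_file() is hypothetical, and verify.txt is simply the file name used in the test program above:

```python
import os

def open_data_file(filename, base_dir):
    """Open filename inside base_dir using an explicitly built full path."""
    full_path = os.path.join(base_dir, filename)
    return open(full_path)

# Capture the current directory once, at startup, and report it,
# so there is no doubt about where a relative file name would resolve.
start_dir = os.getcwd()
print("current directory:", start_dir)
print("would read:", os.path.join(start_dir, "verify.txt"))
```

Printing the directory first shows which folder the program is really running in before any file is opened.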


DaveA



Re: [Tutor] Strange list behaviour in classes

2010-02-25 Thread Dave Angel

James Reynolds wrote:

Thank you! I think I have working in the right direction. I have one more
question related to this module.

I had to move everything to a single module, but what I would like to do is
have this class in a file by itself so I can call this from other modules.
when it was in separate modules it ran with all 0's in the output.

Here is the code in one module:

#import Statistics

class Statistics:
def __init__(self, *value_list):
self.value = value_list
self.square_list= []
 def mean(self, *value_list):
try :
ave = sum(self.value) / len(self.value)
except ZeroDivisionError:
ave = 0
return ave

def median(self, *value_list):
if len(self.value) <= 2:
n = self.mean(self.value)
elif len(self.value) % 2 == 1:
m = (len(self.value) - 1)/2
n = self.value[m+1]
else:
m = len(self.value) / 2
m = int(m)
n = (self.value[m-1] + self.value[m]) / 2
return n
 def variance(self, *value_list):
average = self.mean(*self.value)
for n in range(len(self.value)):
square = (self.value[n] - average)**2
self.square_list.append(square)
try:
var = sum(self.square_list) / len(self.square_list)
except ZeroDivisionError:
var = 0
return var

def stdev(self, *value_list):
var = self.variance(*self.value)
sdev = var**(1/2)
return sdev
 def zscore(self, x, *value_list):
average = self.mean(self.value)
sdev = self.stdev(self.value)
try:
z = (x - average) / sdev
except ZeroDivisionError:
z = 0
return z



a = [1,2,3,4,5,6,7,8,9,10]
stats = Statistics(*a)
mean = stats.mean(*a)
median = stats.median(*a)
var = stats.variance(*a)
stdev = stats.stdev(*a)
z = stats.zscore(5, *a)
print(mean, median, var, stdev, z)
print()



On Wed, Feb 24, 2010 at 7:33 PM, Alan Gauld wrote:

  

"James Reynolds"  wrote

 I understand, but if self.value is any number other then 0, then the "for"


will append to the square list, in which case square_list will always have
some len greater than 0 when "value" is greater than 0?

  

And if value does equal zero?

Actually I'm confused by value because you treat it as both an
integer and a collection in different places?


 Is this an occasion which is best suited for a try:, except statement? Or


should it, in general, but checked with "if's". Which is more expensive?

  

try/except is the Python way :-)


 def variance(self, *value_list):


  if self.value == 0:
   var = 0
  else:
average = self.mean(*self.value)
for n in range(len(self.value)):
 square = (self.value[n] - average)**2
 self.square_list.append(square)
   var = sum(self.square_list) / len(self.square_list)
   return var

  

--
Alan Gauld
Author of the Learn to Program web site
http://www.alan-g.me.uk/



The indentation in your code is lost when I look for it --  everything's 
butted up against the left margin except for a single space before def 
variance.  This makes it very hard to follow, so I've ignored the thread 
till now.  This may be caused by the mail digest logic, or it may be 
because you're posting online, and don't tell it to leave the code 
portion unformatted.  But either way, you should find a way to leave the 
code indented as Python would see it.  If you're posting by mail, be 
sure and send it as text.


But a few things I notice in your code:   You keep using the * notation 
on your formal parameters.  That's what turns a list into a tuple.  And 
you pass those lists into methods  (like median()) which already have 
access to the data in the object, which is very confusing.  If the 
caller actually passes something different there, he's going to be 
misled, since the argument is ignored.


Also, in method variance() you append to the self.square_list.  So if it 
gets called more than once, the list will continue to grow.  Since 
square_list is only referenced within the one method, why not just 
define it there, and remove it as an instance attribute?


If I were you, I'd remove the asterisk from both the __init__() method 
parameter, and from the caller in top-level code.  You're building a 
list, and passing it.  Why mess with turning it into multiple arguments, 
and then back to a tuple?   Then I'd remove the spurious arguments to 
mean(), variance(), stdev() and zscore().  There are a few other things, 
but this should make it cleaner.
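
One way those suggestions might look when applied, as a sketch assuming Python 3 (where / is true division). The median() method is left out to keep it short, and the attribute name values is my choice, not the original poster's:

```python
class Statistics:
    def __init__(self, values):
        # Take the list itself; no * packing or unpacking needed.
        self.values = list(values)

    def mean(self):
        try:
            return sum(self.values) / len(self.values)
        except ZeroDivisionError:
            return 0

    def variance(self):
        average = self.mean()
        # Local list, rebuilt on every call, so repeated calls agree.
        squares = [(v - average) ** 2 for v in self.values]
        try:
            return sum(squares) / len(squares)
        except ZeroDivisionError:
            return 0

    def stdev(self):
        return self.variance() ** 0.5

    def zscore(self, x):
        try:
            return (x - self.mean()) / self.stdev()
        except ZeroDivisionError:
            return 0

stats = Statistics([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
print(stats.mean(), stats.variance(), stats.stdev(), stats.zscore(5))
```

Since the methods take no data arguments, a caller cannot pass a different list and be misled about which data is used.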


DaveA



Re: [Tutor] Verifying My Troublesome Linkage Claim between Python and Win7

2010-02-27 Thread Dave Angel



Wayne Watson wrote:
Ok, I'm back after a three day trip. You are correct about the use of 
pronouns and a few misplaced words. I should have reread what I wrote. 
I had described this in better detail elsewhere, and followed that 
description with the request here probably thinking back to it.  I 
think I was getting a bit weary of trying to find an answer. Try this.



Folder1
   track1.py
  data1.txt
  data2.txt
  data3.txt

Folder2
   track1.py
   dset1.txt
   dset2.txt
   ...
   dset8.txt

So how do you know this is the structure?  If there really are shortcuts 
or symbolic links, why aren't you showing them?   Did you do a DIR from 
the command line, to see what's there?  Or are you looking in Explorer, 
which doesn't even show file extensions by default, and just guessing 
what's where?


data and dset files have the same record formats. track1.py was copied 
into  Folder2 with ctrl-c + ctrl-v. 


Those keys don't work from a command prompt.  From there, you'd use COPY 
or something similar.  So I have to guess you were in an Explorer 
window, pointing to Folder 1, and you selected the python file, and 
pressed Ctrl-C.  Then you navigated to Folder 2, and pressed Ctrl-V.  If 
you did,  Windows 7 wouldn't have created any kind of special file, any 
more than earlier ones did.  Chances are you actually did something 
else.  For example, you might have used a right-click drag, and answered 
"create shortcut" when it asked what you wanted to do.  Or perhaps you 
did a drag/drop with some Ctrl- or Alt-key modifier.


Anyway, you need to be more explicit about what you did.  If you had 
used a command prompt, you could at least have pasted the things you 
tried directly to your message, so we wouldn't have so much guessing to do.

When I run track1.py from folder1, it clearly has examined the 
data.txt files. 

And how are you running track1.py ?  And how do you really know that's 
what ran?  The code you posted would display a string, then the window 
would immediately go away, so you couldn't read it anyway.

If I run the copy of track1.py in folder2, it clearly operates on 
folder1 (one) data.txt files. This should not be.


If I look at the properties of track1.py in folder2 (two), it is 
pointing back to the program in folder1 (one).

Exactly what do you mean by "pointing back" ?  If you're getting icons 
in your Explorer view, is there a little arrow in the corner?  When you 
did the properties, did you see a tab labeled "shortcut" ?



I do not believe I've experienced this sort of linkage in any WinOS 
before. I believed I confirmed that the same behavior occurs using cmd 
prompt.


Shortcuts have been in Windows for at least 20 years.  But you still 
haven't given enough clues about what you're doing.

I'll now  head for Alan's reply.


Re: [Tutor] Verifying My Troublesome Linkage Claim between Python and Win7

2010-02-28 Thread Dave Angel



Wayne Watson wrote:



You tell us to "try this" and give a folder structure:

Folder1
 track1.py
 data1.txt
 data2.txt
 data3.txt
Folder2
 track1.py
 dset1.txt
 dset2.txt
 ...
 dset8.txt



Maybe one simple test at a time will get better responses.  Since you 
wouldn't tell me what tabs you saw in Explorer when you did properties, 
maybe you'll tell me what you see in CMD.


Go to a cmd prompt (DOS prompt), change to the Folder2 directory, and 
type dir.   paste that result, all of it, into a message.  I suspect 
you'll see that you don't have track1.py there at all, but track1.py.lnk


If so, that's a shortcut.  The only relevant change in Win7 that I know 
of is that Explorer shows shortcuts as "link" rather than "shortcut."
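
The same check can be made from Python, which sees real file names including the extensions Explorer hides. A small sketch, assuming Python 3; find_shortcuts() is a hypothetical helper name:

```python
import os

def find_shortcuts(folder):
    """Return the names in folder that end in .lnk (Windows shortcut files)."""
    return sorted(name for name in os.listdir(folder)
                  if name.lower().endswith(".lnk"))

# List any shortcut files in the current directory.
print(find_shortcuts(os.getcwd()))
```

If track1.py.lnk shows up in the result for Folder2, the "copy" is a shortcut rather than a Python file.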





Re: [Tutor] Verifying My Troublesome ...+Properties

2010-02-28 Thread Dave Angel

Wayne Watson wrote:
(I sent the msg below to Steven and the list a moment ago, since msgs 
going to the list with attachments either don't post or take lots of 
time to post, I'm sending both of you this copy.)


Steven, attached are three jpg files showing the properties of the two 
py files. The two files are identical in name, ReportingToolA.py,  and 
content, but are in folders .../Events2008_NovWW and .../events. Two 
of the jpg files are for the General Tab and Shortcut Tab of the py 
file in ../events. The other jpg is for the file in 
.../Events2008_NovWW, which has no shortcut tab. In previous 
descriptions, this is like:


Folder1 is Events2008_NovWW
Folder2 is events

I developed RT.py (ReportingToolA.py) in the .../Events2008_NovWW 
folder and copied it to ../events. The shortcut shows the events 
folder RT.py file is really in Events2008_NovWW


I have no idea why the RT.py file shows a shortcut. I just took a file 
called junk.jpg, and right-clicked on it. I selected Shortcut from the 
list, and it produced a file junk.jpg-shortcut. It is quite obvious 
the file name is different. If I select Copy instead, and paste the 
file into a folder called  Junk, there is no shortcut created. A drag 
and drop results in a move,and not a copy, so that's out of the picture.


I have no idea how the RT.py file ever got to be a shortcut.

As I said many messages ago, if your Properties dialog has a tab called 
Shortcut, then this is a shortcut file, not a python file.  I still 
don't know how you created it, but that's your "anomaly," not Windows 7, 
and certainly not Python.  Further, the name isn't  RT.py, since 
shortcuts have other extensions (such as .lnk) that Explorer hides from 
you, in its infinite "helpfulness."  It does give you several clues, 
however, such as the little arrow in the icon.  You can see that without 
even opening the properties window, but it's repeated in that window as 
well.


And Explorer is just a tool.  The command prompt should be your home 
base as a programmer.  When something goes wrong running a program 
either of the other ways, always check it at the command prompt, because 
every other tool has quirks it introduces into the equation.


My best guess on how you created that shortcut was by using Alt-Drag.  
As you point out, drag does a move by default, if it's on the same 
drive.  Ctrl-Drag will force a copy, even on the same drive.  And 
Shift-Drag will force a move, even if it's on a different drive.


These rules didn't change between XP and Windows 7, as far as I know, 
although in some places Explorer calls it "Link" instead of 
"Shortcut".   But that's just a self inconsistency.


DaveA



Re: [Tutor] List comprehension possible with condition statements?

2010-03-03 Thread Dave Angel

Jojo Mwebaze wrote:

Hi There,

i would like to implement the following in lists

assuming

x = 3
y = 4
z = None

i want to create a dynamic list such that

mylist = [ x , y, z ] ,   if z is not None

if z is None then

mylist = [x,y]

Anyhelp!

cheers

Jojo

  


Are there any constraints on x and y?  If you want to throw out all 
None values, then it's a straightforward problem.  You try it, and if it 
doesn't quite work, post the code. We'll try to help.


But if only the third value is special, then there's little point in 
making a comprehension of one value.  Just conditionally append the z 
value to the list containing x and y.
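
Both readings of the question can be sketched as follows, assuming Python 3. The function names build_list() and drop_nones() are hypothetical:

```python
def build_list(x, y, z):
    """Only the third value is special: append it when it is not None."""
    mylist = [x, y]
    if z is not None:
        mylist.append(z)
    return mylist

def drop_nones(*values):
    """Every value is treated alike: keep only those that are not None."""
    return [v for v in values if v is not None]

print(build_list(3, 4, None))   # [3, 4]
print(build_list(3, 4, 5))      # [3, 4, 5]
print(drop_nones(3, None, 4))   # [3, 4]
```

The test is "is not None" rather than truthiness, so a legitimate value such as 0 is kept.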


DaveA



Re: [Tutor] Bowing out

2010-03-03 Thread Dave Angel

Kent Johnson wrote:

Hi all,

After six years of tutor posts my interest and energy have waned and
I'm ready to move on to something new. I'm planning to stop reading
and contributing to the list. I have handed over list moderation
duties to Alan Gauld and Wesley Chun.

Thanks to everyone who contributes questions and answers. I learned a
lot from my participation here.

So long and keep coding!
Kent

  
I'm sorry to see you go as well.  I've learned an awful lot from your 
posts over the couple of years I've been here.


Thanks for all the efforts.
DaveA



Re: [Tutor] sorting algorithm

2010-03-03 Thread Dave Angel

C.T. Matsumoto wrote:

Hello,

This is follow up on a question I had about algorithms. In the thread 
it was suggested I make my own sorting algorithm.


Here are my results.

#!/usr/bin/python

def sort_(list_):
    for item1 in list_:
        pos1 = list_.index(item1)
        pos2 = pos1 + 1
        try:
            item2 = list_[pos2]
        except IndexError:
            pass

        if item1 >= item2:
            try:
                list_.pop(pos2)
                list_.insert(pos1, item2)
                return True
            except IndexError:
                pass

def mysorter(list_):
    while sort_(list_) is True:
        sort_(list_)

I found this to be a great exercise. In doing the exercise, I got 
pretty stuck. I consulted another programmer (my dad) who described 
how to go about sorting. As it turned out the description he described 
was the Bubble sort algorithm. Since coding the solution I know the 
Bubble sort is inefficient because of repeated iterations over the 
entire list. This shed light on the quick sort algorithm which I'd 
like to have a go at.


Something I haven't tried is sticking in really large lists. I was 
told that with really large list you break down the input list into 
smaller lists. Sort each list, then go back and use the same swapping 
procedure for each of the different lists. My question is, at what 
point do you start breaking things up? Is that based on list elements 
or is it based on memory(?) resources python is using?


One thing I'm not pleased about is the while loop and I'd like to 
replace it with a for loop.


Thanks,

T


There are lots of references on the web about Quicksort, including a 
video at:

http://www.youtube.com/watch?v=y_G9BkAm6B8

which I think illustrates it pretty well.  It would be a great learning 
exercise to implement Python code directly from that description, 
without using the sample C++ code available.


(Incidentally, there are lots of variants of Quicksort, so I'm not going 
to quibble about whether this is the "right" one to be called that.)


I don't know what your earlier thread was, since you don't mention the 
subject line, but there are a number of possible reasons you might not 
have wanted to use the built-in sort.  The best one is for educational 
purposes.  I've done my own sort for various reasons in the past, even 
though I had a library function, since the library function had some 
limits.  One time I recall, the situation was that the library sort was 
limited to 64k of total data, and I had to work with much larger arrays 
(this was in 16bit C++, in "large" model).  I solved the size problem by 
using the  C++ sort library on 16k subsets (because a pointer was 2*2 
bytes).  Then I merged the results of the sorts.  At the time, and in 
the circumstances involved, there were seldom more than a dozen or so 
sublists to merge, so this approach worked well enough.
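
The same sort-in-pieces-then-merge idea can be sketched in Python 3. This is an illustration only, not the C++ code described above; sort_in_chunks() and chunk_size are hypothetical names, and heapq.merge from the standard library does the merging:

```python
import heapq

def sort_in_chunks(items, chunk_size):
    """Sort items by sorting fixed-size chunks, then merging the sorted chunks."""
    chunks = [sorted(items[i:i + chunk_size])
              for i in range(0, len(items), chunk_size)]
    # heapq.merge combines already-sorted inputs into one sorted stream.
    return list(heapq.merge(*chunks))

print(sort_in_chunks([9, 4, 7, 1, 8, 2, 6, 3, 5], chunk_size=4))
# prints [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Each chunk stays within whatever size limit the library sort imposes, and the merge step only ever looks at the front of each sorted chunk.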


Generally, it's better for both your development time and the efficiency 
and reliability of the end code, to base a new sort mechanism on the 
existing one.  In my case above, I was replacing what amounted to an 
insertion sort, and achieved a 50x improvement for a real customer.  It 
was fast enough that other factors completely dominated his running time.


But for learning purposes?  Great plan.  So now I'll respond to your 
other questions, and comment on your present algorithm.


It would be useful to understand algorithmic complexity, the so-called 
order function.  In a bubble sort, if you double the size of the 
array, you quadruple the number of comparisons and swaps.  It's order 
N-squared, or O(n*n).   So what works well for an array of size 10 might 
take a very long time for an array of size 10,000 (like a million times 
as long).  You can do much better by sorting smaller lists, and then 
combining them together.  Such an algorithm can be O(n*log(n)).
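
The quadrupling can be checked by counting comparisons directly. A small sketch, assuming Python 3 and a plain textbook bubble sort (not the poster's code); bubble_sort_comparisons() is a hypothetical name:

```python
def bubble_sort_comparisons(n):
    """Count the comparisons a plain bubble sort makes on a reversed list of n items."""
    data = list(range(n, 0, -1))          # worst case: reverse order
    comparisons = 0
    for end in range(len(data) - 1, 0, -1):
        for i in range(end):
            comparisons += 1
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
    return comparisons

for n in (10, 20, 40):
    print(n, bubble_sort_comparisons(n))
# prints:
# 10 45
# 20 190
# 40 780
```

The count is n*(n-1)/2, so each doubling of n multiplies the work by a little more than four.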



You ask at what point you consider sublists?  In a language like C, the 
answer is when the list is size 3 or more.  For anything larger than 2, 
you divide into sublists, and work on them.
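
As an illustration of dividing into sublists once the size reaches 3, here is a merge sort sketch in Python 3. It is one of several divide-and-combine sorts and is not necessarily the variant meant above; merge_sort() is my naming:

```python
def merge_sort(items):
    """Return a new sorted list; lists of size 3 or more are split and merged."""
    if len(items) <= 1:
        return list(items)
    if len(items) == 2:
        a, b = items
        return [a, b] if a <= b else [b, a]
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])
    # Merge the two sorted halves by repeatedly taking the smaller front item.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

print(merge_sort([5, 2, 9, 1, 5, 6]))
# prints [1, 2, 5, 5, 6, 9]
```

In Python itself the built-in sort is the practical choice; a sketch like this is mainly useful for seeing where the n*log(n) behaviour comes from.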


Now, if I may comment on your code.  You're modifying a list while 
you're iterating through it in a for loop.  In the most general case, 
that's undefined.  I think it's safe in this case, but I would avoid it 
anyway, by just using xrange(len(list_)-1) to iterate through it.  You 
use the index function to find something you would already know -- the 
index function is slow.  And the first try/except isn't needed if you 
use a -1 in the xrange argument, as I do above.


You use pop() and insert() to exchange two adjacent items in the list.  
Both operations shift the remainder of the list, so they're rather slow.  
Since you're exchanging two items in the list, you can simply do that:

list_[pos1], list_[pos2] = list_[pos2], list_[pos1]

That also eliminates the need for the second try/except.
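
Putting those comments together, one pass of the sort might look like the sketch below. It assumes Python 3 (range instead of xrange), and it compares with > rather than the original >= so that equal neighbours are never swapped; bubble_pass() is a hypothetical name:

```python
def bubble_pass(list_):
    """Make one pass over list_, swapping adjacent out-of-order items.

    Returns True if anything was swapped, False if the list is in order.
    """
    swapped = False
    for pos1 in range(len(list_) - 1):    # stop one short, so pos2 is always valid
        pos2 = pos1 + 1
        if list_[pos1] > list_[pos2]:
            list_[pos1], list_[pos2] = list_[pos2], list_[pos1]
            swapped = True
    return swapped

def mysorter(list_):
    """Sort list_ in place by repeating passes until nothing moves."""
    while bubble_pass(list_):
        pass

data = [5, 2, 9, 1, 5, 6]
mysorter(data)
print(data)
# prints [1, 2, 5, 5, 6, 9]
```

Iterating over indexes means there is no call to index(), no try/except, and no change to the list's length while it is being looped over.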

You mention being bothered by the while loop.  You could replace it with 
a simple for loop with xrange(len(l

Re: [Tutor] Encoding

2010-03-03 Thread Dave Angel

Giorgio wrote:


 Depends on your python version. If you use python 2.x, you have to use a
  

u before the string:

s = u'Hallo World'




Ok. So, let's go back to my first question:

s = u'Hallo World' is unicode in python 2.x -> ok
s = 'Hallo World' how is encoded?

  

Since it's a quote literal in your source code, it's encoded by your 
text editor when it saves the file, and you tell Python which encoding 
it was by the second line of your source file, right after the shebang line.
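
A sketch of what such a source file can look like, assuming the editor really saves it as UTF-8. The block is written to run under Python 3, where the u prefix is accepted but optional and UTF-8 is already the default source encoding:

```python
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# The line above tells Python how the bytes of this source file are encoded.

text = u'Hallo Wörld'            # a text (unicode) string with one non-ASCII character
encoded = text.encode('utf-8')   # the same text as a sequence of bytes

print(len(text), len(encoded))   # the o-umlaut takes two bytes in UTF-8
print(encoded.decode('utf-8') == text)
```

If the declared encoding and the one the editor actually used disagree, non-ASCII literals come out wrong or the file fails to compile, which is why both need to be set together.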


A sequence of bytes in an html file should have its encoding 
identified by the tag at the top of the html file.  And I'd  *guess* 
that on a form result, the encoding can be assumed to match that of the 
html of the form itself.


DaveA



Re: [Tutor] sorting algorithm

2010-03-03 Thread Dave Angel
(You forgot to do a Reply-All, so your message went to just me, rather 
than to me and the list )




Re: [Tutor] lazy? vs not lazy? and yielding

2010-03-03 Thread Dave Angel

John wrote:

Hi,

I just read a few pages of tutorial on list comprehension and generator 
expression.  From what I gather the difference is "[ ]" and "( )" at the 
ends, better memory usage and something the tutorial labeled as "lazy 
evaluation".  So a generator 'yields'.  But what is it yielding to?  


John

  
A list comprehension builds a whole list at one time.  So if the list 
needed is large enough, building it may take a very long time, and you 
may run out of memory and crash.  A generator expression instead builds 
a generator object which *acts* like a list in a loop, but doesn't 
actually build the values till you ask for them.  So you can still do 
things like

   for item in  fakelist:

and it does what you'd expect.


You can write a generator yourself, and better understand what it's 
about.  Suppose you were trying to build a "list" of the squares of the 
integers between 3 and 15.  For a list of that size, you could just use 
a list comprehension.  But pretend it was much larger, and you couldn't 
spare the memory or the time.


So let's write a generator function by hand, deliberately the hard way.

def mygen():
    i = 3
    while i < 16:
        yield i*i
        i += 1
    return

This function is a generator, by virtue of that yield statement in it.  
When it's called, it does some extra magic to make it easy to construct 
a loop.


If you now use

for item in mygen():
    print item

Each time through the loop, it executes one more iteration of the 
mygen() function, up to the yield statement.  And the value that's put 
into item comes from the yield statement.


When the mygen() function returns (or falls off the end), it actually 
raises a special exception (StopIteration) that quietly terminates the for loop.
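
What the for loop does behind the scenes can be seen by driving the generator by hand with next(). A sketch assuming Python 3, with mygen() repeated so the block stands alone:

```python
def mygen():
    i = 3
    while i < 16:
        yield i * i
        i += 1

gen = mygen()          # calling it runs no code yet; it just creates the generator
print(next(gen))       # runs up to the first yield: 9
print(next(gen))       # resumes after the yield, runs to the next one: 16
rest = list(gen)       # pulls every remaining value: 25 up to 225

try:
    next(gen)          # nothing left, so the end-of-iteration exception is raised
except StopIteration:
    print("generator is exhausted")
```

A for loop makes these same next() calls and catches StopIteration itself, which is why the loop simply ends.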


Now, when we're doing simple expressions for a small number of values, 
we should use a list comprehension.  When it gets big enough, switch to 
a generator expression.  And if it gets complicated enough, switch to a 
generator function.  The point here is that the user of the for/loop 
doesn't care which way it was done.
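
To illustrate that last point, here are the three forms side by side, consumed by the same loop. A sketch assuming Python 3; total() and squares_func() are hypothetical names:

```python
def squares_func():
    """Generator function: yields the squares of 3 through 15 one at a time."""
    for i in range(3, 16):
        yield i * i

def total(values):
    """A consumer that only needs to loop; it never asks what kind of iterable it got."""
    result = 0
    for item in values:
        result += item
    return result

squares_list = [i * i for i in range(3, 16)]   # list comprehension: all values built now
squares_gen = (i * i for i in range(3, 16))    # generator expression: values built on demand

print(total(squares_list), total(squares_gen), total(squares_func()))
# prints 1235 1235 1235
```

One difference does show up afterwards: the list can be looped over again, while each generator is used up after a single pass.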


Sometimes you really need a list.  For example, you can't generally back 
up in a generator, or randomly access the [i] item.  But a generator is 
a very valuable mechanism to understand.


For a complex example, consider searching a hard disk for a particular 
file.  Building a complete list might take a long time, and use a lot of 
memory.  But if you use a generator inside a for loop, you can terminate 
(break) when you meet some condition, and the remainder of the files 
never had to be visited.  See os.walk()
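
A sketch of that search, assuming Python 3. find_first() is a hypothetical helper; os.walk() is the real standard-library generator, and it yields one directory at a time:

```python
import os

def find_first(top, filename):
    """Return the full path of the first file named filename under top, or None."""
    for dirpath, dirnames, filenames in os.walk(top):
        if filename in filenames:
            # Returning here abandons the walk; directories not yet
            # reached are never read from the disk.
            return os.path.join(dirpath, filename)
    return None

# Example call (path and name are placeholders):
# print(find_first(os.getcwd(), "verify.txt"))
```

Had os.walk() returned a complete list, the whole tree would have been scanned before the first comparison could be made.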


DaveA



Re: [Tutor] Encoding

2010-03-03 Thread Dave Angel
(Don't top-post.  Put your response below whatever you're responding to, 
or at the bottom.)


Giorgio wrote:

Ok.

So, how do you encode .py files? UTF-8?

2010/3/3 Dave Angel 

  
I personally use Komodo to edit my python source files, and tell it to 
use UTF8 encoding.  Then I add an encoding line as the second line of the 
file.  Many times I get lazy, because mostly my source doesn't contain 
non-ASCII characters.  But if I'm copying characters from an email or 
other Unicode source, then I make sure both are set up.  The editor will 
actually warn me if I try to save a file as ASCII with any 8 bit 
characters in it.
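
For illustration, the top of such a source file might look like the sketch below. It only works as intended if the editor really saves the file as UTF-8; the variable names are arbitrary.

```python
#!/usr/bin/env python
# -*- coding: utf-8 -*-

# With the declaration above, and the file actually saved as UTF-8,
# the accented letter in this unicode literal is read as one character.
greeting = u"ciao è ciao"
length = len(greeting)
```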


Note:  Unicode characters are 16 bits wide, at least in the usual (narrow) 
build of CPython 2.x.  UTF-8 is an 8 bit encoding of that Unicode, where 
there's a direct algorithm to turn 16 or even 32 bit Unicode into 8 bit 
characters.  They are not the same, although some people use the terms 
interchangeably.


Also note:  An 8 bit string  has no inherent meaning, until you decide 
how to decode it into Unicode.  Doing explicit decodes is much safer, 
rather than assuming some system defaults.  And if it happens to contain 
only 7 bit characters, it doesn't matter what encoding you specify when 
you decode it.  Which is why all of us have been so casual about this.
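
A short sketch of both points, written with Python 3 bytes literals (in Python 2.x the same calls work on ordinary 8 bit strings). The "it doesn't matter" part holds for ASCII-compatible encodings such as the ones used here; it would not hold for something like UTF-16.

```python
raw = b"ciao \xc3\xa8 ciao"          # bytes as they might arrive from a file

as_utf8 = raw.decode("utf-8")        # the right guess: 11 characters
as_latin1 = raw.decode("latin-1")    # the wrong guess: no error, 12 characters

plain = b"ciao ciao"                 # only 7 bit characters
same = (plain.decode("utf-8")
        == plain.decode("latin-1")
        == plain.decode("ascii"))
```

The wrong guess does not raise an error, it just produces different text, which is why being explicit about the encoding is the safer habit.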





Re: [Tutor] object representation

2010-03-04 Thread Dave Angel

spir wrote:

Hello,

In python like in most languages, I guess, objects (at least composite ones -- 
I don't know about ints, for instance -- someone knows?) are internally 
represented as associative arrays. Python associative arrays are dicts, which 
in turn are implemented as hash tables. Correct?
Does this mean that the associative arrays representing objects are implemented 
like python dicts, thus hash tables?

I was wondering about the question because I guess the constraints are quite 
different:
* Dict keys are of any type, including heterogeneous (mixed). Object keys are 
names, ie a subset of strings.
* Object keys are very stable, typically they hardly change; usually only 
values change. Dicts often are created empty and fed in a loop.
* Objects are small mappings: entries are explicitely written in code (*). 
Dicts can be of any size, only limited by memory; they are often fed by 
computation.
* In addition, dict keys can be variables, while object keys rarely are: they 
are literal constants (*).

So, I guess the best implementations for objects and dicts may be quite 
different. i wonder about alternatives for objects, in particuliar trie and 
variants: http://en.wikipedia.org/wiki/Trie, because they are specialised for 
associative arrays which keys are strings.

denis

PS: Would someone point me to typical hash funcs for string keys, and the one 
used in python?

  

http://effbot.org/zone/python-hash.htm

(*) Except for rare cases using setattr(obj,k,v) or obj.__dict__[k]=v.
  
Speaking without knowledge of the actual code implementing CPython (or 
for that matter any of the other dozen implementations), I can comment 
on my *model* of how Python "works."  Sometimes it's best not to know 
(or at least not to make use of) the details of a particular 
implementation, as your code is more likely to port readily to the next 
architecture, or even next Python version.


I figure every object has exactly three items in it:  a ref count, an 
implementation pointer, and a payload. The payload may vary between 4 
bytes for an int object, and about 4 megabytes for a list of size a 
million.  (And of course arbitrarily large for larger objects).


You can see that these add up to 12 bytes (in version 2.6.2 of CPython) 
for an int by using sys.getsizeof(92).  Note that if the payload refers 
to other objects, those sizes are not included in the getsizeof() 
function result.  So getsizeof(a list of strings) will not show the 
sizes of the strings, but only the list itself.
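
A quick way to see that for yourself, as a sketch. The exact byte counts depend on the Python version and platform, so none are shown; rough_total is a made-up helper that only goes one level deep.

```python
import sys

small = ["a"]
big = ["a" * 100000]

# Both lists hold exactly one reference, so the list objects
# themselves report the same size.
shallow_small = sys.getsizeof(small)
shallow_big = sys.getsizeof(big)

def rough_total(lst):
    """List size plus the sizes of the objects it refers to (one level)."""
    return sys.getsizeof(lst) + sum(sys.getsizeof(x) for x in lst)

total_big = rough_total(big)
```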


The payload for a simple object will contain just the raw data of the 
object.  So for a string, it'd contain the count and the bytes.  For 
compound objects that can change in size, it'd contain a pointer to a 
malloc'ed buffer that contains the variable-length data.  The object 
stays put, but the malloc'ed buffer may move as its size grows and 
shrinks.  getsizeof() is smart enough to report not only the object 
itself, but the buffer it references. Note that buffer is referenced by 
only one object, so its lifetime is intimately tied up with the object's.


The bytes in the payload are meaningless without the implementation 
pointer.   That implementation pointer will be the same for all 
instances of a particular type.  It points to a structure that defines a 
particular type (or class).  That structure for an empty class happens 
to be 452 bytes, but that doesn't matter much, as it only appears once 
per class.  The instance of an empty class is only 32 bytes.  Now, even 
that might seem a bit big, so Python offers the notion of slots, which 
reduces the size of each instance, at the cost of a little performance 
and a lot of flexibility.  Still, slots are important, because I suspect 
that's how built-ins are structured, to make the objects so small.
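
A minimal sketch of the slots trade-off, using two made-up classes. It shows the lost flexibility rather than the size savings, since the numbers vary by version.

```python
class Plain(object):
    def __init__(self):
        self.x = 1
        self.y = 2

class Slotted(object):
    __slots__ = ("x", "y")      # the only attributes an instance may have

    def __init__(self):
        self.x = 1
        self.y = 2

p = Plain()
s = Slotted()

p.z = 3                         # fine: goes into p.__dict__
try:
    s.z = 3                     # no slot named z, and no __dict__ to fall back on
    slotted_accepts_new_attr = True
except AttributeError:
    slotted_accepts_new_attr = False

has_dict = (hasattr(p, "__dict__"), hasattr(s, "__dict__"))
```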


Now, some objects, probably most of the built-ins, are not extensible.  
You can't add new methods, or alter the behavior much.  Other objects, 
such as instances of a class you write, are very flexible.  
I won't get into inheritance here, except to say that it can be tricky 
to derive new classes from built-in types.


So where do associative arrays come in?  One of the builtin types is a 
dictionary, and that is core to much of the workings of Python.  There 
are dictionaries in each class implementation (that 452 bytes I 
mentioned).  And there may be dictionaries in the instances themselves.  
There are two syntaxes to directly access these dictionaries, the "dot" 
notation and the bracket [] notation.  The former is a simple 
indirection through a special member called __dict__.
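
For an ordinary class you write yourself, the relationship between the dot notation and __dict__ can be sketched like this (Thing is just an example class):

```python
class Thing(object):
    kind = "example"            # lives in the class's dictionary

    def __init__(self):
        self.n = 1              # lives in the instance's dictionary

t = Thing()

t.__dict__["m"] = 2             # bracket notation on the instance dictionary
via_dot = t.m                   # the dot notation finds the same entry

instance_keys = sorted(t.__dict__)
class_has_kind = "kind" in Thing.__dict__
found_on_class = t.kind         # not in t.__dict__, so the class is consulted
```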


So the behavior of an object depends on its implementation pointer, 
which points to a  structure.  And parts of that structure ultimately 
point to C code which does all the actual work.  But most of the work is 
some double- or triple-indirection which ultimately calls code.




Re: [Tutor] Encoding

2010-03-04 Thread Dave Angel



Giorgio wrote:

2010/3/4 spir 


Ok,so you confirm that:

s = u"ciao è ciao" will use the file specified encoding, and that

t = "ciao è ciao"
t = unicode(t)

Will use, if not specified in the function, ASCII. It will ignore the
encoding I specified on the top of the file. right?

  
A literal  "u" string, and only such a (unicode) literal string, is 
affected by the encoding specification.  Once some bytes have been 
stored in an 8 bit string, the system does *not* keep track of where they 
came from, and any conversions then (even if they're on an adjacent 
line) will use the default decoder.  This is a logical example of what 
somebody said earlier on the thread -- decode any data to unicode as 
early as possible, and deal only with unicode strings in the program.  
Then, if necessary, encode them into whatever output form immediately 
before (or while) outputting them.
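
The "decode early, encode late" advice might look like the sketch below. It is written for Python 3, where the 8 bit data is a bytes object; the function name shout and the choice of encodings are assumptions for the example.

```python
import io

def shout(src, dst, in_encoding, out_encoding):
    # Decode as soon as the bytes come in ...
    text = src.read().decode(in_encoding)
    # ... work only with unicode text in the middle ...
    text = text.upper()
    # ... and encode just before the data leaves the program.
    dst.write(text.encode(out_encoding))

src = io.BytesIO(b"ciao \xe8 ciao")     # latin-1 encoded input
dst = io.BytesIO()
shout(src, dst, "latin-1", "utf-8")
result = dst.getvalue()
```

Because upper() runs on decoded text, the accented letter is uppercased correctly no matter which encoding it arrived in.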




Again, thankyou. I'm loving python and his community.

Giorgio




  



Re: [Tutor] object representation

2010-03-05 Thread Dave Angel

spir wrote:

On Thu, 04 Mar 2010 09:22:52 -0500
Dave Angel  wrote:

  
Still, slots are important, because I suspect 
that's how built-ins are structured, to make the objects so small.



Sure, one cannot alter their structure. Not even of a direct instance of object:
  

o = object()
o.n=1


Traceback (most recent call last):
  File "", line 1, in 
AttributeError: 'object' object has no attribute 'n'

  
Now, some objects, probably most of the built-ins, are not extensible.  
You can't add new methods, or alter the behavior much.



This applies to any attr, not only methods, also plain "information":
  

s = "abc"
s.n=1


Traceback (most recent call last):
  File "", line 1, in 
AttributeError: 'str' object has no attribute 'n'


  
Other objects, 
such as instances of a class you write, are totally and very flexible.



Conceptually this is equivalent to having no __slots__ slot. Or maybe they could 
be implemented using structs (which values would be pointers), instead of 
dicts. A struct is like a fixed record, as opposed to a dict. What do you 
think? On the implementation side, this would be much simpler, lighter, and 
more efficient.
Oh, this gives me an idea... (to implement so-called "value objects").

Denis
  
Having not played much with slots, my model is quite weak there.  But I 
figure the dictionary is in the implementation structure, along with a 
flag saying that it's readonly.  Each item of such a dictionary would be 
an index into the fixed table in the object.  Like a struct, as you say, 
except that in C, there's no need to know the names of the fields at run 
time.
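
One way to poke at this from Python itself, as a sketch for CPython: each name in __slots__ shows up as an entry in the class's dictionary, and that entry knows how to read and write the matching field of an instance. Point is a made-up class.

```python
class Point(object):
    __slots__ = ("x", "y")

# The class dictionary has one entry per slot name.
slot_entry = Point.__dict__["x"]
entry_type = type(slot_entry).__name__   # "member_descriptor" in CPython

pt = Point()
slot_entry.__set__(pt, 10)               # same effect as pt.x = 10
value = slot_entry.__get__(pt, Point)    # same effect as reading pt.x
```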


DaveA



Re: [Tutor] Encoding

2010-03-05 Thread Dave Angel

Giorgio wrote:


Ok,so you confirm that:

s = u"ciao è ciao" will use the file specified encoding, and that

t = "ciao è ciao"
t = unicode(t)

Will use, if not specified in the function, ASCII. It will ignore the
encoding I specified on the top of the file. right?



  

A literal  "u" string, and only such a (unicode) literal string, is
affected by the encoding specification.  Once some bytes have been stored in
a 8 bit string, the system does *not* keep track of where they came from,
and any conversions then (even if they're on an adjacent line) will use the
default decoder.  This is a logical example of what somebody said earlier on
the thread -- decode any data to unicode as early as possible, and deal only
with unicode strings in the program.  Then, if necessary, encode them into
whatever output form immediately before (or while) outputting them.





 Ok Dave, What i don't understand is why:

s = u"ciao è ciao" is converting a string to unicode, decoding it from the
specified encoding but

t = "ciao è ciao"
t = unicode(t)

That should do exactly the same instead of using the specified encoding
always assume that if i'm not telling the function what the encoding is, i'm
using ASCII.

Is this a bug?

Giorgio
  
In other words, you don't understand my paragraph above.  Once the 
string is stored in t as an 8 bit string, it's irrelevant what the 
source file encoding was.  If you then (whether it's in the next line, 
or ten thousand calls later) try to convert to unicode without 
specifying a decoder, it uses the default encoder, which is an 
application-wide thing, and not a source file thing.  To see what it is 
on your system, use sys.getdefaultencoding().


There's an encoding specified or implied for each source file of an 
application, and they need not be the same.  It affects string literals 
that come from that particular file. It does not affect any other 
conversions, as far as I know.  For that matter, many of those source 
files may not even exist any more by the time the application is run.


There are also encodings attached to each file object, I believe, though 
I've got no experience with that.  So sys.stdout would have an encoding 
defined, and any unicode strings passed to it would be converted using 
that specification.
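
A sketch of a file object with its own encoding, written for Python 3's io module (io.TextIOWrapper also exists in 2.6 and later). The in-memory buffer stands in for a real file.

```python
import io
import sys

# What the console stream claims to use; this can be None when
# output is redirected or replaced.
console_encoding = getattr(sys.stdout, "encoding", None)

# A text layer with an explicit encoding on top of a byte stream.
buf = io.BytesIO()
out = io.TextIOWrapper(buf, encoding="latin-1")
out.write(u"ciao \u00e8 ciao")
out.flush()
written = buf.getvalue()       # the bytes that actually left the program
```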


The point is that there isn't just one global value, and it's a good 
thing.  You should figure everywhere characters come into  your program 
(eg. source files, raw_input, file i/o...) and everywhere characters go 
out of your program, and deal with each of them individually.  Don't 
store anything internally as strings, and you won't create the ambiguity 
you have with your 't' variable above.


DaveA


Re: [Tutor] Bowing out

2010-03-05 Thread Dave Kuhlman
On Wed, Mar 03, 2010 at 08:17:45AM -0500, Kent Johnson wrote:
> Hi all,
> 
> After six years of tutor posts my interest and energy have waned and
> I'm ready to move on to something new. I'm planning to stop reading
> and contributing to the list. I have handed over list moderation
> duties to Alan Gauld and Wesley Chun.
> 
> Thanks to everyone who contributes questions and answers. I learned a
> lot from my participation here.
> 
> So long and keep coding!

Thank you Kent, for all you've done for those who came to this list
for help.

I admire your ability and knowledge about Python.  And, I
appreciate the huge amount of effort you've put into helping us.

- Dave


-- 
Dave Kuhlman
http://www.rexx.com/~dkuhlman


Re: [Tutor] Encoding

2010-03-05 Thread Dave Angel

Giorgio wrote:

2010/3/5 Dave Angel 
  

In other words, you don't understand my paragraph above.




Maybe. But please don't be angry. I'm here to learn, and as i've run into a
very difficult concept I want to fully undestand it.


  
I'm not angry, and I'm sorry if I seemed angry.  Tone of voice is hard 
to convey in a text message.

Once the string is stored in t as an 8 bit string, it's irrelevant what the
source file encoding was.




Ok, you've said this 2 times, but, please, can you tell me why? I think
that's the key passage to understand how encoding of strings works. The
source file encoding affects all file lines, also strings.

Nope, not strings.  It only affects string literals.

 If my encoding is
UTF8 python will read the string "ciao è ciao" as 'ciao \xc3\xa8 ciao' but
if it's latin1 it will read 'ciao \xe8 ciao'. So, how can it be irrelevant?

I think the problem is that i can't find any difference between 2 lines
quoted above:

s = u"ciao è ciao"

and

t = "ciao è ciao"
c = unicode(t)

[**  I took the liberty of making the variable names different so I can refer 
to them **]
  
I'm still not sure whether your confusion is about what the rules are, or 
why the rules were made that way.  The rules are that an unqualified 
conversion, such as the unicode() function with no second argument, uses 
the default encoding, in strict mode.  Thus the error.
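
The same rule can be sketched with an explicit decode. This uses Python 3 bytes literals, and spells out "ascii" because that is the 2.x default being discussed (Python 3's own default is different).

```python
t = b"ciao \xc3\xa8 ciao"      # the 8 bit string, as UTF-8 bytes

# Strict mode: the first non-ASCII byte raises an error.
try:
    c = t.decode("ascii")
    failed = False
except UnicodeDecodeError as err:
    failed = True
    bad_position = err.start   # index of the first offending byte

# Naming the real encoding removes the guesswork.
c = t.decode("utf-8")

# A non-strict mode does not fail, but it loses information.
lenient = t.decode("ascii", "replace")
```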


Quoting the help:
"If no optional parameters are given, unicode() will mimic the behaviour 
of str() except that it returns Unicode strings instead of 8-bit 
strings. More precisely, if object is a Unicode string or subclass it 
will return that Unicode string without any additional decoding applied.

For objects which provide a __unicode__() method, it will call 
this method without arguments to create a Unicode string. For all other 
objects, the 8-bit string version or representation is requested and 
then converted to a Unicode string using the codec for the default 
encoding in 'strict' mode."

As for why the rules are that, I'd have to ask you what you'd prefer.  
The unicode() function has no idea that t was created from a literal 
(and no idea what source file that literal was in), so it has to pick 
some coding, called the default coding.  The designers decided to use a 
default encoding of ASCII, because manipulating ASCII strings is always 
safe, while many functions won't behave as expected when given UTF-8 
encoded strings.  For example, what's the 7th character of t ?  That is 
not necessarily the same as the 7th character of s, since one or more of 
the characters in between might have taken up multiple bytes in t.  That 
is exactly what happens to your accented character if the source file is 
UTF-8 (though not if it's latin1), and it's certainly the case for most 
other languages as well.
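
To make the "7th character" question concrete, here is a sketch that builds the 8 bit versions by hand (Python 3 syntax; slices are used on the bytes so the result is again bytes). Under UTF-8 the è takes two bytes, under latin1 only one.

```python
s = u"ciao \u00e8 ciao"           # the unicode string
t = s.encode("utf-8")             # what a UTF-8 source file would store
t_latin1 = s.encode("latin-1")    # what a latin1 source file would store

seventh_of_s = s[6]               # the space after the accented letter
seventh_of_t = t[6:7]             # second half of the two-byte e-grave
seventh_latin1 = t_latin1[6:7]    # one byte per character, so a space again

lengths = (len(s), len(t), len(t_latin1))
```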

If you then (whether it's in the next line, or ten thousand calls later)
try to convert to unicode without specifying a decoder, it uses the default
encoder, which is a application wide thing, and not a source file thing.  To
see what it is on your system, use sys.getdefaultencoding().




And this is ok. Spir said that it uses ASCII, you now say that it uses the
default encoder. I think that ASCII on spir's system is the default encoder
so.


  
I don't know, but I think it's the default in every country, at least on 
version 2.6.  It might make sense to get some value from the OS that 
defined the locally preferred encoding, but then a program that worked 
fine in one locale might fail miserably in another.

The point is that there isn't just one global value, and it's a good thing.
 You should figure everywhere characters come into  your program (eg. source
files, raw_input, file i/o...) and everywhere characters go out of your
program, and deal with each of them individually.




Ok. But it always happen this way. I hardly ever have to work with strings
defined in the file.

  
Not sure what you mean by "the file."  If you mean the source file, 
that's what your examples are about.   If you mean a data file, that's 
dealt with differently.
  

Don't store anything internally as strings, and you won't create the
ambiguity you have with your 't' variable above.

DaveA




Thankyou Dave

Giorgio



  




Re: [Tutor] Encoding

2010-03-07 Thread Dave Angel

Giorgio wrote:

2010/3/7 spir 

  
One more question: Amazon SimpleDB only accepts UTF8.


So, let's say i have to put into an image file:

  
Do you mean a binary file with image data, such as a jpeg?  In that 
case, an emphatic - NO.  not even close.

filestream = file.read()
filetoput = filestream.encode('utf-8')

Do you think this is ok?

Oh, of course everything url-encoded then

Giorgio


  
Encoding binary data with utf-8 wouldn't make any sense, even if you did 
have the right semantics for a text file. 

Next problem, 'file' is a built-in type, not a file object.  So if you 
write what you describe, you're trying to call a non-static method with 
a class object, which will error.



Those two lines don't make any sense by themselves.  Show us some 
context, and we can more sensibly comment on them.  And try not to use 
names that hide built-ins, or Python stdlib names.


If you're trying to store binary data in a repository that only permits 
text, it's not enough to pretend to convert it to UTF-8.  You need to do 
some other escaping, such as UUENCODE, that transforms the binary data 
into something resembling text.  Then you may or may not need to encode 
that text with utf-8, depending on its character set.



DaveA



Re: [Tutor] Encoding

2010-03-07 Thread Dave Angel

Giorgio wrote:

2010/3/7 Dave Angel 

  

Those two lines don't make any sense by themselves.  Show us some context,
and we can more sensibly comment on them.  And try not to use names that
hide built-in keywords, or Python stdlib names.




Hi Dave,

I'm considering Amazon SimpleDB as an alternative to PGSQL, but i need to
store blobs.

Amazon's FAQs says that:

"Q: What kind of data can I store?
You can store any UTF-8 string data in Amazon SimpleDB. Please refer
to the Amazon
Web Services Customer Agreement <http://aws.amazon.com/agreement> for
details."

This is the problem. Any idea?


  

DaveA




Giorgio



  
You still didn't provide the full context.  Are you trying to do store 
binary data, or not?


Assuming you are, you could do the UUENCODE suggestion I made.  Or use 
base64:


base64.encodestring(s)   will turn binary data into (larger) binary 
data, also considered a string.  The latter is ASCII, so it's irrelevant 
whether it's considered utf-8 or otherwise.  You store the resulting 
string in your database, and use  base64.decodestring(s) to reconstruct 
your original.
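
A sketch of the round trip. Note that encodestring() and decodestring() are the 2.x spellings and were removed in Python 3.9; b64encode() and b64decode() are available in both lines, so they are used here. The blob is only a stand-in for real image data.

```python
import base64

blob = bytes(bytearray(range(256)))        # every possible byte value once

# Binary data -> ASCII-only text that a UTF-8-only store can hold.
stored = base64.b64encode(blob).decode("ascii")

# ... and back again when the data is read out of the database.
restored = base64.b64decode(stored.encode("ascii"))
```

The stored form is roughly a third larger than the original, which is the price of squeezing arbitrary bytes into text.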


There's 50 other ways, some more efficient, but this may be the simplest.

DaveA




Re: [Tutor] Communicate between a thread and the main program

2010-03-08 Thread Dave Angel

Plato P.B. wrote:

Hi all,
I have created a script in which i need to implement the communication
between the main program and a thread.
The thread looks for any newly created files in a particular directory. It
will be stored in a variable in the thread function. I want to get that name
from the main program.
How can i do it?

Thanks in Advance. :D
  
Don't store it in "a variable in the thread function," but in an 
instance attribute of the thread object.


Then the main program simply checks the object's attribute.  Since it 
launched the thread(s), it should know its (their) instances.  This way, 
the solution scales up as you add more threads, with different 
functionality.
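
A minimal sketch of that arrangement, assuming the thread simply polls the directory. The class name DirWatcher, the attribute names, and the polling interval are all made up for the example.

```python
import os
import tempfile
import threading
import time

class DirWatcher(threading.Thread):
    """Polls a directory and remembers the newest file name it has seen."""

    def __init__(self, path):
        threading.Thread.__init__(self)
        self.path = path
        self.seen = set(os.listdir(path))
        self.newest = None                     # the main program reads this
        self.stop_requested = threading.Event()

    def run(self):
        while not self.stop_requested.is_set():
            current = set(os.listdir(self.path))
            added = current - self.seen
            if added:
                self.newest = sorted(added)[-1]
                self.seen = current
            self.stop_requested.wait(0.05)     # short pause between polls

# The main program keeps the instance, so it can check the attribute.
watch_dir = tempfile.mkdtemp()
watcher = DirWatcher(watch_dir)
watcher.start()

open(os.path.join(watch_dir, "new.txt"), "w").close()

deadline = time.time() + 5
while watcher.newest is None and time.time() < deadline:
    time.sleep(0.01)

watcher.stop_requested.set()
watcher.join()
found = watcher.newest
```

With several watchers, the main program would just keep a list of instances and look at each one's newest attribute.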


DaveA



Re: [Tutor] sorting algorithm

2010-03-11 Thread Dave Angel

C.T. Matsumoto wrote:

Dave Angel wrote:
(You forgot to do a Reply-All, so your message went to just me, 
rather than to me and the list )



C.T. Matsumoto wrote:

Dave Angel wrote:

C.T. Matsumoto wrote:

Hello,

This is follow up on a question I had about algorithms. In the 
thread it was suggested I make my own sorting algorithm.


Here are my results.

#!/usr/bin/python

def sort_(list_):
    for item1 in list_:
        pos1 = list_.index(item1)
        pos2 = pos1 + 1
        try:
            item2 = list_[pos2]
        except IndexError:
            pass

        if item1 >= item2:
            try:
                list_.pop(pos2)
                list_.insert(pos1, item2)
                return True
            except IndexError:
                pass

def mysorter(list_):
    while sort_(list_) is True:
        sort_(list_)

I found this to be a great exercise. In doing the exercise, I got 
pretty stuck. I consulted another programmer (my dad) who 
described how to go about sorting. As it turned out the 
description he described was the Bubble sort algorithm. Since 
coding the solution I know the Bubble sort is inefficient because 
of repeated iterations over the entire list. This shed light on 
the quick sort algorithm which I'd like to have a go at.


Something I haven't tried is sticking in really large lists. I was 
told that with really large list you break down the input list 
into smaller lists. Sort each list, then go back and use the same 
swapping procedure for each of the different lists. My question 
is, at what point to you start breaking things up? Is that based 
on list elements or is it based on memory(?) resources python is 
using?


One thing I'm not pleased about is the while loop and I'd like to 
replace it with a for loop.


Thanks,

T


There are lots of references on the web about Quicksort, including 
a video at:

http://www.youtube.com/watch?v=y_G9BkAm6B8

which I think illustrates it pretty well.  It would be a great 
learning exercise to implement Python code directly from that 
description, without using the sample C++ code available.


(Incidentally, there are lots of variants of Quicksort, so I'm not 
going to quibble about whether this is the "right" one to be called 
that.)


I don't know what your earlier thread was, since you don't mention 
the subject line, but there are a number of possible reasons you 
might not have wanted to use the built-in sort.  The best one is 
for educational purposes.  I've done my own sort for various 
reasons in the past, even though I had a library function, since 
the library function had some limits.  One time I recall, the 
situation was that the library sort was limited to 64k of total 
data, and I had to work with much larger arrays (this was in 16bit 
C++, in "large" model).  I solved the size problem by using the  
C++ sort library on 16k subsets (because a pointer was 2*2 bytes).  
Then I merged the results of the sorts.  At the time, and in the 
circumstances involved, there were seldom more than a dozen or so 
sublists to merge, so this approach worked well enough.


Generally, it's better for both your development time and the 
efficiency and reliability of the end code, to base a new sort 
mechanism on the existing one.  In my case above, I was replacing 
what amounted to an insertion sort, and achieved a 50* improvement 
for a real customer.  It was fast enough that other factors 
completely dominated his running time.


But for learning purposes?  Great plan.  So now I'll respond to 
your other questions, and comment on your present algorithm.


It would be useful to understand about algorithmic complexity, the 
so called Order Function.  In a bubble sort, if you double the size 
of the array, you quadruple the number of comparisons and swaps.  
It's order N-squared or O(n*n).   So what works well for an array 
of size 10 might take a very long time for an array of size 10000 
(like a million times as long).  You can do much better by sorting 
smaller lists, and then combining them together.  Such an algorithm 
can  be O(n*log(n)).
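
One well-known algorithm of that "sort smaller lists, then combine" kind is merge sort. The sketch below is only an illustration of the idea, not the code from the story above, and the function names are arbitrary.

```python
def merge(left, right):
    """Combine two already-sorted lists into one sorted list."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    # One of the two lists is used up; the rest of the other is already sorted.
    result.extend(left[i:])
    result.extend(right[j:])
    return result

def merge_sort(items):
    """Return a new sorted list, leaving items untouched."""
    if len(items) < 2:
        return list(items)
    mid = len(items) // 2
    return merge(merge_sort(items[:mid]), merge_sort(items[mid:]))
```

Each level of splitting touches every item once, and there are about log(n) levels, which is where the n*log(n) comes from.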



You ask at what point you consider sublists?  In a language like C, 
the answer is when the list is size 3 or more.  For anything larger 
than 2, you divide into sublists, and work on them.


Now, if I may comment on your code.  You're modifying a list while 
you're iterating through it in a for loop.  In the most general 
case, that's undefined.  I think it's safe in this case, but I 
would avoid it anyway, by just using xrange(len(list_)-1) to 
iterate through it.  You use the index function to find something 
you would already know -- the index function is slow.  And the 
first try/except isn't needed if you use a -1 in the xrange 
argument, as I do above.
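
Putting those suggestions together, a bubble sort might be sketched as below. This is one possible version, not the only one: it iterates over indexes, swaps adjacent items in place with tuple assignment, and shrinks the range as the tail of the list settles. It uses range() so it runs on Python 3; on 2.x, xrange() would be the natural choice.

```python
def bubble_sort(list_):
    """Sort list_ in place by repeatedly swapping adjacent items."""
    limit = len(list_) - 1          # last index that can start a pair
    while limit > 0:
        last_swap = 0
        for pos in range(limit):
            if list_[pos] > list_[pos + 1]:
                # Exchange the two neighbours without pop() or insert().
                list_[pos], list_[pos + 1] = list_[pos + 1], list_[pos]
                last_swap = pos
        # Everything after the last swap is already in its final place.
        limit = last_swap
```

On a list that is already sorted, the first pass makes no swaps, so the loop ends after a single pass.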


You use pop() and insert() to exchange two adjacent items in the 
list.  Both operations copy the remainder of the list, so they're 
rath

Re: [Tutor] sorting algorithm

2010-03-12 Thread Dave Angel

C.T. Matsumoto wrote:

I've change the code and I think I have what you were talking about.

def mysort(list_):
    for i in xrange(0, len(list_)):
        pos = i
        for j in xrange(pos+1, len(list_)):
            if list_[i] > list_[j]:
                pos = j
                list_[i], list_[j] = list_[j], list_[i]

I finally started to think that the while couldn't remain. But if I 
look at this the thing that I don't get is the 'xrange(pos+1, 
len(list_))' snippet. What confused me was how did a new position get 
passed xrange(), when I do not see where it that was happening. Is 
'pos' a reference to the original pos in the xrange snippet?


T

That loop is not what I was describing, but I think it's nearly 
equivalent in performance.  My loop was always swapping adjacent items, 
and it adjusted the ending limit as the data gets closer to sorted.  
This one adjusts the beginning value (pos) of the inner loop, as the 
data gets more sorted.  For some orderings, such as if the data is 
already fully sorted, my approach would  be much faster.


Your outer loop basically finds the smallest remaining item on each 
pass and swaps it into position i.  Note that the expression 
xrange(pos+1, len(list_)) is evaluated only once, when the inner for 
statement starts, and at that moment pos is equal to i.  So the inner 
loop always runs from i+1 to the end of the list, and the line pos=j has 
no effect on it.  The items before i are already in their final places 
from the previous passes, so they need not be compared again.
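
To see when the inner loop's bounds are computed, here is a small sketch. The helper inner_indices is invented for the experiment, and range() is used so it runs on Python 3 (xrange() behaves the same way on 2.x in this respect).

```python
def inner_indices(i, n):
    """Record which j values an inner loop of this shape visits."""
    visited = []
    pos = i
    for j in range(pos + 1, n):    # the bounds are fixed right here
        visited.append(j)
        pos = j + 100              # reassigning pos changes nothing
    return visited

def mysort(list_):
    """The sort from the message, with the unused pos variable removed."""
    for i in range(len(list_)):
        for j in range(i + 1, len(list_)):
            if list_[i] > list_[j]:
                list_[i], list_[j] = list_[j], list_[i]
```

inner_indices(2, 6) visits 3, 4 and 5 no matter what is assigned to pos inside the loop.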


I'm going to be quite busy for the next couple of days.  So if I don't 
respond to your next post quickly, please be patient.


DaveA



Re: [Tutor] Efficiency and speed

2010-03-20 Thread Dave Angel
(Please don't top-post.  It ruins the context for anyone else trying to 
follow it.  Post your remarks at the end, or immediately after whatever 
you're commenting on.)


James Reynolds wrote:

Here's another idea I had. I thought this would be slower than the
previous algorithm because it has another for loop and another while loop. I
read that the overhead of such loops is high, so I have been trying to avoid
using them where possible.

def mcrange_gen(self, sample):
    nx2 = self.nx1
    for q in sample:
        for a in nx2:
            while a > q:
                pass
            yield a
            break


On Fri, Mar 19, 2010 at 3:15 PM, Alan Gauld wrote:

  

While loops and for loops are not slow, it's the algorithm that you're 
using that's slow. If a while loop is the best way to do the best 
algorithm, then it's fast.  Anyway, in addition to for and while, other 
"slow" approaches are find() and "in".


But slowest of all is a loop that never terminates, like the while loop 
in this example.  And once you fix that, the break is another problem, 
since it means you'll never do more than one value from sample.



In your original example, you seemed to be calling a bunch of methods 
that are each probably a single python statement.  I didn't respond to 
those, because I couldn't really figure what you were trying to do with 
them.  But now I'll comment in general terms.


Perhaps you should do something like:

zip together the original list with a range list, so you now have a list 
of tuples.  Then sort that new list.  Now loop through that sorted list 
of tuples, and look up your bucket for each item.   That should be fast 
because  they're in order, and  you have the index to the original 
value, so you can store the bucket number somewhere useful.
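
Since I don't know what your buckets really look like, the sketch below assumes they are described by a sorted list of upper limits. The function name assign_buckets and that representation are my assumptions, not something from your code.

```python
def assign_buckets(values, boundaries):
    """Return a bucket number for each value, in the original order.

    boundaries is assumed to be a sorted list of upper limits; a value
    larger than every limit goes into the last bucket, len(boundaries).
    """
    # Pair each value with its original index, then sort by value.
    order = sorted(zip(values, range(len(values))))

    result = [None] * len(values)
    bucket = 0
    for value, index in order:
        # The values arrive in increasing order, so the bucket number
        # only ever moves forward; no searching from the start each time.
        while bucket < len(boundaries) and value > boundaries[bucket]:
            bucket += 1
        result[index] = bucket
    return result
```

For example, assign_buckets([5, 1, 9, 3], [2, 4, 6]) puts 1 in bucket 0, 3 in bucket 1, 5 in bucket 2 and 9 in bucket 3, reported in the order of the original list.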


HTH,
DaveA



Re: [Tutor] python magazine

2010-03-27 Thread Dave Angel



Lowell Tackett wrote:
From the virtual desk of Lowell Tackett  



--- On Fri, 3/26/10, Benno Lang  wrote:

From: Benno Lang 
Subject: Re: [Tutor] python magazine
To: "Lowell Tackett" 
Cc: tutor@python.org, "Bala subramanian" 
Date: Friday, March 26, 2010, 8:38 PM

On 27 March 2010 00:33, Lowell Tackett  wrote:
  

The Python Magazine people have now got a Twitter site--which includes a 
perhaps [telling] misspelling.


Obviously that's why they're looking for a chief editor - maybe it's
even a deliberate ploy.

I'm not sure if this affects others, but to me your replies appear
inside the quoted section of your mail, rather than beneath it. Would
you mind writing plain text emails to avoid this issue?

Thanks,
benno

Like this...?


  
No, there's still a problem.  You'll notice in this message that there 
are ">" symbols in front of your lines and benno's, and ">>" symbols in 
front of Lowell's.  (Some email readers will turn the > into vertical 
bar, but the effect is the same).  Your email program should be adding 
those upon a reply, so that your own message has one less > than the one 
to which you're replying.  Then everyone reading can see who wrote what, 
based on how many ">" or bars precede the respective lines.  Quotes from 
older messages have more of them.


Are you using "Reply-All" in your email program?  Or are you 
constructing a new message with copy/paste?


What email are you using?  Maybe it's a configuration setting somebody 
could help with.


DaveA



Re: [Tutor] inter-module global variable

2010-03-28 Thread Dave Angel

spir # wrote:

Hello,

I have a main module importing other modules and defining a top-level variable, 
call it 'w' [1]. I naively thought that the code from an imported module, when 
called from main, would know about w, but I have name errors. The initial trial 
looks as follows (this is just a sketch, the original is too big and 
complicated):

# imported "code" module
__all__ = ["NameLookup", "Literal", "Assignment", ...]

# main module
from parser import parser
from code import *
from scope import Scope, World
w = World()

This pattern failed as said above. So, I tried to "export" w:

# imported "code" module
__all__ = ["NameLookup", "Literal", "Assignment", ...]

# main module
from parser import parser
from scope import Scope, World
w = World()
import code #new
code.w = w  ### "export"
from code import *

And this works. I had the impression that the alteration of the "code" module object would not 
propagate to objects imported from "code". But it works. But I find this terribly unclear, fragile, 
and dangerous, for any reason. (I find this "dark", in fact ;-)
Would someone try to explain what actually happens in such case?
Also, why is a global variable not actually global, but in fact only "locally" 
global (at the module level)?
It's the first time I meet such an issue. What's wrong in my design to raise 
such a problem, if any?

My view is as follows: from the transparency point of view (as with function transparency), the 
classes in "code" should _receive_ a pointer to 'w' as a general parameter before they do 
anything. In other words, the whole "code" module is like a python code chunk 
parameterized with w. If it were a program, it would get w as a command-line parameter, or from 
the user, or from a config file.
Then, all instantiations should be done using this pointer to w. Meaning, as a 
consequence, all code objects should hold a reference to 'w'. This could be 
done as follows:

# main module
import code
code.Code.w = w
from code import *

# "code" module
class Code(object):
    w = None ### to be exported from importing module
    def __init__(self, w=Code.w):
        # the param allows having a different w eg for testing
        self.w = w
# for each kind of code things
class CodeThing(Code):
    def __init__(self, args):
        Code.__init__(self)
        ... use args ...
    def do(self, args):
        ... use args and self.w ...

But the '###' line looks like  an ugly trick to me. (Not the fact that it's a 
class attribute; as a contrary, I often use them eg for config, and find them a 
nice tool for clarity.) The issue is that Code.w has to be exported.
Also, this scheme is heavy (all these pointers in every living object.) 
Actually, code objects could read Code.w directly but this does not change much 
(and I lose transparency).
It's hard for me to be lucid on this topic. Is there a pythonic way?


Denis

[1] The app is a kind of interpreter for a custom language. Imported modules 
define classes for objects representing elements of code (literal, assignment, 
...). Such objects are instantiated from parse tree nodes (conceptually, they 
*are* code nodes). 'w' is a kind of global scope -- say the state of the 
running program. Indeed, most code objects need to read/write in w.
Any comments on this model are welcome. I have little knowledge of language 
implementation.


vit esse estrany ☣

spir.wikidot.com

  
The word 'global' is indeed unfortunate for those coming to python from 
other languages. In Python, it just means global to a single module. 
If code in other modules needs to access your 'global variable', it 
normally needs to have it passed in.


If you really need a program-global value, then create a new module just 
for the purpose, and define it there. Your main program can initialize 
it, other modules can access it in the usual way, and everybody's happy. 
In general, you want import and initialization to happen in a 
non-recursive way. So an imported module should not look back at you for 
values. If you want it to know about a value, pass it, or assign it for 
them.
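A minimal sketch of the dedicated-module idea, written for Python 3. The module name shared_state and both helper functions are made up for illustration. The module is built in memory with types.ModuleType only so that the example runs as a single script; a real program would have a shared_state.py file instead.

```python
import sys
import types

# Stand-in for a file named shared_state.py (hypothetical name), built in
# memory so that this sketch runs as a single script.
shared_state = types.ModuleType("shared_state")
shared_state.world = None
sys.modules["shared_state"] = shared_state


def main_setup():
    # What the main script would do once, at startup.
    import shared_state
    shared_state.world = {"x": 1}


def code_module_lookup(name):
    # What any other module would do: import the module, read its attribute.
    import shared_state
    return shared_state.world[name]


main_setup()
print(code_module_lookup("x"))
```

Because every reader goes through the module object each time, they all see whatever the main program last stored there.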


But Python does not have pointers. And you're using pointer terminology. 
Without specifying the type of w, you give us no clue whether you're 
setting yourself up for failure. For example, the first time somebody 
does a w= newvalue they have broken the connection with other module's w 
variable. If the object is mutable (such as a list), and somebody 
changes it by using w.append() or w[4] = newvalue, then no problem.
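The difference between rebinding and mutating can be seen in a few lines. This is only a sketch: the name alias stands in for another module's reference to the same object.

```python
w = [1, 2, 3]
alias = w                          # a second name for the same list

w.append(4)                        # mutation: visible through both names
same_after_mutation = alias is w

w = [9, 9]                         # rebinding: w now names a different object
same_after_rebinding = alias is w  # alias still names the old list

print(same_after_mutation, same_after_rebinding, alias)
```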


You have defined a class attribute w, and an instance attribute w, and a 
module variable w in your main script. Do these values all want to stay 
in synch as you change values? Or is it a constant that's just set up 
once? Or some combination, where existing objects want the original 
value, but new ones created after you change it will themselves get the 
value at the time of creation? You can get any of these b

Re: [Tutor] python magazine

2010-03-28 Thread Dave Angel

Lowell Tackett wrote:
From the virtual desk of Lowell Tackett




--- On Sat, 3/27/10, Dave Angel  wrote:

  

From: Dave Angel 
Subject: Re: [Tutor] python magazine
To: "Lowell Tackett" 
Cc: "Benno Lang" , tutor@python.org
Date: Saturday, March 27, 2010, 6:12 AM


Lowell Tackett wrote:

From the virtual desk of Lowell Tackett

--- On Fri, 3/26/10, Benno Lang wrote:

From: Benno Lang
Subject: Re: [Tutor] python magazine
To: "Lowell Tackett"
Cc: tutor@python.org, "Bala subramanian"
Date: Friday, March 26, 2010, 8:38 PM

On 27 March 2010 00:33, Lowell Tackett wrote:

The Python Magazine people have now got a Twitter site--which includes a
perhaps [telling] misspelling.

Obviously that's why they're looking for a chief editor - maybe it's
even a deliberate ploy.

I'm not sure if this affects others, but to me your replies appear
inside the quoted section of your mail, rather than beneath it. Would
you mind writing plain text emails to avoid this issue?

Thanks,
benno

Like this...?

  

No, there's still a problem.  You'll notice in this
message that there are ">" symbols in front of your lines
and benno's, and ">>" symbols in front of
Lowell's.  (Some email readers will turn the > into
vertical bar, but the effect is the same).  Your email
program should be adding those upon a reply, so that your
own message has one less > than the one to which you're
replying.  Then everyone reading can see who wrote
what, based on how many ">" or bars precede the
respective lines.  Quotes from older messages have more
of them.

Are you using "Reply-All" in your email program?  Or
are you constructing a new message with copy/paste?

What email are you using?  Maybe it's a configuration
setting somebody could help with.

DaveA




Don't really know what I'm doing wrong (or right).  Just using the [email] tools that 
have been made available to me thru Yahoo mail and Firefox.  I began this text below your 
submission and "signature", and I'm using plain text, as suggested by a 
previous comment.  Don't know what else I could embellish this effort with.


  
This time it worked great.  You can see my comments at outermost level, 
with yours indented by one, and my previous one indented two, etc.


DaveA


Re: [Tutor] inter-module global variable

2010-03-28 Thread Dave Angel

spir ☣ wrote:

On Sun, 28 Mar 2010 21:50:46 +1100
Steven D'Aprano  wrote:

  

On Sun, 28 Mar 2010 08:31:57 pm spir ☣ wrote:


I'm going to assume you really want a single global value, and that you 
won't regret that assumption later.


We talked at length about how to access that global from everywhere that 
cares, and my favorite way is with a globals module. And it should be 
assigned something like:


globals.py:
class SomeClass (object):
    def

def init(parameters):
    global world
    world = SomeClass(parameters, moreparameters)

Then main can do the following:
import globals
globals.init(argv-stuff)

And other modules can then do
import globals
and refer to it as globals.world

And they'll all see the same world variable. Nobody should have their 
own, but just import it if needed.





Re: [Tutor] commands

2010-03-28 Thread Dave Angel

Shurui Liu (Aaron Liu) wrote:

# Translate wrong British words

#Create an empty file
print "\nReading characters from the file."
raw_input("Press enter then we can move on:")
text_file = open("storyBrit.txt", "r+")
whole_thing = text_file.read()
print whole_thing
raw_input("Press enter then we can move on:")
print "\nWe are gonna find out the wrong British words."
corrections = {'colour':'color', 'analyse':'analyze',
'memorise':'memorize', 'centre':'center', 'recognise':'recognize',
'honour':'honor'}
texto = whole_thing
for a in corrections:
    texto = texto.replace(a, corrections[a])
print texto

# Press enter and change the wrong words
if "colour" in whole_thing:
    print "The wrong word is 'colour' and the right word is 'color'"
if "analyse" in whole_thing:
    print "the wrong word is 'analyse' and the right word is 'analyze'"
if "memorise" in whole_thing:
    print "the wrong word is 'memorise' and the right word is 'memorize'"
if "centre" in whole_thing:
    print "the wrong word is 'centre' and the right word is 'center'"
if "recognise" in whole_thing:
    print "the wrong word is 'recognise' and the right word is 'recognize'"
if "honour" in whole_thing:
    print "the wrong word is 'honour' and the right word is 'honor'"

# We are gonna save the right answer to storyAmer.txt
w = open('storyAmer.txt', 'w')
w.write('I am really glad that I took CSET 1100.')
w.write('\n')
w.write('We get to analyse all sorts of real-world problems.\n')
w.write('\n')
w.write('We also have to memorize some programming language syntax.')
w.write('\n')
w.write('But, the center of our focus is game programming and it is fun.')
w.write('\n')
w.write('Our instructor adds color to his lectures that make them interesting.')
w.write('\n')
w.write('It is an honor to be part of this class!')
w = open("assign19/storyAmer.txt", "w")

w.close()



This is what I have done, I don't understand why this program cannot
fix "analyse".

  
You do some work in texto, and never write it to the output file.  
Instead you write stuff you hard-coded in literal strings in your program.


And you never fixed the mode field of the first open() function, as 
someone hinted at you.   And you don't specify the output file location, 
but just assume it's in the current directory.  For that matter, your 
open of the input file assumes it's in the current directory as well.  
But your assignment specified where both files would/should be.
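A rough sketch of what writing the corrected text could look like, in Python 3. The function name translate_file and the temporary demo directory are made up for illustration. The real input and output locations are whatever the assignment specifies.

```python
import os
import tempfile

CORRECTIONS = {
    "colour": "color", "analyse": "analyze", "memorise": "memorize",
    "centre": "center", "recognise": "recognize", "honour": "honor",
}


def translate_file(src_path, dst_path, corrections=CORRECTIONS):
    """Read src_path, apply every correction, write the result to dst_path."""
    with open(src_path, "r") as src:      # read-only is enough for the input
        text = src.read()
    for british, american in corrections.items():
        text = text.replace(british, american)
    with open(dst_path, "w") as dst:
        dst.write(text)                   # write the corrected text itself
    return text


# Demonstration with throwaway files; real paths would come from the assignment.
demo_dir = tempfile.mkdtemp()
brit_path = os.path.join(demo_dir, "storyBrit.txt")
amer_path = os.path.join(demo_dir, "storyAmer.txt")
with open(brit_path, "w") as handle:
    handle.write("We get to analyse all sorts of problems in full colour.\n")

print(translate_file(brit_path, amer_path))
```

The point of the sketch is that the string being written is the one that went through replace(), not a set of literals typed into the program.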



DaveA



Re: [Tutor] what's wrong in my command?

2010-04-01 Thread Dave Angel

Shurui Liu (Aaron Liu) wrote:

# geek_translator3.py

# Pickle
import pickle

  
This is where you told it to load import.py.   Normally, that just 
quietly loads the standard module included with your system.



When I run it, the system gave me the feedback below:
Traceback (most recent call last):
  File "geek_translator3.py", line 4, in 
import pickle
  File "/usr/local/lib/python2.5/pickle.py", line 13, in 

AttributeError: 'module' object has no attribute 'dump'

I don't understand, I don't write anything about pickle.py, why it mentioned?
what's wrong with "import pickle"? I read many examples online whose
has "import pickle", they all run very well.
Thank you!

  
I don't have 2.5 any more, so I can't look at the same file you 
presumably have.  And line numbers will most likely be different in 
2.6.  In particular, there are lots of module comments at the beginning 
of my version of pickle.py.  You should take a look at yours, and see 
what's in line 13.   My guess it's a reference to the dump() function 
which may be defined in the same file.  Perhaps in 2.5 it was defined 
elsewhere.


Most common cause for something like this would be that pickle imports 
some module, and you have a module by that name in your current 
directory (or elsewhere on the sys.path).  So pickle gets an error after 
importing it, trying to use a global attribute that's not there.
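One way to check for this kind of shadowing, sketched in Python 3. The helper find_shadowing_files and the list of names passed to it are illustrative assumptions, not a complete list of what pickle imports.

```python
import os
import pickle
import tempfile

# Where did the interpreter actually load pickle from?
print(pickle.__file__)


def find_shadowing_files(module_names, directory):
    """Return paths of .py files in directory named like the given modules."""
    hits = []
    for name in module_names:
        candidate = os.path.join(directory, name + ".py")
        if os.path.exists(candidate):
            hits.append(candidate)
    return hits


# Demonstration: a throwaway directory containing a stray marshal.py.
demo_dir = tempfile.mkdtemp()
open(os.path.join(demo_dir, "marshal.py"), "w").close()
print(find_shadowing_files(["marshal", "struct"], demo_dir))
```

Pointing the helper at the script's own directory would show whether a local file is hiding a standard module of the same name.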


Wild guess - do you have a file called marshal.py in your own code?

DaveA



Re: [Tutor] what's wrong in my command?

2010-04-01 Thread Dave Angel

Shurui Liu (Aaron Liu) wrote:

OK, can you tell me import.py is empty or not? If it's not an empty
document, what's its content?

  
(Please don't top-post,  Add your comments after what you're quoting, or 
at the end)


That was a typo in my message.  I should have said  pickle.py,  not 
import.py.  When you import pickle, you're telling it to find and load 
pickle.py.  That's python source code, and it will generally import 
other modules.  I was suspecting module.py.  But you should start by 
looking at line 13 of pickle.py










  




Re: [Tutor] constructor

2010-04-04 Thread Dave Angel



Shurui Liu (Aaron Liu) wrote:

I am studying about how to create a constructor in a Python program, I
don't really understand why the program print out "A new critter has
been born!" and "Hi.  I'm an instance of class Critter." twice. I
guess is because "crit1 = Critter() crit2 = Critter()"  But I
don't understand how did computer understand the difference between
crit1 and crit2? cause both of them are equal to Critter(). Thank you!

# Constructor Critter
# Demonstrates constructors

class Critter(object):
    """A virtual pet"""
    def __init__(self):
        print "A new critter has been born!"

    def talk(self):
        print "\nHi.  I'm an instance of class Critter."

# main
crit1 = Critter()
crit2 = Critter()

crit1.talk()
crit2.talk()

raw_input("\n\nPress the enter key to exit.")


  

Critter is a class, not a function.  So the syntax
  crit1 = Critter()

is not calling a "Critter" function but constructing an instance of the 
Critter class.  You can tell that by doing something like

 print crit1
 print crit2

Notice that although both objects have the same type (or class), they 
have different ID values.


Since you supply an __init__() method in the class, that's called during 
construction of each object.  So you see that it executes twice.


Classes start to get interesting once you have instance attributes, so 
that each instance has its own "personality."  You can add attributes 
after the fact, or you can define them in __init__().  Simplest example 
could be:


crit1.name = "Spot"
crit2.name = "Fido"

Then you can do something like
 print crit1.name
 print crit2.name

and you'll see they really are different.
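The other option mentioned above, defining the attribute in __init__(), might look like this in Python 3. It is a sketch only: the name parameter and the changed talk() text are additions for illustration.

```python
class Critter(object):
    """A virtual pet"""

    def __init__(self, name):
        # Each instance stores its own name.
        self.name = name

    def talk(self):
        return "Hi.  I'm %s, an instance of class Critter." % self.name


crit1 = Critter("Spot")
crit2 = Critter("Fido")

print(crit1.talk())
print(crit2.talk())
print(crit1 is crit2)   # two separate objects of the same class
```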

DaveA





Re: [Tutor] Matching zipcode in address file

2010-04-04 Thread Dave Angel

Alan Gauld wrote:


"TGW"  wrote


I got the program functioning with
lines = [line for line in infile if line[149:154] not in match_zips]

But this matches records that do NOT match zipcodes. How do I get 
this  running so that it matches zips?



Take out the word 'not' from the comprehension?

That's one change.  But more fundamental is to change the file I/O.  
Since there's no seek() operation, the file continues wherever it left 
off the previous time.


I'd suggest reading the data from the match_zips into a list, and if the 
format isn't correct, doing some post-processing on it.  But there's no 
way to advise on that since we weren't given the format of either file.


zipdata = match_zips.readlines()
Then you can do an  if XXX in zipdata with assurance.
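Since the file formats weren't given, the following Python 3 sketch assumes one zip code per line in the zip file and keeps the 149:154 column slice from the original comprehension. The function names are made up. Note that readlines() leaves the trailing newline on each entry, which is the sort of post-processing meant above.

```python
def load_zips(zip_lines):
    """Strip whitespace and newlines, drop blank lines, return a set."""
    return set(line.strip() for line in zip_lines if line.strip())


def matching_records(records, zips, start=149, end=154):
    """Keep the records whose zip-code columns appear in zips."""
    return [record for record in records if record[start:end] in zips]


# Demonstration with made-up data in place of the two files.
zips = load_zips(["12345\n", "\n", "54321\n"])
records = [
    "x" * 149 + "12345" + " kept\n",
    "y" * 149 + "99999" + " dropped\n",
]
kept = matching_records(records, zips)
print(len(kept))
```

With real files, zip_lines would be the result of readlines() on the zip file, read once before the address file is scanned.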

DaveA



Re: [Tutor] Extracting lines in a file

2010-04-06 Thread Dave Angel

ranjan das wrote:

Hi,


I am new to python, and specially to file handling.

I need to write a program which reads a unique string in a file and
corresponding to the unique string, extracts/reads the n-th line (from the
line in which the unique string occurs).

I say 'n-th line' as I seek a generalized way of doing it.

For instance lets say the unique string is "documentation" (and
"documentation" occurs more than once in the file). Now, on each instance
that the string "documentation" occurs in the file,  I want to read the 25th
line (from the line in which the string "documentation" occurs)

Is there a goto kind of function in python?

Kindly help

  
You can randomly access within an open file with the seek() function.  
However, if the lines are variable length, somebody would have to keep 
track of where each one begins, which is rather a pain.  Possibly worse, 
on Windows, if you've opened the file in text mode, you can't just count 
the characters you get, since 0d0a is converted to 0a before you get 
it.  You can still do it with a combination of seek() and tell(), however.


Three easier possibilities, if any of them applies:

1) If the lines are fixed in size, then just randomly access using 
seek() before the read.


2) If the file isn't terribly big, read it into a list with readlines(), 
and randomly access the list.


3) If the file is organized in "records" (groups of lines), then read 
and process a record at a time, rather than a line at a time.  A record 
might be 30 lines, and if you found something on the first line of the 
record, you want to modify the 26th line (that's your +25).  Anyway, 
it's possible to make a wrapper around file so that you can iterate 
through records, rather than lines.
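Possibility 2 can be sketched as follows in Python 3, assuming the whole file fits in memory. The function name lines_after_match is made up, and the small list stands in for the result of readlines().

```python
def lines_after_match(lines, needle, offset):
    """For every line containing needle, collect the line offset lines later."""
    found = []
    for index, line in enumerate(lines):
        if needle in line:
            target = index + offset
            if target < len(lines):      # ignore matches too close to the end
                found.append(lines[target])
    return found


# Demonstration with a small list standing in for f.readlines().
sample = ["documentation a", "1", "2", "documentation b", "3", "4"]
print(lines_after_match(sample, "documentation", 2))
```

For the case in the question, offset would be 25 and lines would come from the real file.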


HTH
DaveA



Re: [Tutor] Sequences of letter

2010-04-12 Thread Dave Angel

Or more readably:

from string import lowercase as letters
for c1 in letters:
    for c2 in letters:
        for c3 in letters:
            print c1+c2+c3
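The three nested loops can also be collapsed with itertools.product. This is a Python 3 sketch (where the constant is called ascii_lowercase), and the function name letter_sequences is made up.

```python
from itertools import product
from string import ascii_lowercase


def letter_sequences(length=3):
    """Return every string of the given length over a-z, in order."""
    return ["".join(combo) for combo in product(ascii_lowercase, repeat=length)]


words = letter_sequences()
print(words[0], words[1], words[-1], len(words))
```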


Yashwin Kanchan wrote:

Hi Juan

Hope you have got the correct picture now...

I just wanted to show you another way of doing the above thing in just 4
lines.

for i in range(65,91):
    for j in range(65,91):
        for k in range(65,91):
            print chr(i)+chr(j)+chr(k),


On 12 April 2010 06:12, Juan Jose Del Toro  wrote:

  

Dear List;

I have embarked myself into learning Python, I have no programming
background other than some Shell scripts and modifying some programs in
Basic and PHP, but now I want to be able to program.

I have been reading Alan Gauld's Tutor which has been very useful and I've
also been watching Bucky Roberts (thenewboston) videos on youtube (I get
lost there quite often but have also been helpful).

So I started with an exercise to do sequences of letters, I want to write a
program that could print out the sequence of letters from "aaa" all the way
to "zzz"  like this:
aaa
aab
aac
...
zzx
zzy
zzz

So far this is what I have:
letras =
["a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","x","y","z"]
letra1 = 0
letra2 = 0
letra3 = 0
for i in letras:
for j in letras:
for k in letras:
print letras[letra1]+letras[letra2]+letras[letra3]
letra3=letra3+1
letra2=letra2+1
letra1=letra1+1

It goes all the way to aaz and then it gives me this error
Traceback (most recent call last):
 File "/home/administrador/programacion/python/letras2.py", line 8, in

print letras[letra1]+letras[letra2]+letras[letra3]
IndexError: list index out of range
Script terminated.

Am I even in the right path?
I guess I should look over creating a function or something like that
because when I run it I can't even use my computer no memory left

--
¡Saludos! / Greetings!
Juan José Del Toro M.
jdeltoro1...@gmail.com
Guadalajara, Jalisco MEXICO




Re: [Tutor] Move all files to top-level directory

2010-04-12 Thread Dave Angel

Dotan Cohen wrote:

On 12 April 2010 20:12, Sander Sweers  wrote:
  

On 12 April 2010 18:28, Dotan Cohen  wrote:


However, it fails like this:
$ ./moveUp.py
Traceback (most recent call last):
 File "./moveUp.py", line 8, in 
   os.rename(f, currentDir)
TypeError: coercing to Unicode: need string or buffer, tuple found
  

os.rename needs the oldname and the new name of the file. os.walk
returns a tuple with 3 values and it errors out.




I see, thanks. So I was sending it four values apparently. I did not
understand the error message.

  
No, you're sending it two values:  a tuple, and a string.  It wants two 
strings.  Thus the error. If you had sent it four values, you'd have 
gotten a different error.


Actually, I will add a check that cwd !=HOME || $HOME/.bin as those
are the only likely places it might run by accident. Or maybe I'll
wrap it in Qt and add a confirm button.


  

os.walk returns you a tuple with the following values:
(the root folder, the folders in the root, the files in the root folder).

You can use tuple unpacking to split each one in separate values for
your loop. Like:

for root, folder, files in os.walk('your path'):
  #do stuff




I did see that while googling, but did not understand it. Nice!


  

Judging from your next message, you still don't understand it.

It might be wise to only have this module print what it would do
instead of doing the actual move/rename so you can work out the bugs
first before it destroys your data.




I am testing on fake data, naturally.

  
Is your entire file system fake?  Perhaps you're running in a VM, and 
don't mind trashing it.


While debugging, you're much better off using prints than really moving 
files around.  You might be amazed how much damage a couple of minor 
bugs could cause.
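A dry run along those lines could look like this in Python 3. The function name plan_moves is made up, and the demonstration uses a throwaway directory tree so that nothing real is touched.

```python
import os
import tempfile


def plan_moves(top):
    """Return (source, destination) pairs without touching any file."""
    moves = []
    for root, folders, files in os.walk(top):
        if root == top:
            continue  # files already at the top level stay where they are
        for name in files:
            moves.append((os.path.join(root, name), os.path.join(top, name)))
    return moves


# Demonstration on a throwaway directory tree.
demo_top = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_top, "a", "b"))
for relative in (("a", "one.txt"), ("a", "b", "two.txt")):
    open(os.path.join(demo_top, *relative), "w").close()

for source, destination in sorted(plan_moves(demo_top)):
    print("would move %s -> %s" % (source, destination))
```

Only once the printed plan looks right would the print be swapped for the actual rename.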


DaveA



Re: [Tutor] Move all files to top-level directory

2010-04-12 Thread Dave Angel



Dotan Cohen wrote:

All right, I have gotten quite a bit closer, but Python is now
complaining about the directory not being empty:

✈dcl:test$ cat moveUp.py
#!/usr/bin/python
# -*- coding: utf-8 -*-
import os
currentDir = os.getcwd()

filesList = os.walk(currentDir)
for root, folder, file in filesList:
  

Why is the print below commented out?

    for f in file:
        toMove = root + "/" + f
        #print toMove
        os.rename(toMove, currentDir)

✈dcl:test$ ./moveUp.py
Traceback (most recent call last):
  File "./moveUp.py", line 11, in 
os.rename(toMove, currentDir)
OSError: [Errno 39] Directory not empty


I am aware that the directory is not empty, nor should it be! How can
I override this?

Thanks!

  
Have you looked at the value of "currentDir"? Is it in a form that's 
acceptable to os.rename()? And how about toMove? Perhaps it has two 
slashes in a row in it. When combining directory paths, it's generally 
safer to use


os.path.join()

Next, you make no check whether "root" is the same as "currentDir". So 
if there are any files already in the top-level directory, you're trying 
to rename them to themselves.


I would also point out that your variable names are very confusing. 
"file" is a list of files, so why isn't it plural? Likewise "folders."


DaveA




Re: [Tutor] Move all files to top-level directory

2010-04-13 Thread Dave Angel

Dotan Cohen wrote:

Here is the revised version:

#!/usr/bin/python
# -*- coding: utf-8 -*-
import os
currentDir = os.getcwd()
i = 1
filesList = os.walk(currentDir)
for rootDirs, folders, files in filesList:
  
Actually the first item in the tuple (returned by os.walk) is singular (a 
string), so I might call it rootDir.  Only the other two needed to be 
changed to plural to indicate that they were lists.

    for f in files:
        if (rootDirs!=currentDir):
            toMove  = os.path.join(rootDirs, f)
            print "--- "+str(i)
            print toMove
            newFilename = os.path.join(currentDir,f)
            renameNumber = 1
            while(os.path.exists(newFilename)):
                print "- "+newFilename
                newFilename = os.path.join(currentDir,f)+"_"+str(renameNumber)
                renameNumber = renameNumber+1
            print newFilename
            i=i+1
            os.rename(toMove, newFilename)

Now, features to add:
1) Remove empty directories. I think that os.removedirs will work here.
2) Prevent race conditions by performing the filename check during
write. For that I need to find a function that fails to write when the
file exists.
3) Confirmation button to prevent accidental runs in $HOME for
instance. Maybe add some other sanity checks. If anybody is still
reading, I would love to know what sanity checks would be wise to
perform.

Again, thanks to all who have helped.


  
Note that it's not just race conditions that can cause collisions.  You 
might have the same name in two distinct subdirectories, so they'll end 
up in the same place.  Which one wins depends on the OS you're running, 
I believe.
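For the "function that fails to write when the file exists" asked about above, one candidate is os.open() with the O_CREAT and O_EXCL flags, which refuses to create a file that is already there. The Python 3 sketch below uses it to reserve a unique destination name. The function claim_unique_name and the "_N" suffix scheme (borrowed from the posted script) are illustrative, not a tested recipe.

```python
import errno
import os
import tempfile


def claim_unique_name(directory, name):
    """Create an empty placeholder under a free name and return its path."""
    candidate = os.path.join(directory, name)
    counter = 1
    while True:
        try:
            # O_EXCL makes the call fail if the file already exists.
            fd = os.open(candidate, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except OSError as exc:
            if exc.errno != errno.EEXIST:
                raise
            candidate = os.path.join(directory, "%s_%d" % (name, counter))
            counter += 1
        else:
            os.close(fd)
            return candidate


# Demonstration: the same name requested twice from two subdirectories' files.
demo_dir = tempfile.mkdtemp()
first = claim_unique_name(demo_dir, "notes.txt")
second = claim_unique_name(demo_dir, "notes.txt")
print(os.path.basename(first), os.path.basename(second))
```

The moved file would then be renamed over the placeholder. On POSIX systems os.rename() replaces an existing destination file, while on Windows it raises an error, so os.replace() (Python 3.3 and later) is the portable call for that last step.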


DaveA


Re: [Tutor] Loop comparison

2010-04-16 Thread Dave Angel



Christian Witts wrote:

Ark wrote:

Hi everyone.
A friend of mine suggested me to do the next experiment in python and 
Java.


It's a simple program to sum all the numbers from 0 to 10.

result = i = 0
while i < 10:
    result += i
    i += 1
print result

The time for this calculations was huge.  It took a long time to give
the result.  But, the corresponding program in Java takes less than 1
second to end.  And if in Java, we make a simple type check per cycle,
it does not take more than 10 seconds in the same machine.  I was not
expecting Python to be faster than Java, but it's too slow.  Maybe
Java optimizes this case and Python doesn't.  Not sure about this.

Thanks
ark

  
Different methods and their relative benchmarks.  The last two 
functions are shortcuts for what you are trying to do, the last 
function 't5' corrects the mis-calculation 't4' has with odd numbers.
Remember, if you know a better way to do something you can always 
optimize yourself ;)


>>> def t1(upper_bounds):
...   start = time.time()
...   total = sum((x for x in xrange(upper_bounds)))
...   end = time.time()
...   print 'Time taken: %s' % (end - start)
...   print total
...
>>> t1(10)
Time taken: 213.830082178
45
>>> def t2(upper_bounds):
...   total = 0
...   start = time.time()
...   for x in xrange(upper_bounds):
...     total += x
...   end = time.time()
...   print 'Time taken: %s' % (end - start)
...   print total
...
>>> t2(10)
Time taken: 171.760597944
45
>>> def t3(upper_bounds):
...   start = time.time()
...   total = sum(xrange(upper_bounds))
...   end = time.time()
...   print 'Time taken: %s' % (end - start)
...   print total
...
>>> t3(10)
Time taken: 133.12481904
45
>>> def t4(upper_bounds):
...   start = time.time()
...   mid = upper_bounds / 2
...   total = mid * upper_bounds - mid
...   end = time.time()
...   print 'Time taken: %s' % (end - start)
...   print total
...
>>> t4(10)
Time taken: 1.4066696167e-05
45
>>> def t5(upper_bounds):
...   start = time.time()
...   mid = upper_bounds / 2
...   if upper_bounds % 2:
...     total = mid * upper_bounds
...   else:
...     total = mid * upper_bounds - mid
...   end = time.time()
...   print 'Time taken: %s' % (end - start)
...   print total
...
>>> t5(10)
Time taken: 7.15255737305e-06
45
>>> t3(1999)
Time taken: 0.003816121
1997001
>>> t4(1999)
Time taken: 3.09944152832e-06
1996002
>>> t5(1999)
Time taken: 3.09944152832e-06
1997001


A simpler formula is simply
   upper_bounds * (upper_bounds-1) / 2

No check needed for even/odd.
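A quick check of that formula against the straightforward sum, sketched in Python 3 (hence // for integer division). The function name sum_below is made up.

```python
def sum_below(upper_bound):
    """Sum of the integers 0 .. upper_bound-1, computed in closed form."""
    return upper_bound * (upper_bound - 1) // 2


# Compare against the loop-style answer for a few even and odd bounds.
for n in (0, 1, 10, 1999):
    assert sum_below(n) == sum(range(n))

print(sum_below(10), sum_below(1999))
```

One of upper_bound and upper_bound-1 is always even, so the product divides by 2 exactly, which is why no even/odd branch is required.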

DaveA


Re: [Tutor] QUESTION REGARDING STATUS OF MY SUBSCRIPTION FW: Auto-response for your message to the "Tutor" mailing list

2010-04-16 Thread Dave Angel

Peter Meagher wrote:

GREETINGS,

THIS EMAIL WOULD INDICATE THAT I AM ON THE SUBSCRIPTION
LIST.

HOWEVER, I GOT ANOTHER EMAIL, THAT CAME IN AT PRECISELY THE
SAME TIME AS THE ORIGINAL MESSAGE THAT I AM FORWARDING YOU.
THAT INDICATES THAT THERE WAS AN ISSUE ADDING ME TO THE
LIST. I'VE PASTED IT IN THE BLOCK OF TEXT BELOW, BUT ABOVE
THE EMAIL THAT I AM FORWARDING YOU.

THANK YOU FOR YOUR ATTENTION.
Peter Meagher



If you do not wish to be subscribed to this list, please simply
disregard this message.  If you think you are being maliciously
subscribed to the list, or have any other questions, send them to
tutor-ow...@python.org.


  
That explains where you go with subscription questions.  The address is 
NOT the same as the one used for posting on the list.  I suspect you 
didn't correctly reply to the original message.


Your other message is an independent point.  It has nothing to do with 
whether you're subscribed or not, but simply is an acknowledgement that 
you're a new poster to the list, and includes some suggestions.  In 
fact, I get that message sometimes, even though I was 3rd highest poster 
here last year.  It's perfectly legal to post without being a 
subscriber, as you could be browsing the messages online.


BTW, all upper-case is considered shouting.  It makes a message much 
harder to read, and more likely to be ignored.


DaveA


Re: [Tutor] Loop comparison

2010-04-16 Thread Dave Angel

ALAN GAULD wrote:
  
The precalculation optimisations are 
taking place.  If you pass it an argument to use for the upper limit of the 
sequence the calculation time shoots up.



I'm still confused about when the addition takes place. 
Surely the compiler has to do the addition, so it should be slower?

I assume you have to run the posted code through cython
prior to running it in Python?

You can probably tell that I've never used Cython! :-)

Alan G.

  
I've never used Cython either, but I'd guess that it's the C compiler 
doing the extreme optimizing.  If all the code, including the loop 
parameters, is local, non-volatile, and known at compile time, the 
compiler could do the arithmetic at compile time, and just store a result 
like
    res = 42;


Or it could notice that there's no I/O done, so that the program has 
null effect.  And optimize the whole thing into a "sys.exit()"


I don't know if any compiler does that level of optimizing, but it's 
certainly a possibility.  And such optimizations might not be legitimate 
in stock Python (without type declarations and other assumptions), 
because of the possibility of other code changing the type of globals, 
or overriding various special functions.
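
A rough illustration of the idea, written in Python 3 syntax: when the loop bound is a known constant, the whole loop is equivalent to a single precomputed value, which is what an optimizing C compiler is free to substitute. The function names here are made up for the sketch, and nothing below claims that Cython or any particular compiler actually performs this rewrite.

```python
def loop_sum(limit):
    """Add the integers 0 .. limit-1 one at a time, as the posted loop does."""
    result = i = 0
    while i < limit:
        result += i
        i += 1
    return result


def folded_sum(limit):
    """Closed form for the same sum: limit * (limit - 1) / 2."""
    return limit * (limit - 1) // 2


# With a constant bound, both give the same answer; only the work differs.
print(loop_sum(1000), folded_sum(1000))
```

If the bound arrives as a run-time argument, the compiler can no longer fold the loop into a constant, which would fit the timings reported earlier in the thread.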


DaveA

___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Loop comparison

2010-04-17 Thread Dave Angel

Alan Gauld wrote:

"Lie Ryan"  wrote

A friend of mine suggested me to do the next experiment in python 
and Java.

It's a simple program to sum all the numbers from 0 to 10.

result = i = 0
while i < 10:
    result += i
    i += 1
print result



Are you sure you're not causing Java to overflow here? In Java,
Arithmetic Overflow do not cause an Exception, your int will simply wrap
to the negative side.


Thats why I asked if he got a float number back.
I never thought of it just wrapping, I assumed it would convert to 
floats.


Now that would be truly amusing.
If Java gives you the wrong answer much faster than Python gives the 
right one, which is best in that scenario?! :-)


Alan G.

It's been years, but I believe Java ints are 64 bits, on a 32bit 
implementation.  Just like Java strings are all unicode.


DaveA

___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Python root.

2010-04-18 Thread Dave Angel

Aidas wrote:

Hello.
In here 
http://mail.python.org/pipermail/tutor/2001-February/003385.html You 
had written how to get a root in Python. The way is: "from math import 
sqrt" followed by "print sqrt( 49 )".


I noticed that if I write just "print sqrt(49)" I get nothing. 

I don't get "nothing," I get an error message.  In particular I get:

Traceback (most recent call last):
File "", line 1, in 
NameError: name 'sqrt' is not defined

So why I need to write "from math import sqrt" instead of write just 
"print sqrt( 49 )"?


P.S. Sorry about english-I'm lithuanian. :)

As the message says, "sqrt" is not defined in the language.  It's 
included in one of the library modules.  Whenever you need code from an 
external module, whether that module is part of the standard Python 
library or something you wrote, or even a third-party library, you have 
to import it before you can use it.  The default method of importing is:


import math
print math.sqrt(49)

Where the prefix qualifier on sqrt means to run the sqrt() specifically 
from the math module.


When a single function from a particular library module is needed many 
times, it's frequently useful to use the alternate import form:


from math import sqrt

which does two things:

import math
sqrt = math.sqrt

The second line basically gives you an alias, or short name, for the 
function from that module.
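
A small sketch of the two import forms side by side, in Python 3 syntax (print is a function there). The variable same_function is just a name chosen for this illustration.

```python
import math                # bind the module name; use the qualified form
print(math.sqrt(49))

from math import sqrt      # additionally bind the bare function name
print(sqrt(49))

# Both names refer to the very same function object.
same_function = sqrt is math.sqrt
```

Without one of those import lines, the bare name sqrt is simply not defined, which is the NameError shown above.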


HTH
DaveA

___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] the binary math "wall"

2010-04-20 Thread Dave Angel

Lowell Tackett wrote:

I'm running headlong into the dilemma of binary math representation, with 
game-ending consequences, e.g.:

  

0.15


0.14999

Obviously, any attempts to manipulate this value, under the misguided assumption that it 
is truly "0.15" are ill-advised, with inevitable bad results.

the particular problem I'm attempting to corral is thus:

  

math.modf(18.15)


(0.14858, 18.0)

with some intermediate scrunching, the above snippet morphs to:

  

(math.modf(math.modf(18.15)[0]*100)[0])/.6


1.4298

The last line should be zero, and needs to be for me to continue this algorithm.

Any of Python's help-aids that I apply to sort things out, such as formatting (%), or modules like 
"decimal" do nothing more than "powder up" the display for visual consumption (turning it 
into a string).  The underlying float value remains "corrupted", and any attempt to continue with 
the math adapts and re-incorporates the corruption.

What I'm shooting for, by the way, is an algorithm that converts a deg/min/sec 
formatted number to decimal degrees.  It [mostly] worked, until I stumbled upon 
the peculiar cases of 15 minutes and/or 45 minutes, which exposed the flaw.

What to do?  I dunno.  I'm throwing up my hands, and appealing to the "Council".

(As an [unconnected] aside, I have submitted this query as best I know how, using plain text and 
the "tu...@..." address.  There is something that either I, or my yahoo.com mailer *or 
both* doesn't quite "get" about these mailings.  But, I simply do my best, following 
advice I've been offered via this forum.  Hope this --mostly-- works.)

>From the virtual desk of Lowell Tackett  



  
One of the cases you mention is 1.666...  The decimal package won't 
help that at all.  What the decimal package does for you is two-fold:

   1) it means that what displays is exactly what's there
   2) it means that errors happen in the same places where someone 
doing it "by hand" will encounter.


But if you literally have to support arbitrary rational values 
(denominators other than 2 or 5), you would need to do fractions, either 
by explicitly keeping sets of ints, or by using a fractions library.  
And if you have to support arbitrary arithmetic, there's no answer other 
than hard analysis.
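
As a sketch of the fractions approach, assuming the input is a degrees.minutes-seconds style number such as 18.15 (meaning 18 degrees 15 minutes) supplied as a string, the standard fractions module keeps every step exact. The variable names are illustrative only.

```python
from fractions import Fraction

value = Fraction("18.15")                    # exactly 1815/100, no binary rounding
degrees = value.numerator // value.denominator
minutes_raw = (value - degrees) * 100        # exactly 15
minutes = minutes_raw.numerator // minutes_raw.denominator
seconds = (minutes_raw - minutes) * 100      # exactly 0, not 1.4e-13 or similar

# Decimal degrees as an exact rational number.
decimal_degrees = degrees + Fraction(minutes, 60) + seconds / 3600
print(decimal_degrees, float(decimal_degrees))
```

The value has to enter as a string (or as integers); building a Fraction from the float 18.15 would faithfully preserve the float's binary error.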


This is not a Python-specific problem.  Floating point has had such 
issues in every language I've dealt with since 1967, when I first 
learned Fortran.  If you compare two values, the simplest mechanism is

   abs(a-b) < delta

where you have to be clever about what small value to use for delta.
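
A minimal sketch of that comparison in Python 3. The default delta of 1e-9 is an arbitrary choice for the example, not a recommendation; math.isclose() is the standard-library helper (Python 3.5 and later) that does a relative version of the same test.

```python
import math


def nearly_equal(a, b, delta=1e-9):
    """Absolute-tolerance comparison: abs(a-b) < delta."""
    return abs(a - b) < delta


total = 0.1 + 0.2
exact_match = (total == 0.3)            # False: the two floats differ slightly
close_match = nearly_equal(total, 0.3)  # True within the chosen delta
relative_match = math.isclose(total, 0.3, rel_tol=1e-9)
```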

If all values are made up of  degrees/minutes/seconds, and seconds is a 
whole number, then store values as num-seconds, and do all arithmetic on 
those values.  Only convert them back to deg/min/sec upon output.


DaveA

___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] the binary math "wall"

2010-04-21 Thread Dave Angel



Lowell Tackett wrote:

--- On Tue, 4/20/10, Steven D'Aprano  wrote:

  

From: Steven D'Aprano 




The simplest, roughest way to fix these sorts of problems
(at the risk 
of creating *other* problems!) is to hit them with a

hammer:



round(18.15*100) == 1815
  

True



Interestingly, this is the [above] result when I tried entered the same snippet:

Python 2.5.1 (r251:54863, Oct 14 2007, 12:51:35)
[GCC 3.4.1 (Mandrakelinux 10.1 3.4.1-4mdk)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
  

round(18.15)*100 == 1815


False
  



But you typed it differently than Steven.  He had   round(18.15*100), 
and you used round(18.15)*100


Very different.   His point boils down to comparing integers, and when 
you have dubious values, round them to an integer before comparing.  I 
have my doubts, since in this case it would lead to bigger sloppiness 
than necessary.


round(18.154 *100) == 1815

probably isn't what you'd want.

So let me ask again, are all angles a whole number of seconds?  Or can 
you make some assumption about how accurate they need to be when first 
input (like tenths of a second, or whatever)?  If so use an integer as 
follows:


val = round(((((degrees*60)+minutes)*60) + seconds)*10)

The 10 above is assuming that tenths of a second are your quantization.
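
Here is how that might look as a pair of helper functions, in Python 3 (where round() returns an int). The function names are invented for this sketch, and tenths of a second is only the assumed quantization.

```python
def dms_to_tenths(degrees, minutes, seconds):
    """Pack an angle into an integer count of tenths of a second."""
    return round((((degrees * 60) + minutes) * 60 + seconds) * 10)


def tenths_to_dms(val):
    """Unpack the integer again, only for output."""
    whole_seconds, tenths = divmod(val, 10)
    total_minutes, secs = divmod(whole_seconds, 60)
    degs, mins = divmod(total_minutes, 60)
    return degs, mins, secs + tenths / 10


a = dms_to_tenths(18, 15, 0)
b = dms_to_tenths(3, 22, 49.6)
print(a, b, a + b, tenths_to_dms(a + b))
```

All the addition and subtraction of angles then happens on plain integers, where 15 minutes and 45 minutes are no more troublesome than any other value.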

HTH
DaveA

___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] the binary math "wall"

2010-04-21 Thread Dave Angel

Lowell Tackett wrote:
From the virtual desk of Lowell Tackett  




--- On Wed, 4/21/10, Dave Angel  wrote:

  

From: Dave Angel 
Subject: Re: [Tutor] the binary math "wall"
To: "Lowell Tackett" 
Cc: tutor@python.org, "Steven D'Aprano" 
Date: Wednesday, April 21, 2010, 6:46 AM


Lowell Tackett wrote:


--- On Tue, 4/20/10, Steven D'Aprano 
  

wrote:

  
  

From: Steven D'Aprano 


The simplest, roughest way...hit them with a
hammer:



round(18.15*100) == 1815
  
  
   


True



...when I tried...:

Python 2.5.1 (r251:54863, Oct 14 2007, 12:51:35)
[GCC 3.4.1 (Mandrakelinux 10.1 3.4.1-4mdk)] on linux2
Type "help", "copyright", "credits" or "license" for
  

more information.

  
  

round(18.15)*100 == 1815



False
  

  

But you typed it differently than Steven.  He
had   round(18.15*100), and you used
round(18.15)*100



As soon as I'd posted my answer I realized this mistake.

  

Very different.   His point boils down to
comparing integers, and when you have dubious values, round
them to an integer before comparing.  I have my doubts,
since in this case it would lead to bigger sloppiness than
necessary.

round(18.154 *100) == 1815

probably isn't what you'd want.

So let me ask again, are all angles a whole number of
seconds?  Or can you make some assumption about how
accurate they need to be when first input (like tenths of a
second, or whatever)?  If so use an integer as
follows:

val = round(((((degrees*60)+minutes)*60) +
seconds)*10)

The 10 above is assuming that tenths of a second are your
quantization.

HTH
DaveA





Recalling (from a brief foray into college Chem.) that a result could not be 
displayed with precision greater than the least precise component that bore 
[the result].  So, yes, I could accept my input as the arbitrator of accuracy.

A scenario:

Calculating the coordinates of a forward station from a given base station 
would require [perhaps] the bearing (an angle from north, say) and distance 
from hither to there.  Calculating the north coordinate would set up this 
relationship, e.g.:

cos(3° 22' 49.6") x 415.9207'(Hyp) = adjacent side(North)

My first requirement, and this is the struggle I (we) are now engaged in, is to 
convert my bearing angle (3° 22' 49.6") to decimal degrees, such that I can 
assign its proper cosine value.  Now, I am multiplying these two very refined 
values (yes, the distance really is honed down to 10,000'ths of a foot-that's normal 
in surveying data); within the bowels of the computer's blackboard scratch-pad, I 
cannot allow errors to evolve and emerge.

Were I to accumulate many of these "legs" into perhaps a 15 mile 
traverse-accumulating little computer errors along the way-the end result could be 
catastrophically wrong.

(Recall that in the great India Survey in the 1800's, Waugh got the elevation of Mt. 
Everest wrong by almost 30 feet for just this exact same reason.)  In surveying, we have 
a saying, "Measure with a micrometer, mark with chalk, cut with an axe".  
Accuracy [in math] is a sacred tenet.

So, I am setting myself very high standards of accuracy, simply because those 
are the standards imposed by the project I am adapting, and I can require 
nothing less of my finished project.

  
If you're trying to be accurate when calling cos, why are you using 
degrees?  The cosine function takes an angle in radians.  So what you 
need is a method to convert from deg/min/sec to radians.  And once you 
have to call trig, you can throw out all the other nonsense about 
getting exact values.  Trig functions don't take arbitrary number 
units.  They don't take decimals, and they don't take fractions.  They 
take double-precision floats.
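
A minimal sketch of that conversion, in Python 3 (so / is true division), using the bearing and distance from the scenario above. The helper name dms_to_radians is made up for the example; math.radians() and math.cos() are the standard-library calls.

```python
import math


def dms_to_radians(degrees, minutes, seconds):
    """Convert degrees/minutes/seconds to radians via decimal degrees."""
    decimal_degrees = degrees + minutes / 60 + seconds / 3600
    return math.radians(decimal_degrees)


bearing = dms_to_radians(3, 22, 49.6)   # 3 deg 22' 49.6"
distance = 415.9207                     # feet, the hypotenuse
north = math.cos(bearing) * distance    # adjacent side
print(bearing, north)
```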


Perhaps you don't realize the amount of this quantization error we've 
been talking about.  The double type is 64bits in size, and contains the 
equivalent of about 18 decimal digits of precision.  (Assuming common 
modern architectures, of course)



Your angle is specified to about 5 digits of precision, and the distance 
to 7.  So it would take a VERY large number of typical calculations for 
errors in the 18th place to accumulate far enough to affect those.


The real problem, and one that we can't solve for you, and neither can 
Python, is that it's easy to do calculations starting with 8 digits of 
accuracy, and the result be only useful to 3 or 4.  For example, simply 
subtract two very close numbers, and use the result as though it were 
meaningful.


I once had a real customer send us a letter asking about the math 
precision of a calculation he was doing.  I had written the math 
microcode of the machine he was using (from add and subtract, up to 
trigs and logs, I

Re: [Tutor] the binary math "wall"

2010-04-21 Thread Dave Angel

Steven D'Aprano wrote:

On Thu, 22 Apr 2010 01:37:35 am Lowell Tackett wrote:



Were I to accumulate many of these "legs" into perhaps a 15 mile
traverse-accumulating little computer errors along the way-the end
result could be catastrophically wrong.



YES!!! 

And just by being aware of this potential problem, you are better off 
than 90% of programmers who are blithely unaware that floats are not 
real numbers.



  
Absolutely.  But "catastrophically wrong" has to be defined, and 
analyzed.  If each of these measurements is of 100 feet, measured to an 
accuracy of .0001 feet, and you add up the measurements in Python 
floats, you'll be adding 750 measurements, and your human error could 
accumulate to as much as .07 feet.


The same 750 floating point adds, each to 15 digits of quantization 
accuracy (thanks for the correction, it isn't 18) will give a maximum 
"computer error" of  maybe .1 feet.  The human error is much 
larger than the computer error.


No results can be counted on without some analysis of both sources of 
error.  Occasionally, the "computer error" will exceed the human, and 
that depends on the calculations you do on your measurements.
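
One way to see the two error sources side by side is to add the legs once in floats and once in exact rational arithmetic. This is only a sketch in Python 3: the leg length of 100.0001 feet is a made-up value, and the .0001 per-leg measurement accuracy is the figure assumed above.

```python
from fractions import Fraction

legs = [100.0001] * 750                        # hypothetical measurements, in feet
float_total = sum(legs)                        # ordinary double-precision adds
exact_total = Fraction("100.0001") * 750       # exact rational arithmetic

# Fraction(float_total) converts the float exactly, so this is the true gap.
computer_error = abs(Fraction(float_total) - exact_total)

# Worst case if every leg is off by the full .0001 in the same direction.
human_error_bound = Fraction("0.0001") * 750

print(float(computer_error), float(human_error_bound))
```

Plain addition is the benign case; the comparison can come out very differently once subtraction of nearly equal values gets involved.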


HTH,
DaveA
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] sys.path and the path order

2010-04-23 Thread Dave Angel

Garry Willgoose wrote:
My 
question is so simple I'm surprised I can't find an answer somewhere. 
I'm interested if I can rely on the order of the directories in the 
sys.path list. When I'm running a file from the command line like


python tellusim.py

The string in entry sys.path[0] appears to be the full path to the 
location of the file I'm running in this case tellusim ... i.e. it 
looks like '/Volumes/scone2/codes/tellusim0006'. This is good because 
for my code I need to create a search path for modules that is 
relative to the location of this file irrespective of the location I'm 
in when I invoke the script file (i.e. I could be in /Volumes/scone2 
and invoke it by 'python codes/tellusim0006/tellusim.py').


The question is can I rely on entry [0] in sys.path always being the 
directory in which the original file resides (& across linux, OSX and 
Windows)? If not what is the reliable way of getting that information?



As Steven says, that's how it's documented.

There is another way, one that I like better.  Each module, including 
the startup script, has an attribute called __file__, which is the path 
to the source file of that module.


Then I'd use os.path.abspath(), and os.path.dirname() to turn that into 
an absolute path to the directory.
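
A short sketch of that combination. The helper name directory_of and the sample path are invented for the illustration; os.path.abspath() and os.path.dirname() are the real standard-library functions.

```python
import os


def directory_of(path):
    """Return the absolute directory that contains the given file path."""
    return os.path.dirname(os.path.abspath(path))


# Hypothetical relative path, as if the script had been started from elsewhere.
example = directory_of(os.path.join("codes", "tellusim0006", "tellusim.py"))
print(example)
```

Inside the startup script itself, directory_of(__file__) would give the directory to build the module search path from, regardless of the current working directory.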


The only exception I know of to __file__ usefulness is modules that are 
loaded from zip files.  I don't know if the initial script can come from 
a zip file, but if it does, the question changes.


DaveA

___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Hi everybody stuck on some error need help please thank you!!

2010-04-24 Thread Dave Angel
(Don't top-post.  Either put your remarks immediately after the part 
they reference, or at the end of the message.  Otherwise, everything's 
thoroughly out of order.)


Marco Rompré wrote:

I tried to enter model = Modele (nom_fichier) but it still does not work.
  
You didn't define the global nom_fichier till after that line.  In 
general, while you're learning, please avoid using the same names for 
global values, class attributes, instance attributes, function parameter 
names, and local variables.   The rules for what a name means change 
depending on where the name is used.


On Fri, Apr 23, 2010 at 11:22 PM, Steven D'Aprano wrote:

  

On Sat, 24 Apr 2010 01:07:11 pm Marco Rompré wrote:



Here's my code:
  

[...]


class Modele:
    """
    La definition d'un modele avec les magasins.
    """
    def __init__(self, nom_fichier, magasins =[]):
        self.nom_fichier = nom_fichier
        self.magasins = magasins
  

[...]


if __name__ == '__main__':
    modele = Modele()
  
This is where you got the error, because there's a required argument, 
for parameter nom_fichier.  So you could use

   modele = Modele("thefile.txt")

    nom_fichier = "magasinmodele.txt"
  
I'd call this something else, like  g_nom_fichier.  While you're 
learning, you don't want to get confused between the multiple names that 
look the same.

    modele.charger(nom_fichier)
    if modele.vide():
        modele.initialiser(nom_fichier)
    modele.afficher()

And here's my error :

Traceback (most recent call last):
  File "F:\School\University\Session 4\Programmation SIO\magasingolfmodele.py", line 187, in <module>
modele = Modele()
TypeError: __init__() takes at least 2 arguments (1 given)
  

You define Modele to require a nom_fichier argument, but then you try to
call it with no nom_fichier.
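
A cut-down sketch of the same situation, in Python 3. Only the constructor is reproduced; the default for magasins is changed to None here (a deviation from the posted code) so that instances don't share one list, and the variable missing_argument_rejected exists only for the demonstration.

```python
class Modele:
    """A model that remembers which file it was loaded from."""

    def __init__(self, nom_fichier, magasins=None):
        self.nom_fichier = nom_fichier
        # Build a fresh list per instance rather than sharing one default list.
        self.magasins = [] if magasins is None else magasins


try:
    Modele()                      # missing the required nom_fichier argument
    missing_argument_rejected = False
except TypeError:
    missing_argument_rejected = True

# Supplying the file name satisfies the constructor.
modele = Modele("magasinmodele.txt")
print(missing_argument_rejected, modele.nom_fichier)
```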




___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Binary search question

2010-04-25 Thread Dave Angel

Lie Ryan wrote:

On 04/24/10 23:39, Robert Berman wrote:
  

-Original Message-
From: tutor-bounces+bermanrl=cfl.rr@python.org [mailto:tutor-
bounces+bermanrl=cfl.rr@python.org] On Behalf Of Alan Gauld
Sent: Friday, April 23, 2010 7:41 PM
To: tutor@python.org
Subject: Re: [Tutor] Binary search question

"Emile van Sebille"  wrote

  

   BIG SNIP


And even at 1000 entries, the list creation slowed right
down - about 10 seconds, but the searches even for "-5" were
still around a second.

So 'in' looks pretty effective to me!
  

Now that is most impressive.




But that is with the assumption that comparison is very cheap. If you're
searching inside an object with more complex comparison, say, 0.01
second per comparison, then with a list of 10 000 000 items, with 'in'
you will need on *average* 5 000 000 comparisons which is 50 000 seconds
compared to *worst-case* 24 comparisons with bisect which is 0.24 seconds.

Now, I say that's 208333 times difference, most impressive indeed.


  


The ratio doesn't change with a slow comparison, only the magnitude.

And if you have ten million objects that are complex enough to take .01 
secs per comparison, chances are it took a day or two to load them up 
into your list.  Most likely you won't be using a list anyway, but a 
database, so you don't have to reload them each time you start the program.



It's easy to come up with situations in which each of these solutions is 
better than the other.
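
For anyone who wants to try both approaches, here is a minimal sketch in Python 3. The helper contains_sorted and the sample data are made up for the illustration; bisect.bisect_left() is the standard-library call, and it only works if the list is already sorted.

```python
import bisect
import math


def contains_sorted(sorted_items, target):
    """Membership test on an already-sorted list using binary search."""
    i = bisect.bisect_left(sorted_items, target)
    return i < len(sorted_items) and sorted_items[i] == target


items = list(range(0, 2000, 2))      # sorted even numbers, a made-up data set
found_linear = 1998 in items         # 'in' scans the list item by item
found_bisect = contains_sorted(items, 1998)
missing = contains_sorted(items, -5)

# Worst-case number of halvings for ten million items, as quoted above.
worst_case = math.ceil(math.log2(10000000))
print(found_linear, found_bisect, missing, worst_case)
```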


DaveA
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] For loop breaking string methods

2010-04-26 Thread Dave Angel

C M Caine wrote:

Thank you for the clarification, bob.

For any future readers of this thread I include this link[1] to effbot's
guide on lists, which I probably should have already read.

My intention now is to modify list contents in the following fashion:

for index, value in enumerate(L):
    L[0] = some_func(value)

Is this the standard method?

[1]: http://effbot.org/zone/python-list.htm

Colin Caine

  

Almost.   You should have said

   L[index] = some_func(value)

The way you had it, it would only replace the zeroth item of the list.

Note also that if you insert or delete from the list while you're 
looping, you can get undefined results.  That's one reason it's common 
to build a new list, and just assign it back when done.  Example would 
be the list comprehension you showed earlier.
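
Both styles side by side, as a small Python 3 sketch. The body of some_func is a stand-in invented for the example.

```python
def some_func(value):
    return value * 2        # stand-in transformation, purely illustrative


L = [1, 2, 3]
for index, value in enumerate(L):
    L[index] = some_func(value)          # replace each slot in place

M = [1, 2, 3]
M = [some_func(value) for value in M]    # build a new list, then rebind the name

print(L, M)
```

The in-place form matters when other names refer to the same list object; the comprehension leaves the original list untouched and only rebinds M.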


DaveA

___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Is the difference in outputs with different size input lists due to limits on memory with PYTHON?

2010-05-06 Thread Dave Angel

Art Kendall wrote:
I am running Windows 7 64bit Home premium. with quad cpus and 8G 
memory.   I am using Python 2.6.2.


I have all the Federalist Papers concatenated into one .txt file.
Which is how big?  Currently you (unnecessarily) load the entire thing 
into memory with readlines().  And then you do confusing work to split 
it apart again, into one list element per paper.   And for a while 
there, you have three copies of the entire text.  You're keeping two 
copies, in the form of alltext and papers. 

You print out the len(papers).  What do you see there?  Is it correctly 
87 ?  If it's not, you have to fix the problem here, before even going on.


  I want to prepare a file with a row for each paper and a column for 
each term. The cells would contain the count of a term in that paper.  
In the original application in the 1950's 30 single word terms were 
used. I can now use NoteTab to get a list of all the 8708 separate 
words in allWords.txt. I can then use that data in statistical 
exploration of the set of texts.


I have the python program(?) syntax(?) script(?) below that I am using 
to learn PYTHON. The comments starting with "later" are things I will 
try to do to make this more useful. I am getting one step at a time 
to work.


It works when the number of terms in the term list is small e.g., 10.  
I get a file with the correct number of rows (87) and count columns 
(10) in termcounts.txt. The termcounts.txt file is not correct when I 
have a larger number of terms, e.g., 100. I get a file with only 40 
rows and the correct number of columns.  With 8700 terms I get only 40 
rows.  I need to be able to have about 8700 terms. (If this were FORTRAN 
I would say that the subscript indices were getting scrambled.)  (As I 
develop this I would like to be open-ended with the numbers of input 
papers and open ended with the number of words/terms.)




# word counts: Federalist papers

import re, textwrap
# read the combined file and split into individual papers
# later create a new version that deals with all files in a folder rather than having papers concatenated

alltext = file("C:/Users/Art/Desktop/fed/feder16v3.txt").readlines()
papers= re.split(r'FEDERALIST No\.'," ".join(alltext))
print len(papers)

countsfile = file("C:/Users/Art/desktop/fed/TermCounts.txt", "w")
syntaxfile = file("C:/Users/Art/desktop/fed/TermCounts.sps", "w")
# later create a python program that extracts all words instead of using NoteTab

termfile   = open("C:/Users/Art/Desktop/fed/allWords.txt")
termlist = termfile.readlines()
termlist = [item.rstrip("\n") for item in termlist]
print len(termlist)
# check for SPSS reserved words
varnames = textwrap.wrap(" ".join([v.lower() in ['and', 'or', 'not', 'eq', 'ge',
'gt', 'le', 'lt', 'ne', 'all', 'by', 'to','with'] and (v+"_r") or v for v in termlist]))
syntaxfile.write("data list file= 'c:/users/Art/desktop/fed/termcounts.txt' free/docnumber\n")

syntaxfile.writelines([v + "\n" for v in varnames])
syntaxfile.write(".\n")
# before using the syntax manually replace spaces internal to a string to underscore // replace (ltrim(rtrim(varname))," ","_")   replace any special characters with @ in variable names



for p in range(len(papers)):

range(len()) is un-pythonic.  Simply do
for paper in papers:

and of course use paper below instead of papers[p]

   counts = []
   for t in termlist:
      counts.append(len(re.findall(r"\b" + t + r"\b", papers[p], re.IGNORECASE)))

   if sum(counts) > 0:
      papernum = re.search("[0-9]+", papers[p]).group(0)
      countsfile.write(str(papernum) + " " + " ".join([str(s) for s in counts]) + "\n")



Art

If you're memory limited, you really should sequence through the files, 
only loading one at a time, rather than all at once.  It's no harder.  
Use dirlist() to make a list of files, then your loop becomes something 
like:


for  infile in filelist:
 paper = " ".join(open(infile, "r").readlines())

Naturally, to do it right, you should use a with statement, or at least 
close each file when done.
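
A sketch of that one-file-at-a-time loop, in Python 3. Assumptions: the directory listing is done with os.listdir(), which is the standard-library call for this; each paper lives in its own .txt file; and the file names and contents in the demonstration are invented. It also passes each term through re.escape(), which the posted code did not do, so a term containing regex punctuation is matched literally.

```python
import os
import re
import tempfile


def count_terms(text, terms):
    """Return one whole-word, case-insensitive count per term."""
    return [len(re.findall(r"\b" + re.escape(t) + r"\b", text, re.IGNORECASE))
            for t in terms]


def count_terms_per_file(folder, terms):
    """Yield (filename, counts) for each .txt file, loading one file at a time."""
    for name in sorted(os.listdir(folder)):
        if not name.endswith(".txt"):
            continue
        # The with statement closes the file even if reading fails.
        with open(os.path.join(folder, name), "r") as infile:
            text = infile.read()
        yield name, count_terms(text, terms)


# Small demonstration with made-up files (hypothetical names and contents).
demo_dir = tempfile.mkdtemp()
for fname, body in [("paper01.txt", "The people and the states."),
                    ("paper02.txt", "Upon the whole, the union.")]:
    with open(os.path.join(demo_dir, fname), "w") as out:
        out.write(body)

results = list(count_terms_per_file(demo_dir, ["the", "upon", "union"]))
for name, counts in results:
    print(name, " ".join(str(c) for c in counts))
```

Only one paper's text is held in memory at any moment, so the number of papers no longer affects the memory footprint.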


DaveA

___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Is the difference in outputs with different size input lists due to limits on memory with PYTHON?

2010-05-06 Thread Dave Angel

Art Kendall wrote:



On 5/6/2010 11:14 AM, Dave Angel wrote:

[...]




Thank you for getting back to me. I am trying to generalize a process 
that 50 years ago used 30 terms on the whole file and I am using the 
task of generalizing the process to learn python.   In the post I sent 
there were comments to myself about things that I would want to learn 
about.  One of the first is to learn about processing all files in a 
folder, so your reply will be very helpful.  It seems that dirlist() 
should allow me to includ

Re: [Tutor] Is the difference in outputs with different size input lists due to limits on memory with PYTHON?

2010-05-06 Thread Dave Angel

Art Kendall wrote:



On 5/6/2010 1:51 PM, Dave Angel wrote:

Art Kendall wrote:



On 5/6/2010 11:14 AM, Dave Angel wrote:

Art Kendall wrote:
I am running Windows 7 64bit Home premium. with quad cpus and 8G 
memory.   I am using Python 2.6.2.


I have all the Federalist Papers concatenated into one .txt file.
Which is how big?  Currently you (unnecessarily) load the entire 
thing into memory with readlines().  And then you do confusing work 
to split it apart again, into one list element per paper.   And for 
a while there, you have three copies of the entire text.  You're 
keeping two copies, in the form of alltext and papers.
You print out the len(papers).  What do you see there?  Is it 
correctly 87 ?  If it's not, you have to fix the problem here, 
before even going on.


  I want to prepare a file with a row for each paper and a column 
for each term. The cells would contain the count of a term in that 
paper.  In the original application in the 1950's 30 single word 
terms were used. I can now use NoteTab to get a list of all the 
8708 separate words in allWords.txt. I can then use that data in 
statistical exploration of the set of texts.


I have the python program(?) syntax(?) script(?) below that I am 
using to learn PYTHON. The comments starting with "later" are 
things I will try to do to make this more useful. I am getting one 
step at at time to work


It works when the number of terms in the term list is small e.g., 
10.  I get a file with the correct number of rows (87) and count 
columns (10) in termcounts.txt. The termcounts.txt file is not 
correct when I have a larger number of terms, e.g., 100. I get a 
file with only 40 rows and the correct number of columns.  With 
8700 terms I get only 40 rows I need to be able to have about 8700 
terms. (If this were FORTRAN I would say that the subscript 
indices were getting scrambled.)  (As I develop this I would like 
to be open-ended with the numbers of input papers and open ended 
with the number of words/terms.)




# word counts: Federalist papers

import re, textwrap
# read the combined file and split into individual papers
# later create a new version that deals with all files in a folder 
rather than having papers concatenated

alltext = file("C:/Users/Art/Desktop/fed/feder16v3.txt").readlines()
papers= re.split(r'FEDERALIST No\.'," ".join(alltext))
print len(papers)

countsfile = file("C:/Users/Art/desktop/fed/TermCounts.txt", "w")
syntaxfile = file("C:/Users/Art/desktop/fed/TermCounts.sps", "w")
# later create a python program that extracts all words instead of 
using NoteTab

termfile   = open("C:/Users/Art/Desktop/fed/allWords.txt")
termlist = termfile.readlines()
termlist = [item.rstrip("\n") for item in termlist]
print len(termlist)
# check for SPSS reserved words
varnames = textwrap.wrap(" ".join([v.lower() in ['and', 'or', 'not', 'eq', 'ge',
    'gt', 'le', 'lt', 'ne', 'all', 'by', 'to', 'with'] and (v+"_r") or v
    for v in termlist]))
syntaxfile.write("data list file= 'c:/users/Art/desktop/fed/termcounts.txt' free/docnumber\n")

syntaxfile.writelines([v + "\n" for v in varnames])
syntaxfile.write(".\n")
# before using the syntax manually replace spaces internal to a
# string to underscore // replace (ltrim(rtrim(varname))," ","_")
# replace any special characters with @ in variable names



for p in range(len(papers)):

range(len()) is un-pythonic.  Simply do
    for paper in papers:

and of course use paper below instead of papers[p]

    counts = []
    for t in termlist:
        counts.append(len(re.findall(r"\b" + t + r"\b", papers[p], re.IGNORECASE)))

    if sum(counts) > 0:
        papernum = re.search("[0-9]+", papers[p]).group(0)
        countsfile.write(str(papernum) + " " + " ".join([str(s) for s in counts]) + "\n")



Art

If you're memory limited, you really should sequence through the 
files, only loading one at a time, rather than all at once.  It's 
no harder.  Use os.listdir() to make a list of files, then your 
loop becomes something like:


for infile in filelist:
    paper = " ".join(open(infile, "r").readlines())

Naturally, to do it right, you should use with...  Or at least 
close each file when done.
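
A minimal sketch of that one-file-at-a-time loop, assuming the papers 
have already been split into separate .txt files in one folder. The 
name read_papers and the .txt filter are illustrative assumptions, not 
part of the original script, and the sketch is written for Python 3.

```python
import os


def read_papers(folder):
    """Yield (filename, text) for each .txt file in folder, one at a time."""
    for name in sorted(os.listdir(folder)):
        if not name.endswith(".txt"):
            continue  # skip anything that is not a paper
        # os.listdir() returns bare names, so rebuild the full path
        path = os.path.join(folder, name)
        # "with" closes the file as soon as the block ends
        with open(path, "r") as infile:
            yield name, infile.read()
```

Because this is a generator, only one paper is held in memory at a 
time: for name, paper in read_papers("data"): ... processes each file 
and then lets it go.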


DaveA




Thank you for getting back to me. I am trying to generalize a 
process that 50 years ago used 30 terms on the whole file and I am 
using the task of generalizing the process to learn python.   In the 
post I sent there were comments to myself about things that I would 
want to learn about.  One of the first is to learn about processing 
all files in a folder, so your reply will be very h

Re: [Tutor] Is the difference in outputs with different size input lists due to limits on memory with PYTHON?

2010-05-07 Thread Dave Angel

Art Kendall wrote:



On 5/6/2010 8:52 PM, Dave Angel wrote:




I got my own copy of the papers, at 
http://thomas.loc.gov/home/histdox/fedpaper.txt


I copied your code, and added logic to it to initialize termlist from 
the actual file.  And it does complete the output file at 83 lines, 
approx 17000 columns per line (because most counts are one digit).  
It takes quite a while, and perhaps you weren't waiting for it to 
complete.  I'd suggest either adding a print to the loop, showing the 
count, and/or adding a line that prints "done" after the loop 
terminates normally.


I watched memory usage, and as expected, it didn't get very high.  
There are things you need to redesign, however.  One is that all the 
punctuation and digits and such need to be converted to spaces.
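
One possible way to do that conversion, sketched for Python 3: keep 
only the letters a-z and turn every other run of characters into a 
single space. The function name letters_only is a hypothetical helper, 
and treating apostrophes and hyphens as separators is an assumption 
that may not suit every analysis.

```python
import re


def letters_only(text):
    """Lower-case text and replace each run of non-letters with one space."""
    return re.sub(r"[^a-z]+", " ", text.lower()).strip()


print(letters_only("To the People of New-York: No. 10!"))
# to the people of new york no
```

Note that this splits contractions and hyphenated words into separate 
pieces, so "New-York" becomes two words.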



DaveA




Thank you for going the extra mile.

I obtained my copy before I retired in 2001 and there are some 
differences.  In the current copy from the LOC, papers 7, 63, and 81 
start with "FEDERALIST." (an extra period).  That explains why you 
have 83. There are also some comments, such as the attributed author.  
After the weekend, I'll do a file compare and see the differences in 
more detail.
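
A hedged sketch of a split pattern that tolerates the extra period, 
written for Python 3. It assumes the affected headings read 
"FEDERALIST. No. 7" and the others "FEDERALIST No. 1"; the exact 
heading text in the LOC file should be checked before relying on it. 
The names HEADING and split_papers are illustrative.

```python
import re

# Optional period after FEDERALIST, then whitespace, then "No."
HEADING = re.compile(r"FEDERALIST\.?\s+No\.")


def split_papers(text):
    """Split the combined text at each paper heading."""
    return HEADING.split(text)


sample = "intro FEDERALIST No. 1 aaa FEDERALIST. No. 7 bbb"
print(len(split_papers(sample)))  # 3
```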


Please email me your version of the code.  I'll try it as is.  Then 
I'll put in a counter, have it print the count and paper number, and a 
'done' message.


As a check after reading in the counts, I'll include the counts from 
NoteTab and see if these counts sum to those from NoteTab.


I'll use SPSS to create a version of the .txt file with punctuation 
and numerals changed to spaces and try using that as the corpus.   
Then I'll try to create a similar file with Python.


Art

As long as you realize this is very rough.  I just wanted to prove there 
wasn't anything fundamentally wrong with your approach.  But there's 
still lots to do, especially with regards to cleaning up the text before 
and between the papers.  Anyway, here it is.


#!/usr/bin/env python

sourcedir = "data/"
outputdir = "results/"


# word counts: Federalist papers
import sys, os
import re, textwrap
#Create the output directory if it doesn't exist
if not os.path.exists(outputdir):
   os.makedirs(outputdir)

# read the combined file and split into individual papers
# later create a new version that deals with all files in a folder
# rather than having papers concatenated

alltext = file(sourcedir + "feder16.txt").readlines()

filtered = " ".join(alltext).lower()
for ch in ('" ' + ". , ' * - ( ) = @ [ ] ; . ` 1 2 3 4 5 6 7 8 9 0 > : / ?").split():
   filtered = filtered.replace(ch, " ")
#todo:   make a better filter, such as keeping only letters, rather than
#   replacing specific characters

words = filtered.split()
print "raw word count is", len(words)

wordset = set(words)
print "wordset reduces it from/to", len(words), len(wordset)
#eliminate words shorter than 4 characters
words = sorted([word for word in wordset if len(word)>3])
del wordset  # free space of wordset
print "Eliminating words under 4 characters reduces it to", len(words)

#print the first 50
for word in words[:50]:
   print word



print "alltext is size", len(alltext)
papers= re.split(r'FEDERALIST No\.'," ".join(alltext))
print "Number of detected papers is ", len(papers)

#print first 50 characters of each, so we can see why some of them are
#   missed by our regex above
for index, paper in enumerate(papers):
   print index, "***", paper[:50]


countsfile = file(outputdir + "TermCounts.txt", "w")
syntaxfile = file(outputdir + "TermCounts.sps", "w")
# later create a python program that extracts all words instead of using
# NoteTab

#termfile   = open("allWords.txt")
#termlist = termfile.readlines()
#termlist = [item.rstrip("\n") for item in termlist]
#print "termlist is ", len(termlist)

termlist = words

# check for SPSS reserved words
varnames = textwrap.wrap(" ".join([v.lower() in ['and', 'or', 'not', 'eq', 'ge',
    'gt', 'le', 'lt', 'ne', 'all', 'by', 'to', 'with'] and (v+"_r") or v
    for v in termlist]))
syntaxfile.write("data list file= 'c:/users/Art/desktop/fed/termcounts.txt' free/docnumber\n")

syntaxfile.writelines([v + "\n" for v in varnames])
syntaxfile.write(".\n")
# before using the syntax manually replace spaces internal to a string
# to underscore // replace (ltrim(rtrim(varname))," ","_")
# replace any special characters with @ in variable names



for p, paper in enumerate(papers):
   counts = []
   for t in termlist:
      counts.append(len(re.findall(r"\b" + t + r"\b", paper, re.IGNORECASE)))

   print p, counts[:5]
   if sum(counts) > 0:
      papernum = re.search("[0-9]+", papers[p]).group(0)
      countsfile.write(str(papernum) + " " + " ".join([str(s) for s in counts]) + "\n")
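
The loop above runs one regex search per term per paper, which is 
why it takes a while with thousands of terms. As an alternative 
sketch (not the method used above), each paper can be tokenized once 
and tallied with collections.Counter. This is Python 3, the name 
count_terms is hypothetical, and it assumes every term is a single 
word made of letters only.

```python
import re
from collections import Counter


def count_terms(paper, termlist):
    """Return one count per term in termlist for a single paper."""
    # Tokenize once: every run of letters is a word, case is ignored
    words = re.findall(r"[a-z]+", paper.lower())
    tally = Counter(words)
    # Counter returns 0 for terms that never occur
    return [tally[t.lower()] for t in termlist]


print(count_terms("The people and the States. The PEOPLE!",
                  ["the", "people", "union"]))
# [3, 2, 0]
```

The counts can differ from the regex version for terms containing 
digits or punctuation, so the two should be compared on a few papers 
before switching.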


DaveA
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] (no subject)

2010-05-11 Thread Dave Angel

Sivapathasuntha Aruliah wrote:

Hi
I am learning Python. When I try to run any of the programs, for example 
csv2html1_ans.py, it displays the following message. The error occurs 
on both Python24 & Python 31, that is, whichever of the following 
commands I give.


COMMAND GIVEN
1.C:\python24\python.exe C:\py3eg\quadratic.py
2.C:\python31\python.exe C:\py3eg\quadratic.py

The message below appears with the program name. Please advise me how to 
get over this issue.

ERROR MESSAGE
command  C:\py3eg\csv2html1_ans.py is not a valid Win32 application

Regards,
Siva
Test Equipment Engineering
Amkor Technology (S) Pte Ltd
1 Kaki Bukit View
#03-28 TechView Building
Singapore 415941
Tel: (65) 6347 1131
Fax: (65) 6746 4815
  
Please copy and paste the actual contents of your DOS box, rather than 
paraphrasing.  COMMAND hasn't been the normal shell name since Win95 
days.  You can't use numbers in front of commands in any shell I've 
used.  The error message refers to a different file than anything you 
specified in your commands.


What OS are you using?

DaveA



Re: [Tutor] Help required to count no of lines that are until 1000 characters

2010-05-11 Thread Dave Angel

ramya natarajan wrote:

Hello,

I am a complete beginner at programming. I was given the task to write a 
loop that reads each line of a file and counts the number of lines that 
are read until the total length of the lines is 1,000 characters. I have 
to read lines from the file up to exactly 1000 characters.

Here is my code:
I created a file, /tmp/new.txt, which has 100 lines and 2700 characters, 
and wrote code that should read exactly 1000 characters and count the 
lines up to that point. But the problem is that it reads entire lines 
and does not stop at exactly 1000 characters. Can someone tell me what 
mistake I am making here?


   log = open('/tmp/new.txt','r')
   lines,char = 0,0
   for line in log.readlines():
       while char < 1000 :
           for ch in line :
               char += len(ch)
           lines += 1
   print char , lines

   1026 , 38    It is counting the entire line instead of characters up 
to 1000

-- Can someone point out what mistake I am making here, and why it is not
stopping at 1000? I am reading only char by char.

My new.txt contains content like
this is my new number\n

Can someone please help? I spent hours and hours trying to find the issue 
but I am not able to figure it out. Any help would be greatly appreciated.
Thank you
Ramya

  
The problem is ill-specified (contradictory).  It'd probably be better 
to give the exact wording of the assignment.


If you read each line of the file, then it would only be a coincidence 
if you read exactly 1000 characters, as most likely one of those lines 
will overlap the 1000 byte boundary.



But you have a serious bug in your code, that nobody in the first five 
responses has addressed.  That while loop will loop over the first line 
repeatedly, till it reaches or exceeds 1000, regardless of the length of 
subsequent lines.  So it really just divides 1000 by the length of that 
first line.  Notice that the lines += 1 will execute multiple times for 
a single iteration of the for loop.


Second, once 1000 is reached, the for loop does not quit.  So it will 
read the rest of the file, regardless of how big the file is.  It just 
stops adding to lines or char, since char reached 1000 on the first line.
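
The first point can be checked with a small reproduction. The sketch 
below (Python 3) wraps the posted loop in a hypothetical function, 
buggy_count, and feeds it 27-character lines. The line length is an 
assumption based on the stated 100 lines and 2700 characters.

```python
def buggy_count(lines_in, limit=1000):
    """Same logic as the posted code, wrapped in a function for testing."""
    lines, char = 0, 0
    for line in lines_in:
        # This while loop keeps re-counting the SAME line until the limit
        while char < limit:
            for ch in line:
                char += len(ch)
            lines += 1
    return char, lines


sample = ["x" * 26 + "\n"] * 100   # 100 lines of 27 characters each
print(buggy_count(sample))         # (1026, 38)
print(buggy_count(sample[:1]))     # (1026, 38), only the first line matters
```

The result matches the reported output of 1026 and 38 even when the 
input holds a single line, which shows the later lines never 
contribute to the totals.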


The simplest change to your code which might accomplish what you want is 
to put the whole thing inside a function, and return from the function 
when the goal is reached.  So instead of a while loop, you need some 
form of if test.  See if you can run with that.  Remember that return 
can return a tuple (pair of numbers).
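
One way that suggestion might look, as a sketch in Python 3. The 
function name count_lines_until is hypothetical, and it reads the 
assignment as "count whole lines, including the one that crosses the 
limit", which is only one interpretation of the wording.

```python
def count_lines_until(lines_in, limit=1000):
    """Return (chars, lines) once the running total reaches limit."""
    chars = 0
    lines = 0
    for line in lines_in:
        chars += len(line)   # each line is counted exactly once
        lines += 1
        if chars >= limit:   # an if test, not a while loop
            return chars, lines
    # The input ended before the limit was reached
    return chars, lines


with_long_line = ["a" * 99 + "\n"] * 5 + ["b" * 999 + "\n", "c\n"]
print(count_lines_until(with_long_line))  # (1500, 6)
```

Any iterable of lines works, so an open file can be passed directly: 
with open('/tmp/new.txt') as log: print(count_lines_until(log)). 
Returning from inside the loop also means the rest of the file is 
never read.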


There are plenty of other optimizations and approaches, but you'll learn 
best by incrementally fixing what you already have.


DaveA






