from:"Dave Angel"

Re: [Tutor] Expanding a Python script to include a zcat and awk pre-process

2010-01-09 Thread Dave Angel


galaxywatc...@gmail.com wrote:
After many more hours of reading and testing, I am still struggling to 
finish this simple script, even though, bear in mind, I already got my 
desired results by preprocessing with an awk one-liner.


I am opening a zipped file properly, so I did make some progress, but 
simply assigning num1 and num2 to the first 2 columns of the file 
remains elusive. Num3 here gets assigned, not to the 3rd column, but 
the rest of the entire file. I feel like I am missing a simple strip() 
or some other incantation that prevents the entire file from getting 
blobbed into num3. Any help is appreciated in advance.


#!/usr/bin/env python

import string
import re
import zipfile
highflag = flagcount = sum = sumtotal = 0
f = file("test.zip")
z = zipfile.ZipFile(f)
for f in z.namelist():
    ranges = z.read(f)

This reads the whole file into ranges.  In your earlier incantation, you 
looped over the file, one line at a time.  So to do the equivalent, you 
want to do a split here, and one more nesting of loops.
   lines = z.read(f).split("\n")#build a list of text lines
   for ranges in lines:#here, ranges is a single line

and of course, indent the remainder.
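
As a rough sketch of that extra loop (written in Python 3 syntax, with a hypothetical helper name parse_ranges and an in-memory archive standing in for test.zip), splitting the text into lines first and then splitting each line at most twice keeps the third field confined to its own line:

```python
import io
import re
import zipfile

def parse_ranges(text):
    """Return (num1, num2) integer pairs, one per non-blank line."""
    pairs = []
    for line in text.split("\n"):        # one text line at a time
        line = line.strip()
        if not line:
            continue                     # skip blank lines
        # maxsplit=2 leaves the rest of THIS line (not the file) in _rest
        num1, num2, _rest = re.split(r"\W+", line, maxsplit=2)
        pairs.append((int(num1), int(num2)))
    return pairs

# Assumption: a tiny archive built in memory instead of test.zip on disk.
sample = ('134873600, 134873855, "32787 Protex Technologies, Inc."\n'
          '135338240, 135338495, 40597\n')
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("ranges.txt", sample)

with zipfile.ZipFile(buffer) as archive:
    for name in archive.namelist():
        text = archive.read(name).decode("utf-8")
        for num1, num2 in parse_ranges(text):
            print(num2, "-", num1, "=", num2 - num1)
```

This sketch assumes every non-blank line has at least three fields; a shorter line would raise ValueError on the unpacking.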

    ranges = ranges.strip()
    num1, num2, num3 = re.split('\W+', ranges, 2)  ## This line is the root of the problem.

    sum = int(num2) - int(num1)
    if sum > 1000:
        flag1 = " "
        flagcount += 1
    else:
        flag1 = ""
    if sum > highflag:
        highflag = sum
    print str(num2) + " - " + str(num1) + " = " + str(sum) + flag1
    sumtotal = sumtotal + sum

print "Total ranges = ", sumtotal
print "Total ranges over 10 million: ", flagcount
print "Largest range: ", highflag

==
$ zcat test.zip
134873600, 134873855, "32787 Protex Technologies, Inc."
135338240, 135338495, 40597
135338496, 135338751, 40993
201720832, 201721087, "12838 HFF Infrastructure & Operations"
202739456, 202739711, "1623 Beseau Regional de la Region Languedoc 
Roussillon"






___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor

Re: [Tutor] what is the equivalent function to strtok() in c++

2010-01-10 Thread Dave Angel


sudhir prasad wrote:

hi,
what is the equivalent function to strtok() in c++,
what i need to do is to divide a line into different strings and store them
in different lists,and write them in to another file

  
If your tokens are separated by whitespace, you can simply use a single 
call to split().  It will turn a single string into a list of tokens.


line = "Now   is the time"
print line.split()

will display the list:
['Now', 'is', 'the', 'time']
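
Since strtok() takes a set of delimiter characters, here is a small sketch (Python 3 syntax, sample strings invented for illustration) of two ways to get similar behaviour when the separators are not plain whitespace:

```python
import re

# split() with an explicit separator keeps empty fields between separators
comma_fields = "a,b,,c".split(",")

# re.split() accepts a character class, much like strtok's delimiter set;
# the filter drops empty strings left by leading or trailing delimiters
line = "alpha,beta;gamma  delta"
tokens = [tok for tok in re.split(r"[,;\s]+", line) if tok]

print(comma_fields)
print(tokens)
```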

HTH
DaveA



Re: [Tutor] Keeping a list of attributes of a certain type

2010-01-14 Thread Dave Angel

(You top-posted, which puts your two comments out of order.  Now the 
solution comes before the problem statement)


Guilherme P. de Freitas wrote:

Ok, I got something that seems to work for me. Any comments are welcome.


class Member(object):
    def __init__(self):
        pass


class Body(object):
    def __init__(self):
        self.members = []

    def __setattr__(self, obj, value):
        if isinstance(value, Member):
            self.members.append(obj)
            object.__setattr__(self, obj, value)
        else:
            object.__setattr__(self, obj, value)

    def __delattr__(self, obj):
        if isinstance(getattr(self, obj), Member):
            self.members.remove(obj)
            object.__delattr__(self, obj)
        else:
            object.__delattr__(self, obj)



john = Body()
john.arm = Member()
print(john.members)
del john.arm
print(john.members)


On Wed, Jan 13, 2010 at 6:24 PM, Guilherme P. de Freitas
 wrote:
  

Hi everybody,

Here is my problem. I have two classes, 'Body' and 'Member', and some
attributes of 'Body' can be of type 'Member', but some may not. The
precise attributes that 'Body' has depend from instance to instance,
and they can be added or deleted. I need any instance of 'Body' to
keep an up-to-date list of all its attributes that belong to the class
'Member'. How do I do this?

Best,

Guilherme

--
Guilherme P. de Freitas
http://www.gpfreitas.com


If this is a class assignment, you've probably got the desired answer.  
But if it's a real-world problem, there are tradeoffs, and until those 
are known, the simplest solution is usually the best.  Usually, what's 
desired is that the object behaves "as if" it has an up-to-date list of...


If order of the list doesn't matter, I'd consider simply writing a 
single method, called 'members' which does a list comprehension of the 
object's attributes, calculating the list when needed.  Then use a 
decorator to make this method look like a read-only data attribute  
(untested):


class Body (object):
    @property
    def members(self):
        return  [obj for .   if  ]
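
The comprehension above is elided in the archive. One possible way to fill it in, purely as an assumption and not necessarily what was originally intended, is to scan the instance's own attributes with vars() and keep the names whose values are Member instances:

```python
class Member(object):
    pass


class Body(object):
    @property
    def members(self):
        # Assumption: "members" means the names of instance attributes
        # whose current value is a Member; computed fresh on every access.
        return [name for name, value in vars(self).items()
                if isinstance(value, Member)]


john = Body()
john.arm = Member()
john.name = "John"
print(john.members)
del john.arm
print(john.members)
```

Because the list is rebuilt on each access, there is nothing to keep in sync in __setattr__ or __delattr__; the cost is a scan of the attributes every time the property is read.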

DaveA




Re: [Tutor] Searching in a file

2010-01-15 Thread Dave Angel


Paul Melvin wrote:

Hi,

Thanks very much to all your suggestions, I am looking into the suggestions
of Hugo and Alan.

The file is not very big, only 700KB (~2 lines), which I think should be
fine to be loaded into memory?

I have two further questions though please, the lines are like this:


Revenge
(2011)



5 days 


65 minutes 

Etc with a chunk (between each NEW) being about 60 lines, I need to extract
info from these lines, e.g. /browse/post/5354361/ and Revenge (2011) to pass
back to the output, is re the best option to get all these various bits,
maybe a generic function that I pass the search strings too?

And if I use the split suggestion of Alan's I assume the last one would be
the rest of the file, would the next() option just let me search for the
next /browse/post/5354361/ etc after the NEW? (maybe putting this info into
a list)

  
One way to handle "the rest of the file" is to add a marker at the end 
of the data.  So if you read the whole thing with readlines(), you can 
append another "NEW" so that all matches are between one NEW and the next.
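
A minimal sketch of that end-marker idea (Python 3 syntax; the function name and the sample lines are made up for illustration):

```python
def split_into_chunks(lines, marker="NEW"):
    """Return the groups of lines found between one marker and the next."""
    lines = list(lines) + [marker]   # sentinel so the last chunk is closed too
    chunks = []
    current = None                   # None until the first marker is seen
    for line in lines:
        if marker in line:
            if current is not None:
                chunks.append(current)
            current = []             # start collecting a new chunk
        elif current is not None:
            current.append(line)
    return chunks


print(split_into_chunks(["header", "NEW", "a", "b", "NEW", "c"]))
```

Lines before the first marker are ignored here, which is an assumption about what is wanted.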

Thanks again

paul

  
If this file is valid html, or xml, then perhaps you should use one of 
the html or xml parsing tools, rather than anything so esoteric as 
regex.  In any case, it now appears that NEW won't necessarily be 
unique, so you might want to start with  'alt="NEW"'  or something like 
that.  A key question becomes whether this data was automatically 
generated, or whether it might have variations from one sample to the 
next.  (for example,  alt ="NEW"  with different spacing.  or  
ALT="NEW")  And whether it's definitely valid html, or just close.
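
If you do stay with regex, a pattern can at least tolerate the spacing and case variations mentioned above. A small sketch (Python 3 syntax, sample strings invented):

```python
import re

# \s* allows optional whitespace around "=", IGNORECASE accepts ALT or alt.
# Note that IGNORECASE would also accept alt="new", which may or may not
# be what you want.
new_marker = re.compile(r'alt\s*=\s*"NEW"', re.IGNORECASE)

samples = ['<img alt="NEW">', '<img alt ="NEW">', '<img ALT="NEW">',
           '<p>NEW episodes</p>']
matches = [bool(new_marker.search(s)) for s in samples]
print(matches)
```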


DaveA


Re: [Tutor] Replacing the string in a file

2010-01-22 Thread Dave Angel




vanam wrote:

Thanks for your mail.

As you have suggested i have  changed the mode to 'rw' but it is
throwing up an error as below

***
IOError: [Errno 22] invalid mode ('rw') or filename: 'data.txt'
***
I am using python 2.6.4.

But Script is managed to pass with 'a+' mode/r+ mode.

log = open('data.txt','r+/a+')
for x in log:
 x = x.replace('Python','PYTHON')
 print x,
log.close()

It had properly written and replaced Python to PYTHON.

Thanks for your suggestion.

  

  
That won't work.  Better test it some more.  Without some form of 
write() call, you're not changing the file.
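
For a small file, one simple way to actually change it is to read everything, do the replacement in memory, and write the result back. A sketch in Python 3 syntax, using a temporary file rather than your real data.txt (the helper name is hypothetical):

```python
import os
import tempfile

def replace_in_file(path, old, new):
    """Read the whole file, replace in memory, then write it back."""
    with open(path) as handle:
        text = handle.read()
    with open(path, "w") as handle:      # "w" truncates, then we rewrite
        handle.write(text.replace(old, new))

# Demonstration on a throwaway file
demo_path = os.path.join(tempfile.mkdtemp(), "data.txt")
with open(demo_path, "w") as handle:
    handle.write("Python is fun\nI like Python\n")

replace_in_file(demo_path, "Python", "PYTHON")

with open(demo_path) as handle:
    result = handle.read()
print(result)
```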



There are several workable suggestions in this thread, and I think 
fileinput is the easiest one.


DaveA


Re: [Tutor] [File Input Module]Replacing string in a file

2010-01-28 Thread Dave Angel


vanam wrote:

Hi all,

As it was suggested before in the mailing list about the query
regarding replacing string in the file, i have used the module File
input for replacing the string in the file.

For understanding and execution purpose, i have just included Python
as a string in the file and want it to be replaced to PYTHON.

Below are my queries and code: (Correct me if my understanding is wrong???)

1))

import fileinput
x = fileinput.input('data.txt',inplace=0)
for line in x:
 line = line.replace('Python','PYTHON')
 print line,
x.close()

The above piece of code will not create any backup file but  it will
replace PYTHON (Print on the console) but not physically write to the
file.

2)))

import fileinput
x = fileinput.input('data.txt',inplace=1)
for line in x:
    line = line.replace('Python','PYTHON')
    print line,
x.close()

The above piece of code will create backup file but hidden (in the
form of bak file) and it will physically write to the file -- I have
verified the contents of data.txt after the file operation and it had
written successfully.But why it is not printing line i.e. string in
the file on the console.

  
When you use the inplace=true option, it will redirect standard output 
to the file.  So print is going there, and *not* to the console.  I 
don't know whether close() restores the original console or not.
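
To see the redirection at work, here is a small sketch in Python 3 syntax (so print is a function, and the demo file is a temporary one rather than your data.txt):

```python
import fileinput
import os
import tempfile

demo_path = os.path.join(tempfile.mkdtemp(), "data.txt")
with open(demo_path, "w") as handle:
    handle.write("Python rocks\nsecond line\n")

# While the loop runs, print() output lands in the file, not on the console.
with fileinput.input(demo_path, inplace=True) as stream:
    for line in stream:
        print(line.replace("Python", "PYTHON"), end="")

with open(demo_path) as handle:
    rewritten = handle.read()
print(rewritten)
```

Anything you want the user to see during the loop would have to go to sys.stderr instead.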

3)))

import fileinput
x = fileinput.input('data.txt',inplace=1)
for line in x:
    line = line.replace('Python','PYTHON')
x.close()

The above piece of code after execution is wiping out the full
contents. But addition of print line, is exactly replacing the string,
what exactly addition of print is making difference???

  

See above.  Since you print nothing to sys.stdout, the output file is empty.

4)))

import fileinput
x = fileinput.input('data.txt',inplace=1,backup='content.txt')
for line in x:
    line = line.replace('Python','PYTHON')
    print line,
x.close()

The above piece is creating a backup file by name data.txtcontent.txt
(I am not sure whether created file name is correct or not?) and to
the back up file it had added previous content i.e., Python and it had
replaced the contents in data.txt to PYTHON

5)))

Suppose if data.txt has string Python written in Font size 72 and when
i display the string on the console ie. by below piece of code

import fileinput
x = fileinput.input('data.txt',inplace=0)
for line in x:
  print line,
x.close()

It wouldnt print with the same Font size on the console (This wont
prove anything wrong as the same font could be backed with a different
file name)

  
Text files have no concept of fonts or color.  Sometimes there are extra 
annotations in a file (eg. escape sequences) which can be interpreted by 
particular software as commands to change font, or change color, or even 
to reposition.  Examples of this would be html, postscript, rich-text, 
and ANSI escape sequences.


But those escape sequences will only be meaningful to a program that 
understands them.  So if you print html files out to the console, you'll 
see lots of angle brackets and such, rather than seeing the pretty 
display intended to show in a browser.  If you print to a console, it's 
up to that console to process some escape sequences (eg. ANSI) or not.
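
As a tiny illustration (Python 3 syntax), this is what an ANSI colour sequence looks like as plain data; whether the last line shows up in red depends entirely on the console:

```python
RED = "\033[31m"    # ANSI sequence: switch foreground colour to red
RESET = "\033[0m"   # ANSI sequence: back to the default attributes

message = RED + "Python" + RESET
print(repr(message))   # the raw characters that would be stored in a file
print(message)         # a console that understands ANSI shows this in red
```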

Do let me know if my understanding is correct.
  


Re: [Tutor] hash value input

2010-01-30 Thread Dave Angel


spir wrote:

On Fri, 29 Jan 2010 08:23:37 -0800
Emile van Sebille  wrote:

  

So, how does python do this?
 
  

Start here...

http://effbot.org/zone/python-hash.htm



Great, thank you!
From the above pointed page:

For ordinary integers, the hash value is simply the integer itself (unless 
it’s -1).

class int:
    def __hash__(self):
        value = self
        if value == -1:
            value = -2
        return value

I'm surprised of this, for this should create as many indexes (in the underlying array 
actually holding the values) as there are integer keys. With possibly huge holes in the 
array. Actually, there will certainly be a predefined number of indexes N, and the 
integers be further "modulo-ed" N. Or what?
I would love to know how to sensibly chose the number of indexes. Pointers 
welcome (my searches did not bring any clues on the topic).

  

Emile



Denis


la vita e estrany

http://spir.wikidot.com/

  
I haven't seen the sources, so I'm making an educated guess based on 
things I have seen. The index size grows as the size of the dictionary 
grows, and the formula is not linear. Neither are the sizes obvious 
powers of two or suchlike. I doubt if you have any control over it, 
however. The hash() function returns an int (32 bits on 32bit python), 
which is then converted to the bucket number, probably by a simple 
modulo function.


In the case of integers, it's the modulo which distributes the integers 
among the buckets. If all the integer keys are consecutive, then modulo 
distributes them perfectly. If they're random, then it'll usually work 
pretty well, but you could hit a pattern which puts lots of values in 
one bucket, and not many in the others. If the index size is 22, and all 
your numbers are multiple of 22, then it might degenerate to effectively 
one bucket.
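
Here is a toy model of that modulo guess (Python 3 syntax). It is only an illustration of the arithmetic, not a description of how CPython's dict is really implemented:

```python
from collections import Counter

def bucket_counts(keys, n_buckets):
    """Count how many keys land in each bucket under hash(key) % n_buckets."""
    return Counter(hash(key) % n_buckets for key in keys)

# 44 consecutive integers spread evenly over 22 buckets
consecutive = bucket_counts(range(44), 22)

# 44 multiples of 22 all collapse into bucket 0
multiples = bucket_counts(range(0, 22 * 44, 22), 22)

print(sorted(consecutive.items()))
print(sorted(multiples.items()))
```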


BTW, the referenced article does have a contradiction. For a long int 
whose value is between 16 and 31 bits, the described approach will not 
generate the same hash as the int of the same value. So that 15 bit 
shift algorithm must have some other subtlety to it, perhaps only 
starting with bit 31 or so.


DaveA



Re: [Tutor] parse text file

2010-02-02 Thread Dave Angel


Norman Khine wrote:

thanks denis,

On Tue, Feb 2, 2010 at 9:30 AM, spir  wrote:
  

On Mon, 1 Feb 2010 16:30:02 +0100
Norman Khine  wrote:



On Mon, Feb 1, 2010 at 1:19 PM, Kent Johnson  wrote:
  

On Mon, Feb 1, 2010 at 6:29 AM, Norman Khine  wrote:



thanks, what about the whitespace problem?
  

\s* will match any amount of whitespace includin newlines.


thank you, this worked well.

here is the code:

###
import re
file = open('producers_google_map_code.txt', 'r')
data = repr( file.read().decode('utf-8') )

block = re.compile(r"""openInfoWindowHtml\(.*?\\ticon: myIcon\\n""")
b = block.findall(data)
block_list = []
for html in b:
  namespace = {}
  t = re.compile(r"""(.*)<\/strong>""")
  title = t.findall(html)
  for item in title:
      namespace['title'] = item
  u = re.compile(r"""a href=\"\/(.*)\">En savoir plus""")
  url = u.findall(html)
  for item in url:
      namespace['url'] = item
  g = re.compile(r"""GLatLng\((\-?\d+\.\d*)\,\\n\s*(\-?\d+\.\d*)\)""")
  lat = g.findall(html)
  for item in lat:
      namespace['LatLng'] = item
  block_list.append(namespace)

###

can this be made better?
  

The 3 regex patterns are constants: they can be put out of the loop.

You may also rename b to blocks, and find a more accurate name for 
block_list; eg block_records, where record = set of (named) fields.

A short desc and/or example of the overall and partial data formats can greatly 
help later review, since regex patterns alone are hard to decode.



here are the changes:

import re
file = open('producers_google_map_code.txt', 'r')
data = repr( file.read().decode('utf-8') )

get_record = re.compile(r"""openInfoWindowHtml\(.*?\\ticon: myIcon\\n""")
get_title = re.compile(r"""(.*)<\/strong>""")
get_url = re.compile(r"""a href=\"\/(.*)\">En savoir plus""")
get_latlng = re.compile(r"""GLatLng\((\-?\d+\.\d*)\,\\n\s*(\-?\d+\.\d*)\)""")

records = get_record.findall(data)
block_record = []
for record in records:
    namespace = {}
    titles = get_title.findall(record)
    for title in titles:
        namespace['title'] = title
    urls = get_url.findall(record)
    for url in urls:
        namespace['url'] = url
    latlngs = get_latlng.findall(record)
    for latlng in latlngs:
        namespace['latlng'] = latlng
    block_record.append(namespace)

print block_record
  

The def of "namespace" would be clearer imo in a single line:
   namespace = {title:t, url:url, lat:g}



i am not sure how this will fit into the code!

  

This also reveals a kind of name confusion, doesn't it?


Denis




Your variable 'file' is hiding a built-in name for the file type.  No 
harm in this example, but it's a bad habit to get into.


What did you intend to happen if the number of titles, urls, and latlngs 
are not each exactly one?  As you have it now, if there's more than one, 
you spend time adding them all to the dictionary, but only the last one 
survives.  And if there aren't any, you don't make an entry in the 
dictionary.


If that's the exact behavior you want, then you could replace the loop 
with an if statement:   (untested)


if titles:
    namespace['title'] = titles[-1]


On the other hand, if you want a None in your dictionary for missing 
information, then something like:  (untested)


for record in records:
    titles = get_title.findall(record)
    title = titles[-1] if titles else None
    urls = get_url.findall(record)
    url = urls[-1] if urls else None
    latlngs = get_latlng.findall(record)
    latlng = latlngs[-1] if latlngs else None
    block_record.append( {'title':title, 'url':url, 'latlng':latlng} )
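
Since the same "last match or None" step repeats for each field, it could be factored into a helper. The following is only a sketch in Python 3 syntax; the helper name, the simplified patterns and the sample records are all invented for illustration and are not the patterns from the original script:

```python
import re

# Hypothetical, simplified patterns; the real ones depend on the actual data.
get_title = re.compile(r"<strong>(.*?)</strong>")
get_url = re.compile(r'a href="/(.*?)"')

def last_or_none(pattern, text):
    """Return the last match of pattern in text, or None if there is none."""
    matches = pattern.findall(text)
    return matches[-1] if matches else None

# Made-up sample records: one with both fields, one with neither
records = ['<strong>Farm A</strong> <a href="/farm-a">En savoir plus</a>',
           'no markup here']

block_record = [{'title': last_or_none(get_title, record),
                 'url': last_or_none(get_url, record)}
                for record in records]
print(block_record)
```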


DaveA

Re: [Tutor] Question about importing

2010-02-02 Thread Dave Angel


Eike Welk wrote:

On Tuesday February 2 2010 20:28:03 Grigor Kolev wrote:
  

Can I use something like this
#--
import sys
sys.path.append("/home/user/other")
import module
#-




Yes I think so. I just tried something similar:
--


IPython 0.10 -- An enhanced Interactive Python.

<--- snip >

In [1]: import sys

In [2]: 
sys.path.append("/home/eike/codedir/freeode/trunk/freeode_py/freeode/")


<--- snip >
<--- The next line is a special command of IPython: >

In [8]: !ls /home/eike/codedir/freeode/trunk/freeode_py/freeode/
ast.py   pygenerator.pyctest_1_interpreter.pyc   
test_pygenerator.pyc
ast.pyc  simlcompiler.pytest_2_interpreter.py  
test_simlcompiler.py
__init__.py  simlcompiler.pyc   test_2_interpreter.pyc 

<--- snip >



In [9]: import simlcompiler
---
ImportError   Traceback (most recent call last)

/home/eike/ in ()

/home/eike/codedir/freeode/trunk/freeode_py/freeode/simlcompiler.py in 
()

 36 import stat
 37 from subprocess import Popen #, PIPE, STDOUT
---> 38 import pyparsing
 39 import freeode.simlparser as simlparser
 40 import freeode.interpreter as interpreter

ImportError: No module named pyparsing


--
Well... the import fails, but it finds the module and starts to import it. 



HTH,
Eike.



  
I have no idea what freeode looks like, but I have a guess, based on your 
error messages.


I'd guess that you want to append without the freeode directory:


sys.path.append("/home/eike/codedir/freeode/trunk/freeode_py/")

and import with it.  That's because freeode is a package name, not a 
directory name (I can tell because __init__.py is present)

 import freeode.simlcompiler

See if that works any better.
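
The same idea can be tried on a throwaway package. This sketch (Python 3 syntax; the package and module names are invented, and it writes into a temporary directory) puts the parent of the package directory on sys.path and imports through the package name:

```python
import importlib
import os
import sys
import tempfile

# Build  <root>/demo_pkg_for_path/__init__.py  and  .../compiler.py
root = tempfile.mkdtemp()
package_dir = os.path.join(root, "demo_pkg_for_path")
os.mkdir(package_dir)
with open(os.path.join(package_dir, "__init__.py"), "w") as handle:
    handle.write("")
with open(os.path.join(package_dir, "compiler.py"), "w") as handle:
    handle.write("NAME = 'compiler'\n")

# Append the directory that CONTAINS the package, not the package itself
sys.path.append(root)
importlib.invalidate_caches()

import demo_pkg_for_path.compiler

print(demo_pkg_for_path.compiler.__name__)
```

With this layout, imports written inside the package as "import demo_pkg_for_path.something" can also be resolved, which is what the freeode.simlparser lines in the traceback suggest the real code expects.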

DaveA


Re: [Tutor] SYS Long File Names?

2010-02-07 Thread Dave Angel


FT wrote:

Hi!

I was looking at the sys.argv(1) file name and it is the short 8 char
name. How do you place it into the long file name format? I was reading
music files and comparing the name to the directory listing and always comes
back as not found because the name was shortened.

So, how do you get it to read long file names?

Bruce


  
You need square brackets, not parentheses on the sys.argv.  But I'm 
guessing that's a typo in your message.


Some more information, please.  What version of Python, and what OS ?  
And how are you running this script?  If you're typing the script name 
at a DOS box, then the string you're seeing in sys.argv[1] is the one 
you typed on the command line.


If you start it some other way, please tell us how.

DaveA

Re: [Tutor] NameError: global name 'celsius' is not defined (actually, solved)

2010-02-10 Thread Dave Angel


David wrote:
Hello Wesley,


thanks for your reply. I was surprised about the limited information 
too. Sadly (?), I can't reproduce the error any more...


David



On 10/02/10 11:13, wesley chun wrote:
I just wrote this message, but after restarting ipython all worked fine.
How is it to be explained that I first had a namespace error which, after a
restart (and not merely a new "run Sande_celsius-main.py"), went away? I
mean, surely the namespace should not be impacted by ipython at all!?
 :
# file: Sande_celsius-main.py
from Sande_my_module import c_to_f
celsius = float(raw_input("Enter a temperature in Celsius: "))
fahrenheit = c_to_f(celsius)
print "That's ", fahrenheit, " degrees Fahrenheit"

# this is the file Sande_my_module.py
# we're going to use it in another program
def c_to_f(celsius):
    fahrenheit = celsius * 9.0 / 5 + 32
    return fahrenheit

When I run Sande_celsius-main.py, I get the following error:

NameError: global name 'celsius' is not defined
WARNING: Failure executing file:



Python interpreters including the standard one or IPython should tell
you a lot more than that. how are you executing this code? would it be
possible to do so from the command-line? you should get a more verbose
error message that you can post here.

best regards,
-- wesley


Your response to Wesley should have been here, instead of at the top.  
Please don't top-post on this forum.


I don't use iPython, so this is just a guess.  But perhaps the problem 
is that once you've imported the code, then change it, it's trying to 
run the old code instead of the changed code.


Try an experiment in your environment.  Deliberately add an error to 
Sande_celsius-main.py, and run it.  Then correct it, and run it again, 
to see if it notices the fix.


The changes I'd try in this experiment are to first change the name on 
the celsius= line to celsius2= 
and after running and getting the error, change the following line to 
call c_to_f(celsius2).  If it gets an error, notice what symbol it complains 
about.
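
Whether IPython really behaves this way is only my guess, but the underlying mechanism in plain Python can be shown with a sketch (Python 3 syntax; the module name is invented and the file lives in a temporary directory). Once a module has been imported, a second import reuses the cached copy, and only an explicit reload picks up the edited source:

```python
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True   # keep the demo free of cached .pyc files

workdir = tempfile.mkdtemp()
module_path = os.path.join(workdir, "demo_stale_module.py")

def write_module(body):
    with open(module_path, "w") as handle:
        handle.write(body)

write_module("def c_to_f(celsius):\n    return celsius * 9.0 / 5 + 32\n")
sys.path.insert(0, workdir)
importlib.invalidate_caches()

import demo_stale_module
first = demo_stale_module.c_to_f(100)

# Edit the file on disk, then import again: the old code is still used
write_module("def c_to_f(celsius):\n    return 'changed version'\n")
import demo_stale_module
still_old = demo_stale_module.c_to_f(100)

# Only an explicit reload re-reads the source
importlib.reload(demo_stale_module)
after_reload = demo_stale_module.c_to_f(100)

print(first, still_old, after_reload)
```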


HTH
DaveA





Re: [Tutor] A Stuborn Tab Problem in IDLE

2010-02-14 Thread Dave Angel


Wayne Watson wrote:
I got to the dos command line facility and got to the file. I executed the 
program, and it failed with a syntax error. I can't copy it out of the 
window to paste here, but here's the code surrounding the problem 
(arrow ==> points at the problem).
The console code shows [ missing. I SEE the syntax error. It's two 
lines above the line with the arrow. The code now works. Thanks very 
much. Console wins again!


 (I suspect you are not into matplotlib, but the plot requires a list 
for x and y in plot(x,y). xy[0,0] turns out to be a float64, which the 
syntax rejects. I put [] around it, and it works. Is there a better way?


ax1.plot([xy[0,0]],[xy[0,1]],'gs')
if npts == 90: # exactly 90 frames
ax1.plot([xy[npts-1,0]], xy[npts-1,1]],'rs') # mark it is 
a last frame

else:
ax1.plot([xy[npts-1,0]], ==>[xy[npts-1,1]],'ys') # mark 
90th frame in path

last_pt = len(xy[:,0])
ax1.plot([xy[npts-1,0]],[xy[npts-1,1]],'rs')

On 2/14/2010 6:18 PM, Wayne Watson wrote:
Well, command line was easy to get to. It's on the menu for python, 
but it gives me >>>.  How do I get to the folder with the py file?  
Can I switch to a c:\  type operation?


Back to exploring.

On 2/14/2010 5:05 PM, Alan Gauld wrote:


"Wayne Watson"  wrote
When I use F5 to execute a py program in IDLE, Win7, I get a tab 
error on an indented else. 


What happens if you execute from a command line? Do you get the same 
error?

If so look at the lines before.
If not try closing and restarting IDLE

HTH,

Alan G

Once you've discovered the DOS box, you should also discover QuickEdit 
mode.  In the DOS box, right click on the title bar, and choose 
"Properties".  First tab is Options.  Enable Quick-Edit mode, and press 
OK.  Now, you can drag a rectangle on the DOS box, and use right click 
to copy it to the clipboard. Practice a bit and you'll find it easy.  
An essential tool.


DaveA


Re: [Tutor] Getting caller name without the help of "sys._getframe(1).f_code.co_name" ?

2010-02-15 Thread Dave Angel


patrice laporte wrote:

2010/2/14 Luke Paireepinart 

  

I see why you would want the error messages but why is the default error
message not enough, that is why I am curious, and typically introspection on
objects is not necessary (for example, people often want to convert a string
into a variable name to store a value (say they have the string "foobar1"
and they want to store the value "f" in the variable "foobar1", how do they
change foobar1 to reference a string?  well you can just use exec but the
core issue is that there's really no reason to do it in the first place,
they can just use a dictionary and store dict['foobar1'] = 'f'  and it is
functionally equivalent (without the danger in the code)).  I get the
feeling that your issue is the same sort of thing, where an easier solution
exists but for whatever reason you don't see it.  I don't know if this is
true or not.  Here's my take on this:



class x(object):
  

def __init__(self, fname):
self.temp = open(fname).read()




a = x('foobar')
  

Traceback (most recent call last):
  File "", line 1, in# this is the
module it's in
a = x('foobar')#
this is the line where I tried to initialize it
  File "", line 3, in __init__  # which called
this function, which is the one that has the error
self.temp = open(fname).read()  #and the error
occurred while trying to perform this operation
IOError: [Errno 2] No such file or directory: 'foobar'  #and the error was
that the file 'foobar' could not be found.




Hi and thank to everybody...

First of all, I consider my first question is now answered : I wanted to get
rid of that sys._getframe call, I got an explanation (thanks to Kent).

The rest of the discussion is not about Python, it's more about the way of
thinking how to help the user having à good feeling with your app.

I try to clarify my need and share you my anxiety. Of course, a lot of thing
are available with Python, from a coder point of view. But what I want to do
is to think about the user, and give him a way to understand that what he
did was wrong.

Traceback give me all I need, but my opinion is that it's not acceptable to
give it back to the user without a minimum of "décorating". I didn't yet
look at the logging module, and maybe it can help me to make that
décorating.

And the user must be a priority (it's still my conviction here)

My own experience is that there is too much coder that forget the app they
work on is aim to be used by "real human", not by C/C++/Python/put what ever
you want here/ guru : if your app popups to the user a message that is just
what the traceback gave, it's not a good thing : How can it be reasonable to
imagine the user will read that kinda message ? :

*Traceback (most recent call last):
  File "", line 1, in 
a = x('foobar')
  File "", line 3, in __init__
self.temp = open(fname).read()
IOError: [Errno 2] No such file or directory: 'foobar'
*

Of course the origin of his problem is in the message : "*No such file or
directory: 'foobar'*", but a customer will never read that @ù^$#é uggly
message, there is too much extraterrestrial words in it.

Traceback doesn' give more thant that, it doesn't say, as an example : we
(the name of app) was trying to open the file "foobar" in order to do
something with it (put here what it was supposed to do with the file) : app
failed to open it because "foobar" doen't exist.

According to me, traceback is what we need during "coding" phases, but it's
not something to give to the user.

This problem has to be solved by thinking the app in the way I'm trying to
explain  (but not only in that way) : think about the user. This is not
something I expect Python to do for me, I'm just looking for everything
Python can provide me to make me think about the user.

I'm new to Python, and I make a lot of exploration to understand and answer
myself to my question. Python library is huge, and I don't have as enough
time as I wanted to deal with it. But I'm conviced the solutions are here, I
don't try to re-invent the wheel...


Thant to you all.

  
This makes lots of sense.  If the message doesn't make sense to the 
user, there's no point.  But why then is your thread titled "Getting 
caller name" ?  Why does the user care about the caller (function) 
name?  When you started the thread, it seemed clear that your user was a 
programmer, presumably who was adding code to your system and who wanted 
to see context error messages in coding terms.


If you have a duality of users, consider using a "DEBUG" variable, that 
changes the amount of detail you display upon an error.
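
A rough sketch of such a switch (Python 3 syntax; DEBUG, explain_failure and the wording of the message are all hypothetical):

```python
import traceback

DEBUG = False   # hypothetical switch; set to True while developing

def explain_failure(action, error, debug=None):
    """Return a short message for users, plus the traceback when debugging."""
    if debug is None:
        debug = DEBUG
    message = "Sorry, I could not %s (%s)." % (action, error.strerror)
    if debug:
        details = traceback.format_exception(type(error), error,
                                             error.__traceback__)
        message += "\n" + "".join(details)
    return message

try:
    open("/no/such/directory/foobar").read()
except IOError as error:
    friendly = explain_failure("open the file 'foobar'", error)
    detailed = explain_failure("open the file 'foobar'", error, debug=True)

print(friendly)
```

The end user sees only the first sentence, while a programmer running with the flag enabled also gets the full traceback.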


DaveA


Re: [Tutor] Problem with "input" in Python 3

2010-02-15 Thread Dave Angel


Peter Anderson wrote:

Hi!

I am trying to teach myself how to program in Python using Zelle's 
"Python Programming: An Introduction to Computer Science" (a very good 
text). At the same time I have decided to start with Python 3 (3.1.1). 
That means that I have to convert Zelle's example code to Python 3 
(which generally I cope with).


I'm hoping that somebody can help with what's probably a very simple 
problem. There is a quadratic equation example involving multiple user 
inputs from the one "input" statement. The code works fine with Python 
2.5 but when I convert it to Python 3 I get error messages. The code 
looks like:


05 import math
06
07 def main():
08 print("This program finds the real solutions to a quadratic\n")
09
10 a, b, c = input("Please enter the coefficients (a, b, c): ")
11
12 '''
13 a = int(input("Please enter the first coefficient: "))
14 b = int(input("Please enter the second coefficient: "))
15 c = int(input("Please enter the third coefficient: "))
16 '''
17
18 discrim = b * b - 4 * a * c
19 ...

25 main()

Lines 13 to 15 are my Python 3 working solution but line 10 does not 
work in Python 3. When it runs it produces:


Please enter the coefficients (a, b, c): 1,2,3
Traceback (most recent call last):
File "C:\Program Files\Wing IDE 101 
3.2\src\debug\tserver\_sandbox.py", line 25, in 
File "C:\Program Files\Wing IDE 101 
3.2\src\debug\tserver\_sandbox.py", line 10, in main

builtins.ValueError: too many values to unpack
>>>

Clearly the problem lies in the input statement. If I comment out line 
10 and remove the comments at lines 12 and 16 then the program runs 
perfectly. However, I feel this is a clumsy solution.


Could somebody please guide me on the correct use of "input" for 
multiple values.


Regards,
Peter

The input() function in Python3 produces a string, and does not evaluate 
it into integers, or into a tuple, or whatever.  See for yourself by trying


  print ( repr(input("prompt ")) )

on both systems.


You can subvert Python3's improvement by adding an eval to the return value.
  a, b, c = eval(input("Enter exactly three numbers, separated by commas"))

is roughly equivalent to Python 2.x  input expression.  (Python 3's 
input is equivalent to Python 2.x  raw_input)
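
If you would rather not eval whatever the user types, one alternative is to split the string yourself. A minimal sketch, with a hypothetical helper name and assuming the coefficients are plain numbers separated by commas:

```python
def parse_coefficients(text, count=3):
    """Turn a string like '1, 2, 3' into a tuple of floats."""
    parts = [part.strip() for part in text.split(",")]
    if len(parts) != count:
        raise ValueError("expected %d comma-separated numbers, got %d"
                         % (count, len(parts)))
    return tuple(float(part) for part in parts)

# In the real program the string would come from input(...), for example:
#   a, b, c = parse_coefficients(input("Please enter the coefficients (a, b, c): "))
a, b, c = parse_coefficients("1, 2, 3")
print(a, b, c)
```

Unlike eval, this only accepts numbers, so a typo produces a ValueError instead of running arbitrary code.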


DaveA

Re: [Tutor] Input() is not working as expected in Python 3.1

2010-02-15 Thread Dave Angel


Yaraslau Shanhin wrote:

Hello All,

I am working with Python tutorial in wiki and one of the exercises is as
follows:

Ask the user for a string, and then for a number. Print out that string,
that many times. (For example, if the string is hello and the number is 3 you
should print out hellohellohello.)

Solution for this exercise is:

text = str(raw_input("Type in some text: "))
number = int(raw_input("How many times should it be printed? "))print
(text * number)



Since in Python raw_input() function was renamed to input() according
to PEP 3111  I have
respectively updated this code to:


text = str(input("Type in some text: "))
number = int(input("How many times should it be printed? "))print
(text * number)



However when I try to execute this code in Python 3.1 interpreter
error message is generated:


Type in some text: some
How many times should it be printed? 3
Traceback (most recent call last):
  File "test4.py", line 2, in 
number = int(input("How many times should it be printed? "))
ValueError: invalid literal for int() with base 10: 'How many times
should it be printed? 3'


Can you please advise me how to resolve this issue?

  
When I correct for your missing newline, it works for me.  I don't know 
of any version of Python which would copy the prompt string into the 
result value of input or raw_input() function.


Try pasting the exact console session, rather than paraphrasing it.

DaveA


Re: [Tutor] fast sampling with replacement

2010-02-21 Thread Dave Angel




Luke Paireepinart wrote:

Can you explain what your function is doing and also post some test code to
profile it?

On Sat, Feb 20, 2010 at 10:22 AM, Andrew Fithian  wrote:

  

Hi tutor,

I have a statistical bootstrapping script that is bottlenecking on a
python function sample_with_replacement(). I wrote this function myself
because I couldn't find a similar function in python's random library. This
is the fastest version of the function I could come up with (I used
cProfile.run() to time every version I wrote) but it's not fast enough, can
you help me speed it up even more?

import random
def sample_with_replacement(list):
    l = len(list) # the sample needs to be as long as list
    r = xrange(l)
    _random = random.random
    # using list[int(_random()*l)] is faster than random.choice(list)
    return [list[int(_random()*l)] for i in r]

FWIW, my bootstrapping script is spending roughly half of the run time in
sample_with_replacement() much more than any other function or method.
Thanks in advance for any advice you can give me.

-Drew




list and l are poor names for locals.  The former because it's a 
built-in type, and the latter because it looks too much like a 1.


You don't say how big these lists are, but I'll assume they're large 
enough that the extra time spent creating the 'l' and 'r' variables is 
irrelevant.


I suspect you could gain some speed by using random.randrange instead of 
multiplying random.random by the length.


And depending on how the caller is using the data, you might gain some 
by returning a generator expression instead of a list.  Certainly you 
could reduce the memory footprint.


I wonder why you assume the output list has to be the same size as the 
input list.  Since you're sampling with replacement, you're not using 
the whole list anyway.  So I'd have defined the function to take a 
second argument, the length of desired array.  And if you could accept a 
generator instead of a list, you don't care how long it is, so let it be 
infinite.


(untested)
import random

def sample(mylist):
    mylistlen = len(mylist)
    randrange = random.randrange
    while True:
        yield mylist[randrange(0, mylistlen)]
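
Since the generator never ends, the caller decides how many items to take. Here is one possible way to do that with itertools.islice; the take() helper and the sample data are my own additions for illustration, and the code is written for Python 3.

```python
import itertools
import random


def sample(mylist):
    """Yield items chosen at random from mylist, with replacement, forever."""
    mylistlen = len(mylist)
    randrange = random.randrange
    while True:
        yield mylist[randrange(mylistlen)]


def take(n, iterable):
    """Return the first n items of iterable as a list."""
    return list(itertools.islice(iterable, n))


data = [10, 20, 30, 40]
resample = take(len(data), sample(data))   # same length as the input
bigger = take(10, sample(data))            # or any other length
print(resample, bigger)
```

Whether randrange actually beats the random()*l trick is something to measure with cProfile on your own data rather than assume.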

DaveA


Re: [Tutor] Verifying My Troublesome Linkage Claim between Python and Win7

2010-02-23 Thread Dave Angel

Wayne Watson wrote:
A few days ago I posted a message titled ""Two" Card Monty. The 
problem I mentioned looks legitimate, and remains puzzling. I've 
probed this in a newsgroup, and no one has an explanation that fits.

My claim is that if one creates a program in a folder that reads a 
file in the folder it and then copies it to another folder, it will 
read  the data file in the first folder, and not a changed file in the 
new folder. I'd appreciate it if some w7 users could try the program 
below, and let me know what they find.  I'm using IDLE in Win7 with Py 
2.5.

My experience is that if one checks the properties of the copied file, 
it will point to the original py file and execute it and not the copy. 
If win7 is the culprit, I would think this is a somewhat  serious 
problem. It may be the sample program is not representative of the 
larger program that has me stuck. If necessary I can provide it. It 
uses common modules. (Could this be something like the namespace usage 
of variables that share a common value?)

# Test program. Examine strange link in Python under Win7
# when copying py file to another folder.
# Call the program vefifywin7.py
# To verify my situation use IDLE, save and run this program there.
# Put this program into a folder along with a data file
# called verify.txt. Create a single text line with a few characters in it

# Run this program and note the output
# Copy the program and txt file to another folder
# Change the contents of the txt file
# Run it again, and see if the output is the same as in the other folder
track_file = open("verify.txt")
aline = track_file.readline();
print aline
track_file.close()

I find your English is very confusing.  Instead of using so many 
pronouns with confusing antecedents, try being explicit.

>My claim is that if one creates a program in a folder that reads a 
file in the folder

Why not say that you created a program and a data file in the same 
folder, and had the program read the data file?

>...in the folder it and then copies it to another folder

That first 'it' makes no sense, and the second 'it' probably is meant to 
be "them".  And who is it that does this copying?  And using what method?

> ... it will read  the data file in the first folder

Who will read the data file?  The first program, the second, or maybe 
the operator?

About now, I have to give up.  I'm guessing that the last four lines of 
your message were intended to be the entire program, and that that same 
program is stored in two different folders, along with data files having 
the same name but different first lines.  When you run one of these 
programs it prints the wrong version of the line.

You have lots of variables here, Python version, program contents, Idle, 
Windows version.  Windows 7 doesn't do any mysterious "linking," so I'd 
stop making that your working hypothesis.  Your problem is most likely 
the value of current directory ( os.getcwd() ).  And that's set 
according to at least three different rules, depending on what program 
launches Python.  If you insist on using Idle to launch it, then you'll 
have to convince someone who uses Idle to tell you its quirks.   Most 
likely it has a separate menu for the starting directory than for the 
script name & location.  But if you're willing to use the command line, 
then I could probably help, once you get a clear statement of the 
problem.  By default, CMD.EXE uses the current directory as part of its 
prompt, and that's the current directory Python will start in.

But the first things to do are probably to print out the value of  
os.getcwd(), and to add a slightly different print in each version of 
the program so you know which one is running.

Incidentally, I'd avoid ever opening a data file in "the current 
directory."  If I felt it important to use the current directory as an 
implied parameter to the program, I'd save it in a string, and build the 
full path to the desired file using  os.path.join() or equivalent.
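
As a rough sketch of both suggestions (Python 3 syntax): print where Python thinks it is, and build the data file's path explicitly. The data_path() helper is hypothetical, and defaulting to the script's own folder via sys.argv[0] is my assumption about what you would want here.

```python
import os
import sys


def data_path(filename, base_dir=None):
    """Return filename joined onto base_dir (default: the script's folder)."""
    if base_dir is None:
        # Assumption: the data file lives next to the script being run.
        base_dir = os.path.dirname(os.path.abspath(sys.argv[0]))
    return os.path.join(base_dir, filename)


start_dir = os.getcwd()   # save the current directory in a string
print("current directory:", start_dir)
print("script being run: ", os.path.abspath(sys.argv[0]))
print("relative to cwd:  ", data_path("verify.txt", start_dir))
print("next to script:   ", data_path("verify.txt"))
```

If the first two lines of output name different folders, that alone would explain reading the "wrong" verify.txt.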

DaveA


Re: [Tutor] Strange list behaviour in classes

2010-02-25 Thread Dave Angel


James Reynolds wrote:

Thank you! I think I have working in the right direction. I have one more
question related to this module.

I had to move everything to a single module, but what I would like to do is
have this class in a file by itself so I can call this from other modules.
when it was in separate modules it ran with all 0's in the output.

Here is the code in one module:

#import Statistics

class Statistics:
def __init__(self, *value_list):
self.value = value_list
self.square_list= []
 def mean(self, *value_list):
try :
ave = sum(self.value) / len(self.value)
except ZeroDivisionError:
ave = 0
return ave

def median(self, *value_list):
if len(self.value) <= 2:
n = self.mean(self.value)
elif len(self.value) % 2 == 1:
m = (len(self.value) - 1)/2
n = self.value[m+1]
else:
m = len(self.value) / 2
m = int(m)
n = (self.value[m-1] + self.value[m]) / 2
return n
 def variance(self, *value_list):
average = self.mean(*self.value)
for n in range(len(self.value)):
square = (self.value[n] - average)**2
self.square_list.append(square)
try:
var = sum(self.square_list) / len(self.square_list)
except ZeroDivisionError:
var = 0
return var

def stdev(self, *value_list):
var = self.variance(*self.value)
sdev = var**(1/2)
return sdev
 def zscore(self, x, *value_list):
average = self.mean(self.value)
sdev = self.stdev(self.value)
try:
z = (x - average) / sdev
except ZeroDivisionError:
z = 0
return z



a = [1,2,3,4,5,6,7,8,9,10]
stats = Statistics(*a)
mean = stats.mean(*a)
median = stats.median(*a)
var = stats.variance(*a)
stdev = stats.stdev(*a)
z = stats.zscore(5, *a)
print(mean, median, var, stdev, z)
print()



On Wed, Feb 24, 2010 at 7:33 PM, Alan Gauld wrote:

  

"James Reynolds"  wrote

 I understand, but if self.value is any number other then 0, then the "for"


will append to the square list, in which case square_list will always have
some len greater than 0 when "value" is greater than 0?

  

And if value does equal zero?

Actually I'm confused by value because you treat it as both an
integer and a collection in different places?


 Is this an occasion which is best suited for a try:, except statement? Or


should it, in general, but checked with "if's". Which is more expensive?

  

try/except is the Python way :-)


 def variance(self, *value_list):


  if self.value == 0:
   var = 0
  else:
average = self.mean(*self.value)
for n in range(len(self.value)):
 square = (self.value[n] - average)**2
 self.square_list.append(square)
   var = sum(self.square_list) / len(self.square_list)
   return var

  

--
Alan Gauld
Author of the Learn to Program web site
http://www.alan-g.me.uk/



The indentation in your code is lost when I look for it --  everything's 
butted up against the left margin except for a single space before def 
variance.  This makes it very hard to follow, so I've ignored the thread 
till now.  This may be caused by the mail digest logic, or it may be 
because you're posting online, and don't tell it to leave the code 
portion unformatted.  But either way, you should find a way to leave the 
code indented as Python would see it.  If you're posting by mail, be 
sure and send it as text.


But a few things I notice in your code:   You keep using the * notation 
on your formal parameters.  That's what turns a list into a tuple.  And 
you pass those lists into methods  (like median()) which already have 
access to the data in the object, which is very confusing.  If the 
caller actually passes something different there, he's going to be 
misled, since the argument is ignored.


Also, in method variance() you append to the self.square_list.  So if it 
gets called more than once, the list will continue to grow.  Since 
square_list is only referenced within the one method, why not just 
define it there, and remove it as an instance attribute?


If I were you, I'd remove the asterisk from both the __init__() method 
parameter, and from the caller in top-level code.  You're building a 
list, and passing it.  Why mess with turning it into multiple arguments, 
and then back to a tuple?   Then I'd remove the spurious arguments to 
mean(), variance(), stdev() and zscore().  There are a few other things, 
but this should make it cleaner.
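
To make those suggestions concrete, here is roughly what the class might look like afterwards. Treat it as a sketch, not your final code. Beyond what I described, I've made a few assumptions of my own: the attribute is renamed to values, median() sorts the data before picking the middle, and empty data returns 0 to match your ZeroDivisionError handling.

```python
import math


class Statistics:
    def __init__(self, values):
        # Take the list itself; no asterisk, so no list -> args -> tuple trip.
        self.values = list(values)

    def mean(self):
        if not self.values:
            return 0
        return sum(self.values) / len(self.values)

    def median(self):
        ordered = sorted(self.values)     # assumption: data may be unsorted
        count = len(ordered)
        if count == 0:
            return 0
        middle = count // 2
        if count % 2 == 1:
            return ordered[middle]
        return (ordered[middle - 1] + ordered[middle]) / 2

    def variance(self):
        if not self.values:
            return 0
        average = self.mean()
        # Local list, rebuilt on every call, so repeated calls can't grow it.
        squares = [(value - average) ** 2 for value in self.values]
        return sum(squares) / len(squares)

    def stdev(self):
        return math.sqrt(self.variance())

    def zscore(self, x):
        sdev = self.stdev()
        if sdev == 0:
            return 0
        return (x - self.mean()) / sdev


a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
stats = Statistics(a)
print(stats.mean(), stats.median(), stats.variance(), stats.stdev(), stats.zscore(5))
```

With the class in its own file, say statistics_tools.py (a made-up name), another module would use "from statistics_tools import Statistics" and then call it the same way.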


DaveA


Re: [Tutor] Verifying My Troublesome Linkage Claim between Python and Win7

2010-02-27 Thread Dave Angel




Wayne Watson wrote:
Ok, I'm back after a three day trip. You are correct about the use of 
pronouns and a few misplaced words. I should have reread what I wrote. 
I had described this in better detail elsewhere, and followed that 
description with the request here probably thinking back to it.  I 
think I was getting a bit weary of trying to find an answer. Try this.



Folder1
   track1.py
  data1.txt
  data2.txt
  data3.txt

Folder2
   track1.py
   dset1.txt
   dset2.txt
   ...
   dset8.txt

So how do you know this is the structure?  If there really are shortcuts 
or symbolic links, why aren't you showing them?   Did you do a DIR from 
the command line, to see what's there?  Or are you looking in Explorer, 
which doesn't even show file extensions by default, and just guessing 
what's where ?


data and dset files have the same record formats. track1.py was copied 
into  Folder2 with ctrl-c + ctrl-v. 


Those keys don't work from a command prompt.  From there, you'd use COPY 
or something similar.  So I have to guess you were in an Explorer 
window, pointing to Folder 1, and you selected the python file, and 
pressed Ctrl-C.  Then you navigated to Folder 2, and pressed Ctrl-V.  If 
you did,  Windows 7 wouldn't have created any kind of special file, any 
more than earlier ones did.  Chances are you actually did something 
else.  For example, you might have used a right-click drag, and answered 
"create shortcut" when it asked what you wanted to do.  Or perhaps you 
did drag/drop with some ctrl-or alt-key modifier.


Anyway, you need to be more explicit about what you did.  If you had 
used a command prompt, you could at least have pasted the things you 
tried directly to your message, so we wouldn't have so much guessing to do.
When I run track1.py from folder1,  it clearly has examined the 
data.txt  files. 
And how are you running track1.py ?  And how do you really know that's 
what ran?  The code you posted would display a string, then the window 
would immediately go away, so you couldn't read it anyway.
If I run the copy of track1.py in folder2, it clearly operates on 
folder1 (one) data.txt files. This should not be.


If I look at  the  properties of track1.py in folder2  (two), it  is  
pointing back to the program in folder1 (one).
Exactly what do you mean by "pointing back" ?  If you're getting icons 
in your Explorer view, is there a little arrow in the corner?  When you 
did the properties, did you see a tab labeled "shortcut" ?



I do not believe I've experienced this sort of linkage in any WinOS 
before. I believed I confirmed that the same behavior occurs using cmd 
prompt.


Shortcuts have been in Windows for at least 20 years.  But you still 
haven't given enough clues about what you're doing.

I'll now  head for Alan's reply.


Re: [Tutor] Verifying My Troublesome Linkage Claim between Python and Win7

2010-02-28 Thread Dave Angel




Wayne Watson wrote:



You tell us to "try this" and give a folder structure:

Folder1
 track1.py
 data1.txt
 data2.txt
 data3.txt
Folder2
 track1.py
 dset1.txt
 dset2.txt
 ...
 dset8.txt



Maybe one simple test at a time will get better responses.  Since you 
wouldn't tell me what tabs you saw in Explorer when you did properties, 
maybe you'll tell me what you see in CMD.


Go to a cmd prompt (DOS prompt), change to the Folder2 directory, and 
type dir.   paste that result, all of it, into a message.  I suspect 
you'll see that you don't have track1.py there at all, but track1.py.lnk


If so, that's a shortcut.  The only relevant change in Win7 that I know 
of is that Explorer shows shortcuts as "link" rather than "shortcut."
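
If you'd rather check from Python than from dir, something along these lines should show the real file names, extensions included. This is only a sketch in Python 3 syntax; both helper names are made up, and it assumes a shortcut shows up with a .lnk extension as described above.

```python
import os


def real_names(folder="."):
    """Return the actual file names in folder, sorted, extensions included."""
    return sorted(os.listdir(folder))


def find_shortcuts(names):
    """Pick out the names that look like Windows shortcuts (.lnk files)."""
    return [name for name in names if name.lower().endswith(".lnk")]


names = real_names(".")          # run this from inside Folder2
print(names)
print("shortcuts:", find_shortcuts(names))
```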




Re: [Tutor] Verifying My Troublesome ...+Properties

2010-02-28 Thread Dave Angel


Wayne Watson wrote:
(I sent the msg below to Steven and the list a moment ago, since msgs 
going to the list with attachments either don't post or take lots of 
time to post, I'm sending both of you this copy.)


Steven, attached are three jpg files showing the properties of the two 
py files. The two files are identical in name, ReportingToolA.py,  and 
content, but are in folders .../Events2008_NovWW and .../events. Two 
of the jpg files are for the General Tab and Shortcut Tab of the py 
file in ../events. The other jpg is for the file in 
.../Events2008_NovWW, which has no shortcut tab. In previous 
descriptions, this is like:


Folder1 is Events2008_NovWW
Folder2 is events

I developed RT.py (ReportingToolA.py) in the .../Events2008_NovWW 
folder and copied it to .../events. The shortcut shows the events 
folder RT.py file is really in Events2008_NovWW


I have no idea why the RT.py file shows a shortcut. I just took a file 
called junk.jpg, and right-clicked on it. I selected Shortcut from the 
list, and it produced a file junk.jpg-shortcut. It is quite obvious 
the file name is different. If I select Copy instead, and paste the 
file into a folder called  Junk, there is no shortcut created. A drag 
and drop results in a move,and not a copy, so that's out of the picture.


I have no idea how the RT.py file ever got to be a shortcut.
As I said many messages ago, if your Properties dialog has a tab called 
Shortcut, then this is a shortcut file, not a python file.  I still 
don't know how you created it, but that's your "anomaly," not Windows 7, 
and certainly not Python.  Further, the name isn't  RT.py, since 
shortcuts have other extensions (such as .lnk) that Explorer hides from 
you, in its infinite "helpfulness."  It does give you several clues, 
however, such as the little arrow in the icon.  You can see that without 
even opening the properties window, but it's repeated in that window as 
well.


And Explorer is just a tool.  The command prompt should be your home 
base as a programmer.  When something goes wrong running a program in 
any of the other ways, always check it at the command prompt, because 
every other tool has quirks it introduces into the equation.


My best guess on how you created that shortcut was by using Alt-Drag.  
As you point out, drag does a move by default, if it's on the same 
drive.  Ctrl-Drag will force a copy, even on the same drive.  And 
Shift-Drag will force a move, even if it's on a different drive.


These rules didn't change between XP and Windows 7, as far as I know, 
although in some places Explorer calls it "Link" instead of 
"Shortcut".   But that's just a self inconsistency.


DaveA


Re: [Tutor] List comprehension possible with condition statements?

2010-03-03 Thread Dave Angel


Jojo Mwebaze wrote:

Hi There,

i would like to implement the following in lists

assuming

x = 3
y = 4
z = None

i want to create a dynamic list such that

mylist = [ x , y, z ] ,   if z is not None

if z is None then

mylist = [x,y]

Anyhelp!

cheers

Jojo

  


Are there any constraints on x and y ?  If you want to throw out all 
None values, then it's a ready-made problem for a list comprehension.  
You try it, and if it doesn't quite work, post the code. We'll try to help.


But if only the third value is special, then there's little point in 
making a comprehension of one value.  Just conditionally append the z 
value to the list containing x and y.
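
Both readings of the question can be sketched in a few lines. The function names are mine, invented for illustration; which one fits depends on whether x and y are allowed to be None.

```python
def build_list(x, y, z=None):
    """Always keep x and y; add z only when it is not None."""
    mylist = [x, y]
    if z is not None:
        mylist.append(z)
    return mylist


def drop_nones(*values):
    """Alternative reading: throw out every None, wherever it appears."""
    return [value for value in values if value is not None]


print(build_list(3, 4, None))
print(build_list(3, 4, 5))
print(drop_nones(3, None, 4))
```

Note the test is "is not None" rather than plain "if z", so a z of 0 still gets appended.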


DaveA


Re: [Tutor] Bowing out

2010-03-03 Thread Dave Angel


Kent Johnson wrote:

Hi all,

After six years of tutor posts my interest and energy have waned and
I'm ready to move on to something new. I'm planning to stop reading
and contributing to the list. I have handed over list moderation
duties to Alan Gauld and Wesley Chun.

Thanks to everyone who contributes questions and answers. I learned a
lot from my participation here.

So long and keep coding!
Kent

  
I'm sorry to see you go as well.  I've learned an awful lot from your 
posts over the couple of years I've been here.


Thanks for all the efforts.
DaveA


Re: [Tutor] sorting algorithm

2010-03-03 Thread Dave Angel


C.T. Matsumoto wrote:

Hello,

This is follow up on a question I had about algorithms. In the thread 
it was suggested I make my own sorting algorithm.


Here are my results.

#!/usr/bin/python

def sort_(list_):
   for item1 in list_:
   pos1 = list_.index(item1)
   pos2 = pos1 + 1
   try:
   item2 = list_[pos2]
   except IndexError:
   pass

   if item1 >= item2:
   try:
   list_.pop(pos2)
   list_.insert(pos1, item2)
   return True
   except IndexError:
   pass

def mysorter(list_):
   while sort_(list_) is True:
   sort_(list_)

I found this to be a great exercise. In doing the exercise, I got 
pretty stuck. I consulted another programmer (my dad) who described 
how to go about sorting. As it turned out the description he described 
was the Bubble sort algorithm. Since coding the solution I know the 
Bubble sort is inefficient because of repeated iterations over the 
entire list. This shed light on the quick sort algorithm which I'd 
like to have a go at.


Something I haven't tried is sticking in really large lists. I was 
told that with really large list you break down the input list into 
smaller lists. Sort each list, then go back and use the same swapping 
procedure for each of the different lists. My question is, at what 
point do you start breaking things up? Is that based on list elements 
or is it based on memory(?) resources python is using?


One thing I'm not pleased about is the while loop and I'd like to 
replace it with a for loop.


Thanks,

T


There are lots of references on the web about Quicksort, including a 
video at:

http://www.youtube.com/watch?v=y_G9BkAm6B8

which I think illustrates it pretty well.  It would be a great learning 
exercise to implement Python code directly from that description, 
without using the sample C++ code available.


(Incidentally, there are lots of variants of Quicksort, so I'm not going 
to quibble about whether this is the "right" one to be called that.)


I don't know what your earlier thread was, since you don't mention the 
subject line, but there are a number of possible reasons you might not 
have wanted to use the built-in sort.  The best one is for educational 
purposes.  I've done my own sort for various reasons in the past, even 
though I had a library function, since the library function had some 
limits.  One time I recall, the situation was that the library sort was 
limited to 64k of total data, and I had to work with much larger arrays 
(this was in 16bit C++, in "large" model).  I solved the size problem by 
using the  C++ sort library on 16k subsets (because a pointer was 2*2 
bytes).  Then I merged the results of the sorts.  At the time, and in 
the circumstances involved, there were seldom more than a dozen or so 
sublists to merge, so this approach worked well enough.


Generally, it's better for both your development time and the efficiency 
and reliability of the end code, to base a new sort mechanism on the 
existing one.  In my case above, I was replacing what amounted to an 
insertion sort, and achieved a 50x improvement for a real customer.  It 
was fast enough that other factors completely dominated his running time.


But for learning purposes?  Great plan.  So now I'll respond to your 
other questions, and comment on your present algorithm.


It would be useful to understand algorithmic complexity, the so-called 
Order Function.  In a bubble sort, if you double the size of the 
array, you quadruple the number of comparisons and swaps.  It's order 
N-squared or O(n*n).   So what works well for an array of size 10 might 
take a very long time for an array of size 10,000 (like a million times 
as long).  You can do much better by sorting smaller lists, and then 
combining them together.  Such an algorithm can  be O(n*log(n)).



You ask at what point you consider sublists?  In a language like C, the 
answer is when the list is size 3 or more.  For anything larger than 2, 
you divide into sublists, and work on them.
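
To show the "divide into sublists, then combine" idea in code, here is a small merge sort. Note this is merge sort, not the Quicksort from the video, and it is only a sketch in Python 3 syntax with names of my own choosing. Lists of size 2 or less are handled directly; anything of size 3 or more is split.

```python
def merge(left, right):
    """Combine two already-sorted lists into one sorted list."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    # One of the two lists is used up; the rest of the other is already sorted.
    result.extend(left[i:])
    result.extend(right[j:])
    return result


def merge_sort(items):
    """Return a new sorted list, splitting any list of 3 or more items."""
    if len(items) <= 2:
        # Small enough to handle directly, with at most one comparison.
        if len(items) == 2 and items[0] > items[1]:
            return [items[1], items[0]]
        return list(items)
    middle = len(items) // 2
    return merge(merge_sort(items[:middle]), merge_sort(items[middle:]))


print(merge_sort([5, 2, 9, 1, 5, 6, 0]))
```

Each level of splitting does about n comparisons in merge(), and there are about log(n) levels, which is where the O(n*log(n)) comes from.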


Now, if I may comment on your code.  You're modifying a list while 
you're iterating through it in a for loop.  In the most general case, 
that's undefined.  I think it's safe in this case, but I would avoid it 
anyway, by just using xrange(len(list_)-1) to iterate through it.  You 
use the index function to find something you would already know -- the 
index function is slow.  And the first try/except isn't needed if you 
use a -1 in the xrange argument, as I do above.


You use pop() and insert() to exchange two adjacent items in the list.  
Both operations shift the remainder of the list, so they're rather slow.  
Since you're exchanging two items in the list, you can simply do that:

list_[pos1], list_[pos2] = list_[pos2], list_[pos1]

That also eliminates the need for the second try/except.
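
Putting those comments together, one pass of the bubble sort might look like the sketch below. It is written for Python 3, so it uses range() where Python 2 would use xrange(). The name bubble_pass is mine, and I compare with > rather than >= because, as far as I can tell, swapping equal items would keep reporting a change and the outer loop would never finish on a list with duplicates.

```python
def bubble_pass(list_):
    """Make one pass over list_, swapping out-of-order neighbours in place.

    Returns True if anything was swapped, False if the list is already sorted.
    """
    swapped = False
    for pos1 in range(len(list_) - 1):   # the -1 keeps pos2 inside the list
        pos2 = pos1 + 1
        if list_[pos1] > list_[pos2]:
            list_[pos1], list_[pos2] = list_[pos2], list_[pos1]
            swapped = True
    return swapped


def mysorter(list_):
    """Sort list_ in place by repeating passes until nothing moves."""
    while bubble_pass(list_):
        pass


data = [5, 2, 9, 1, 5, 6]
mysorter(data)
print(data)
```

Unlike the original, each call finishes a whole pass instead of returning after the first swap, so far fewer calls are needed.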

You mention being bothered by the while loop.  You could replace it with 
a simple for loop with xrange(len(l

Re: [Tutor] Encoding

2010-03-03 Thread Dave Angel


Giorgio wrote:


 Depends on your python version. If you use python 2.x, you have to use a
  

u before the string:

s = u'Hallo World'




Ok. So, let's go back to my first question:

s = u'Hallo World' is unicode in python 2.x -> ok
s = 'Hallo World' how is encoded?

  

Since it's a quote literal in your source code, it's encoded by your 
text editor when it saves the file, and you tell Python which encoding 
it was by the second line of your source file, right after the shebang line.


A sequence of bytes in an html file should have its encoding 
identified by the tag at the top of the html file.  And I'd  *guess* 
that on a form result, the encoding can be assumed to match that of the 
html of the form itself.
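
A small sketch of why the declaration matters: the same text turns into different bytes depending on the encoding chosen. The sample string with an umlaut is my own example, and it is written with an escape so the source file stays plain ASCII. The first line is the usual form of the encoding comment (it must be the first or second line of a real source file). The code runs on Python 3.3 or later, where the u prefix is accepted but optional.

```python
# -*- coding: utf-8 -*-
text = u'Hallo W\u00f6rld'              # unicode text, 11 characters

utf8_bytes = text.encode('utf-8')       # the o-umlaut becomes two bytes
latin1_bytes = text.encode('latin-1')   # the o-umlaut becomes one byte

print(len(text), len(utf8_bytes), len(latin1_bytes))
print(utf8_bytes != latin1_bytes)

# Decoding only gives back the original text when the right encoding is named.
round_trip = utf8_bytes.decode('utf-8')
print(round_trip == text)
```

For pure ASCII text such as 'Hallo World' the two encodings produce identical bytes, which is why the problem tends to stay hidden until an accented character shows up.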


DaveA


Re: [Tutor] sorting algorithm

2010-03-03 Thread Dave Angel

(You forgot to do a Reply-All, so your message went to just me, rather 
than to me and the list )



C.T. Matsumoto wrote:

Dave Angel wrote:

C.T. Matsumoto wrote:

Hello,

This is follow up on a question I had about algorithms. In the 
thread it was suggested I make my own sorting algorithm.


Here are my results.

#!/usr/bin/python

def sort_(list_):
   for item1 in list_:
   pos1 = list_.index(item1)
   pos2 = pos1 + 1
   try:
   item2 = list_[pos2]
   except IndexError:
   pass

   if item1 >= item2:
   try:
   list_.pop(pos2)
   list_.insert(pos1, item2)
   return True
   except IndexError:
   pass

def mysorter(list_):
   while sort_(list_) is True:
   sort_(list_)

I found this to be a great exercise. In doing the exercise, I got 
pretty stuck. I consulted another programmer (my dad) who described 
how to go about sorting. As it turned out the description he 
described was the Bubble sort algorithm. Since coding the solution I 
know the Bubble sort is inefficient because of repeated iterations 
over the entire list. This shed light on the quick sort algorithm 
which I'd like to have a go at.


Something I haven't tried is sticking in really large lists. I was 
told that with really large list you break down the input list into 
smaller lists. Sort each list, then go back and use the same 
swapping procedure for each of the different lists. My question is, 
at what point to you start breaking things up? Is that based on list 
elements or is it based on memory(?) resources python is using?


One thing I'm not pleased about is the while loop and I'd like to 
replace it with a for loop.


Thanks,

T


There are lots of references on the web about Quicksort, including a 
video at:

http://www.youtube.com/watch?v=y_G9BkAm6B8

which I think illustrates it pretty well.  It would be a great 
learning exercise to implement Python code directly from that 
description, without using the sample C++ code available.


(Incidentally, there are lots of variants of Quicksort, so I'm not 
going to quibble about whether this is the "right" one to be called 
that.)


I don't know what your earlier thread was, since you don't mention 
the subject line, but there are a number of possible reasons you 
might not have wanted to use the built-in sort.  The best one is for 
educational purposes.  I've done my own sort for various reasons in 
the past, even though I had a library function, since the library 
function had some limits.  One time I recall, the situation was that 
the library sort was limited to 64k of total data, and I had to work 
with much larger arrays (this was in 16bit C++, in "large" model).  I 
solved the size problem by using the  C++ sort library on 16k subsets 
(because a pointer was 2*2 bytes).  Then I merged the results of the 
sorts.  At the time, and in the circumstances involved, there were 
seldom more than a dozen or so sublists to merge, so this approach 
worked well enough.


Generally, it's better for both your development time and the 
efficiency and reliability of the end code, to base a new sort 
mechanism on the existing one.  In my case above, I was replacing 
what amounted to an insertion sort, and achieved a 50x improvement 
for a real customer.  It was fast enough that other factors 
completely dominated his running time.


But for learning purposes?  Great plan.  So now I'll respond to your 
other questions, and comment on your present algorithm.


It would be useful to understand algorithmic complexity, the so-called 
order function (big-O notation).  In a bubble sort, if you double the 
size of the array, you quadruple the number of comparisons and swaps.  
It's order N-squared or O(n*n).   So what works well for an array of 
size 10 might take a very long time for an array of size 10000 (like a 
million times as long).  You can do much better by sorting smaller 
lists, and then combining them together.  Such an algorithm can be 
O(n*log(n)).
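
To see the quadrupling for yourself, here is a small counting experiment. It assumes a plain bubble sort with a shrinking end point and a worst-case (reversed) input; bubble_comparisons is just an illustrative name.

```python
def bubble_comparisons(n):
    """Count the comparisons a plain bubble sort makes on n reversed items."""
    items = list(range(n, 0, -1))          # worst case: reverse order
    count = 0
    for end in range(len(items) - 1, 0, -1):
        for i in range(end):
            count += 1
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
    return count

small = bubble_comparisons(100)
large = bubble_comparisons(200)
ratio = large / float(small)               # close to 4 when n doubles
```

The count is n*(n-1)/2, so doubling n multiplies it by very nearly four.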



You ask at what point you consider sublists?  In a language like C, 
the answer is when the list is size 3 or more.  For anything larger 
than 2, you divide into sublists, and work on them.


Now, if I may comment on your code.  You're modifying a list while 
you're iterating through it in a for loop.  In the most general case, 
that's undefined.  I think it's safe in this case, but I would avoid 
it anyway, by just using xrange(len(list_)-1) to iterate through it.  
You use the index function to find something you would already know 
-- the index function is slow.  And the first try/except isn't needed 
if you use a -1 in the xrange argument, as I do above.
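
A sketch of what the index-based loop could look like. This is one possible rewrite under those suggestions, not the poster's code: it uses range() so it also runs on Python 3 (xrange in the Python 2 of this thread), and it swaps adjacent items with tuple assignment.

```python
def bubble_sort(list_):
    """Sort list_ in place by swapping adjacent out-of-order items."""
    end = len(list_) - 1
    swapped = True
    while swapped and end > 0:
        swapped = False
        for i in range(end):               # indexes only, no list_.index()
            if list_[i] > list_[i + 1]:
                list_[i], list_[i + 1] = list_[i + 1], list_[i]
                swapped = True
        end -= 1                           # the largest item is now in place
    return list_

result = bubble_sort([5, 2, 9, 1, 5, 6])
```

Because the loop stops at len(list_)-1, list_[i + 1] always exists and no try/except is needed.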


You use pop() and insert() to exchange two adjacent items in the list.  
Both operations copy the remainder of the list, so they're rather 
slow.  Since you're exchangin

Re: [Tutor] lazy? vs not lazy? and yielding

2010-03-03 Thread Dave Angel


John wrote:

Hi,

I just read a few pages of tutorial on list comprehension and generator 
expression.  From what I gather the difference is "[ ]" and "( )" at the 
ends, better memory usage and the something the tutorial labeled as "lazy 
evaluation".  So a generator 'yields'.  But what is it yielding to?  


John

  
A list comprehension builds a whole list at one time.  So if the list 
needed is large enough, building it takes a long time, and you may run 
out of memory and crash.  A generator expression instead builds a 
generator object, which *acts* like a list in a loop, but doesn't 
actually compute the values till you ask for them.  So you can still do 
things like

   for item in  fakelist:

and it does what you'd expect.
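
A small side-by-side sketch of the two forms (the variable names are only for illustration):

```python
squares_list = [n * n for n in range(1000)]   # all 1000 values built right now
squares_gen = (n * n for n in range(1000))    # nothing computed yet

total = 0
for item in squares_gen:                      # values are produced one at a time
    total += item

leftover = list(squares_gen)                  # a generator can only be used once
```

The loop looks the same either way; the difference is that the generator is used up after one pass, while the list can be looped over again.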


You can write a generator yourself, and better understand what it's 
about.  Suppose you were trying to build a "list" of the squares of the 
integers between 3 and 15.  For a list of that size, you could just use 
a list comprehension.  But pretend it was much larger, and you couldn't 
spare the memory or the time.


So let's write a generator function by hand, deliberately the hard way.

def mygen():
    i = 3
    while i < 16:
        yield i*i
        i += 1
    return

This function is a generator, by virtue of that yield statement in it.  
When it's called, it does some extra magic to make it easy to construct 
a loop.


If you now use

    for item in mygen():
        print item

Each time through the loop, it executes one more iteration of the 
mygen() function, up to the yield statement.  And the value that's put 
into item comes from the yield statement.


When the mygen() function returns (or falls off the end), it actually 
raises a special exception (StopIteration) that quietly terminates the 
for loop.
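
You can watch that happen by driving the generator by hand with next() instead of a for loop. A minimal sketch, reusing the mygen() from above:

```python
def mygen():
    i = 3
    while i < 16:
        yield i * i
        i += 1

gen = mygen()
first = next(gen)       # runs the body up to the first yield
second = next(gen)      # resumes after the yield, goes round the loop once
rest = list(gen)        # drains whatever is left

try:
    next(gen)           # nothing left: the function has fallen off the end
    finished = False
except StopIteration:
    finished = True
```

A for loop does the same next() calls for you and swallows the StopIteration.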


Now, when we're doing simple expressions for a small number of values, 
we should use a list comprehension.  When it gets big enough, switch to 
a generator expression.  And if it gets complicated enough, switch to a 
generator function.  The point here is that the user of the for/loop 
doesn't care which way it was done.


Sometimes you really need a list.  For example, you can't generally back 
up in a generator, or randomly access the [i] item.  But a generator is 
a very valuable mechanism to understand.


For a complex example, consider searching a hard disk for a particular 
file.  Building a complete list might take a long time, and use a lot of 
memory.  But if you use a generator inside a for loop, you can terminate 
(break) when you meet some condition, and the remainder of the files 
never had to be visited.  See os.walk()
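
Something along these lines, as a sketch. find_first is a made-up helper, and the temporary directory is only there so the example has something to search; os.walk() itself is the real generator.

```python
import os
import tempfile

def find_first(root, filename):
    """Return the path of the first file called filename under root, or None."""
    for dirpath, dirnames, filenames in os.walk(root):
        if filename in filenames:
            # Returning here stops the walk; later directories are never read.
            return os.path.join(dirpath, filename)
    return None

# Build a tiny directory tree to search.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "a", "b"))
target = os.path.join(root, "a", "b", "needle.txt")
open(target, "w").close()

found = find_first(root, "needle.txt")
```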


DaveA

___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor

Re: [Tutor] Encoding

2010-03-03 Thread Dave Angel

(Don't top-post.  Put your response below whatever you're responding to, 
or at the bottom.)


Giorgio wrote:

Ok.

So, how do you encode .py files? UTF-8?

2010/3/3 Dave Angel 

  
I personally use Komodo to edit my python source files, and tell it to 
use UTF8 encoding.  Then I add an encoding line as the second line of the 
file.  Many times I get lazy, because mostly my source doesn't contain 
non-ASCII characters.  But if I'm copying characters from an email or 
other Unicode source, then I make sure both are set up.  The editor will 
actually warn me if I try to save a file as ASCII with any 8 bit 
characters in it.


Note:  a unicode string uses 16 bit characters, at least in the usual 
(narrow) CPython 2.x builds; wide builds use 32 bits.  UTF-8 is an 8 bit 
encoding of that Unicode, where there's a direct algorithm to turn 16 or 
even 32 bit Unicode into sequences of 8 bit bytes.  They are not the 
same, although some people use the terms interchangeably.


Also note:  An 8 bit string  has no inherent meaning, until you decide 
how to decode it into Unicode.  Doing explicit decodes is much safer, 
rather than assuming some system defaults.  And if it happens to contain 
only 7 bit characters, it doesn't matter what encoding you specify when 
you decode it.  Which is why all of us have been so casual about this.
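
A short sketch of both points. It is written with bytes literals so it also runs on Python 3 (where the 8 bit string type is called bytes), and it only covers ASCII-compatible encodings such as the three named here.

```python
raw = b"ciao \xc3\xa8 ciao"            # UTF-8 bytes for "ciao è ciao"

as_utf8 = raw.decode("utf-8")          # the right decoder: 11 characters
as_latin1 = raw.decode("latin-1")      # wrong guess: 12 characters, garbled

plain = b"ciao ciao"                   # 7 bit characters only
same = (plain.decode("ascii") ==
        plain.decode("utf-8") ==
        plain.decode("latin-1"))
```

The same bytes give different text depending on the decoder, unless every byte is below 128.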




Re: [Tutor] object representation

2010-03-04 Thread Dave Angel


spir wrote:

Hello,

In python like in most languages, I guess, objects (at least composite ones -- 
I don't know about ints, for instance -- someone knows?) are internally 
represented as associative arrays. Python associative arrays are dicts, which 
in turn are implemented as hash tables. Correct?
Does this mean that the associative arrays representing objects are implemented 
like python dicts, thus hash tables?

I was wondering about the question because I guess the constraints are quite 
different:
* Dict keys are of any type, including heterogeneous (mixed). Object keys are 
names, ie a subset of strings.
* Object keys are very stable, typically they hardly change; usually only 
values change. Dicts often are created empty and fed in a loop.
* Objects are small mappings: entries are explicitely written in code (*). 
Dicts can be of any size, only limited by memory; they are often fed by 
computation.
* In addition, dict keys can be variables, while object keys rarely are: they 
are literal constants (*).

So, I guess the best implementations for objects and dicts may be quite 
different. i wonder about alternatives for objects, in particuliar trie and 
variants: http://en.wikipedia.org/wiki/Trie, because they are specialised for 
associative arrays which keys are strings.

denis

PS: Would someone point me to typical hash funcs for string keys, and the one 
used in python?

  

http://effbot.org/zone/python-hash.htm

(*) Except for rare cases using setattr(obj,k,v) or obj.__dict__[k]=v.
  
Speaking without knowledge of the actual code implementing CPython (or 
for that matter any of the other dozen implementations), I can comment 
on my *model* of how Python "works."  Sometimes it's best not to know 
(or at least not to make use of) the details of a particular 
implementation, as your code is more likely to port readily to the next 
architecture, or even next Python version.


I figure every object has exactly three items in it:  a ref count, an 
implementation pointer, and a payload. The payload may vary between 4 
bytes for an int object, and about 4 megabytes for a list of a million 
items.  (And of course arbitrarily large for larger objects).


You can see that these add up to 12 bytes (in version 2.6.2 of CPython) 
for an int by using sys.getsizeof(92).  Note that if the payload refers 
to other objects, those sizes are not included in the getsizeof() 
function result.  So getsizeof(a list of strings) will not show the 
sizes of the strings, but only the list itself.
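
A quick way to check that getsizeof() is shallow. The exact byte counts differ between Python versions and platforms, so this sketch only compares sizes and shows no numbers.

```python
import sys

# Two lists with the same number of items, built the same way.
short_strings = [ch for ch in "abc"]
long_strings = [ch * 10000 for ch in "abc"]

size_short = sys.getsizeof(short_strings)
size_long = sys.getsizeof(long_strings)

# The strings themselves are measured separately.
one_long_string = sys.getsizeof(long_strings[0])
```

Both lists report the same size, even though one of them refers to about 30000 characters of string data.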


The payload for a simple object will contain just the raw data of the 
object.  So for a string, it'd contain the count and the bytes.  For 
compound objects that can change in size, it'd contain a pointer to a 
malloc'ed buffer that contains the variable-length data.  The object 
stays put, but the malloc'ed buffer may move as its size grows and 
shrinks.  getsizeof() is smart enough to report not only the object 
itself, but the buffer it references. Note that buffer is referenced by 
only one object, so its lifetime is intimately tied up with the object's.


The bytes in the payload are meaningless without the implementation 
pointer.   That implementation pointer will be the same for all 
instances of a particular type.  It points to a structure that defines a 
particular type (or class).  That structure for an empty class happens 
to be 452 bytes, but that doesn't matter much, as it only appears once 
per class.  The instance of an empty class is only 32 bytes.  Now, even 
that might seem a bit big, so Python offers the notion of slots, which 
reduces the size of each instance, at the cost of a little performance 
and a lot of flexibility.  Still, slots are important, because I suspect 
that's how built-ins are structured, to make the objects so small.
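
For anyone who hasn't met slots, a minimal sketch of the trade-off (Plain and Slotted are illustrative class names):

```python
class Plain(object):
    def __init__(self):
        self.x = 1

class Slotted(object):
    __slots__ = ("x",)          # instances get a fixed layout and no __dict__
    def __init__(self):
        self.x = 1

p = Plain()
s = Slotted()
p.y = 2                          # fine: stored in p.__dict__

try:
    s.y = 2                      # 'y' is not a declared slot
    slot_rejected = False
except AttributeError:
    slot_rejected = True
```

The slotted instance is smaller, but it has lost the ability to grow new attributes.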


Now, some objects, probably most of the built-ins, are not extensible.  
You can't add new methods, or alter the behavior much.  Other objects, 
such as instances of a class you write, are completely flexible.  
I won't get into inheritance here, except to say that it can be tricky 
to derive new classes from built-in types.


So where do associative arrays come in?  One of the builtin types is a 
dictionary, and that is core to much of the workings of Python.  There 
are dictionaries in each class implementation (that 452 bytes I 
mentioned).  And there may be dictionaries in the instances themselves.  
There are two syntaxes to directly access these dictionaries, the "dot" 
notation and the bracket [] notation.  The former is a simple 
indirection through a special member called __dict__.
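
A small illustration of the class dictionary, the instance dictionary, and the two ways in (Point is just an example class):

```python
class Point(object):
    kind = "point"               # lives in the class dictionary

    def __init__(self, x):
        self.x = x               # lives in the instance dictionary

pt = Point(3)
via_dot = pt.x                   # dot notation
via_dict = pt.__dict__["x"]      # bracket notation on the same dictionary
pt.__dict__["y"] = 4             # adding a key creates an attribute
```

Looking up pt.kind works too, but it is found in Point.__dict__, not in pt.__dict__.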


So the behavior of an object depends on its implementation pointer, 
which points to a  structure.  And parts of that structure ultimately 
point to C code which does all the actual work.  But most of the work is 
some double- or triple-indirection which ultimately calls code.




Re: [Tutor] Encoding

2010-03-04 Thread Dave Angel




Giorgio wrote:

2010/3/4 spir 


Ok,so you confirm that:

s = u"ciao è ciao" will use the file specified encoding, and that

t = "ciao è ciao"
t = unicode(t)

Will use, if not specified in the function, ASCII. It will ignore the
encoding I specified on the top of the file. right?

  
A literal  "u" string, and only such a (unicode) literal string, is 
affected by the encoding specification.  Once some bytes have been 
stored in a 8 bit string, the system does *not* keep track of where they 
came from, and any conversions then (even if they're on an adjacent 
line) will use the default decoder.  This is a logical example of what 
somebody said earlier on the thread -- decode any data to unicode as 
early as possible, and deal only with unicode strings in the program.  
Then, if necessary, encode them into whatever output form immediately 
before (or while) outputting them.
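
The "decode early, encode late" pattern, as a sketch. shout() is a made-up function; the point is only that the encoding appears at the two edges and nowhere in between. Written with bytes literals so it also runs on Python 3.

```python
def shout(raw_bytes, encoding):
    text = raw_bytes.decode(encoding)    # decode once, where the data comes in
    text = text.upper()                  # all real work happens on unicode text
    return text.encode(encoding)         # encode once, where the data goes out

from_latin1 = shout(b"ciao \xe8 ciao", "latin-1")
from_utf8 = shout(b"ciao \xc3\xa8 ciao", "utf-8")
```

The upper-casing code in the middle never needs to know which encoding the bytes arrived in.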




Again, thank you. I'm loving Python and its community.

Giorgio




  


Re: [Tutor] object representation

2010-03-05 Thread Dave Angel


spir wrote:

On Thu, 04 Mar 2010 09:22:52 -0500
Dave Angel  wrote:

  
Still, slots are important, because I suspect 
that's how built-ins are structured, to make the objects so small.



Sure, one cannot alter their structure. Not even of a direct instance of 
object:

o = object()
o.n=1


Traceback (most recent call last):
  File "", line 1, in 
AttributeError: 'object' object has no attribute 'n'

  
Now, some objects, probably most of the built-ins, are not extensible.  
You can't add new methods, or alter the behavior much.



This applies to any attr, not only methods, also plain "information":
  

s = "abc"
s.n=1


Traceback (most recent call last):
  File "", line 1, in 
AttributeError: 'str' object has no attribute 'n'


  
Other objects, 
such as instances of a class you write, are totally and very flexible.



Conceptually this is equivalent to having no __slots__ slot. Or maybe they 
could be implemented using structs (whose values would be pointers), instead 
of dicts. A struct is like a fixed record, as opposed to a dict. What do you 
think? On the implementation side, this would be much simpler, lighter, and 
more efficient.
Oh, this gives me an idea... (to implement so-called "value objects").

Denis
  
Having not played much with slots, my model is quite weak there.  But I 
figure the dictionary is in the implementation structure, along with a 
flag saying that it's readonly.  Each item of such a dictionary would be 
an index into the fixed table in the object.  Like a struct, as you say, 
except that in C, there's no need to know the names of the fields at run 
time.


DaveA


Re: [Tutor] Encoding

2010-03-05 Thread Dave Angel


Giorgio wrote:


Ok,so you confirm that:

s = u"ciao è ciao" will use the file specified encoding, and that

t = "ciao è ciao"
t = unicode(t)

Will use, if not specified in the function, ASCII. It will ignore the
encoding I specified on the top of the file. right?



  

A literal  "u" string, and only such a (unicode) literal string, is
affected by the encoding specification.  Once some bytes have been stored in
a 8 bit string, the system does *not* keep track of where they came from,
and any conversions then (even if they're on an adjacent line) will use the
default decoder.  This is a logical example of what somebody said earlier on
the thread -- decode any data to unicode as early as possible, and deal only
with unicode strings in the program.  Then, if necessary, encode them into
whatever output form immediately before (or while) outputting them.





 Ok Dave, What i don't understand is why:

s = u"ciao è ciao" is converting a string to unicode, decoding it from the
specified encoding but

t = "ciao è ciao"
t = unicode(t)

That should do exactly the same instead of using the specified encoding
always assume that if i'm not telling the function what the encoding is, i'm
using ASCII.

Is this a bug?

Giorgio
  
In other words, you don't understand my paragraph above.  Once the 
string is stored in t as an 8 bit string, it's irrelevant what the 
source file encoding was.  If you then (whether it's in the next line, 
or ten thousand calls later) try to convert to unicode without 
specifying a decoder, it uses the default encoding, which is an 
application-wide thing, and not a source file thing.  To see what it is 
on your system, use sys.getdefaultencoding().


There's an encoding specified or implied for each source file of an 
application, and they need not be the same.  It affects string literals 
that come from that particular file. It does not affect any other 
conversions, as far as I know.  For that matter, many of those source 
files may not even exist any more by the time the application is run.


There are also encodings attached to each file object, I believe, though 
I've got no experience with that.  So sys.stdout would have an encoding 
defined, and any unicode strings passed to it would be converted using 
that specification.


The point is that there isn't just one global value, and it's a good 
thing.  You should figure everywhere characters come into  your program 
(eg. source files, raw_input, file i/o...) and everywhere characters go 
out of your program, and deal with each of them individually.  Don't 
store anything internally as 8 bit strings, and you won't create the 
ambiguity you have with your 't' variable above.


DaveA

Re: [Tutor] Encoding

2010-03-05 Thread Dave Angel


Giorgio wrote:

2010/3/5 Dave Angel 
  

In other words, you don't understand my paragraph above.




Maybe. But please don't be angry. I'm here to learn, and as i've run into a
very difficult concept I want to fully understand it.


  
I'm not angry, and I'm sorry if I seemed angry.  Tone of voice is hard 
to convey in a text message.

Once the string is stored in t as an 8 bit string, it's irrelevant what the
source file encoding was.




Ok, you've said this 2 times, but, please, can you tell me why? I think
that's the key passage to understand how encoding of strings works. The
source file encoding affects all file lines, also strings.

Nope, not strings.  It only affects string literals.

 If my encoding is
UTF8 python will read the string "ciao è ciao" as 'ciao \xc3\xa8 ciao' but
if it's latin1 it will read 'ciao \xe8 ciao'. So, how can it be irrelevant?

I think the problem is that i can't find any difference between 2 lines
quoted above:

s = u"ciao è ciao"

and

t = "ciao è ciao"
c = unicode(t)

[**  I took the liberty of making the variable names different so I can refer 
to them **]
  
I'm still not sure whether your confusion is to what the rules are, or 
why the rules were made that way.  The rules are that an unqualified 
conversion, such as the unicode() function with no second argument, uses 
the default encoding, in strict mode.  Thus the error.
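
Here is the same failure spelled out, as a sketch. It assumes a UTF-8 source file, so t holds the two bytes \xc3\xa8 for the accented letter, and it decodes explicitly with "ascii" because that is what the Python 2 default amounts to. It runs on Python 3 as well, where bytes.decode() plays the role of unicode(t, encoding).

```python
t = b"ciao \xc3\xa8 ciao"          # what the 8 bit string holds

try:
    c = t.decode("ascii")          # 'strict' is the default error mode
    error = None
except UnicodeDecodeError as exc:
    error = exc                    # the first byte >= 128 is rejected

c = t.decode("utf-8")              # naming the real encoding works
```

The exception records where decoding stopped; error.start is the index of the offending byte.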


Quoting the help: 
"If no optional parameters are given, unicode() will mimic the behaviour 
of str() except that it returns Unicode strings instead of 8-bit 
strings. More precisely, if /object/ is a Unicode string or subclass it 
will return that Unicode string without any additional decoding applied.


For objects which provide a __unicode__() 
<../reference/datamodel.html#object.__unicode__> method, it will call 
this method without arguments to create a Unicode string. For all other 
objects, the 8-bit string version or representation is requested and 
then converted to a Unicode string using the codec for the default 
encoding in 'strict' mode.

"

As for why the rules are that, I'd have to ask you what you'd prefer.  
The unicode() function has no idea that t was created from a literal 
(and no idea what source file that literal was in), so it has to pick 
some encoding, called the default encoding.  The designers decided to 
use a default encoding of ASCII, because manipulating ASCII strings is 
always safe, while many functions won't behave as expected when given 
UTF-8 encoded strings.  For example, what's the 7th character of t ?  
That is not necessarily the same as the 7th character of s, since one or 
more of the characters before it might have taken up multiple bytes in 
t.  With a latin1 source file that doesn't happen for your accented 
character, but with a UTF-8 source file it does (it becomes two bytes), 
as it would for some other European symbols, and certainly for other 
languages as well.
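
The "7th character" question can be checked directly. A sketch using the string from this thread; the t_utf8 and t_latin1 names are only for illustration, and slices are used so the result is the same on Python 2 and 3.

```python
s = u"ciao \xe8 ciao"                # unicode text: 11 characters
t_utf8 = s.encode("utf-8")           # 12 bytes, the accent takes two
t_latin1 = s.encode("latin-1")       # 11 bytes, the accent takes one

seventh_char = s[6:7]                # the space after the accented letter
seventh_utf8 = t_utf8[6:7]           # second half of the accented letter
seventh_latin1 = t_latin1[6:7]       # happens to line up with s here
```

So indexing into an 8 bit string only matches indexing into the text when every character fits in one byte.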

If you then (whether it's in the next line, or ten thousand calls later)
try to convert to unicode without specifying a decoder, it uses the default
encoder, which is a application wide thing, and not a source file thing.  To
see what it is on your system, use sys.getdefaultencoding().




And this is ok. Spir said that it uses ASCII, you now say that it uses the
default encoder. I think that ASCII on spir's system is the default encoder
so.


  
I don't know, but I think it's the default in every country, at least on 
version 2.6.  It might make sense to get some value from the OS that 
defined the locally preferred encoding, but then a program that worked 
fine in one locale might fail miserably in another.

The point is that there isn't just one global value, and it's a good thing.
 You should figure everywhere characters come into  your program (eg. source
files, raw_input, file i/o...) and everywhere characters go out of your
program, and deal with each of them individually.




Ok. But it always happen this way. I hardly ever have to work with strings
defined in the file.

  
Not sure what you mean by "the file."  If you mean the source file, 
that's what your examples are about.   If you mean a data file, that's 
dealt with differently.
  

Don't store anything internally as strings, and you won't create the
ambiguity you have with your 't' variable above.

DaveA




Thankyou Dave

Giorgio



  



Re: [Tutor] Encoding

2010-03-07 Thread Dave Angel


Giorgio wrote:

2010/3/7 spir 

  
One more question: Amazon SimpleDB only accepts UTF8.


So, let's say i have to put into an image file:

  
Do you mean a binary file with image data, such as a jpeg?  In that 
case, an emphatic - NO.  not even close.

filestream = file.read()
filetoput = filestream.encode('utf-8')

Do you think this is ok?

Oh, of course everything url-encoded then

Giorgio


  
Encoding binary data with utf-8 wouldn't make any sense, even if you did 
have the right semantics for a text file. 

Next problem, 'file' is a built-in name (the file type), not a keyword 
and not an open file.  So if you write what you describe, you're trying 
to call the read() method on the class itself rather than on an 
instance, which will error.



Those two lines don't make any sense by themselves.  Show us some 
context, and we can more sensibly comment on them.  And try not to use 
names that hide built-in names, or Python stdlib names.


If you're trying to store binary data in a repository that only permits 
text, it's not enough to pretend to convert it to UTF-8.  You need to do 
some other escaping, such as UUENCODE, that transforms the binary data 
into something resembling text.  Then you may or may not need to encode 
that text with utf-8, depending on its character set.



DaveA


Re: [Tutor] Encoding

2010-03-07 Thread Dave Angel


Giorgio wrote:

2010/3/7 Dave Angel 

  

Those two lines don't make any sense by themselves.  Show us some context,
and we can more sensibly comment on them.  And try not to use names that
hide built-in names, or Python stdlib names.




Hi Dave,

I'm considering Amazon SimpleDB as an alternative to PGSQL, but i need to
store blobs.

Amazon's FAQs says that:

"Q: What kind of data can I store?
You can store any UTF-8 string data in Amazon SimpleDB. Please refer
to the Amazon
Web Services Customer Agreement <http://aws.amazon.com/agreement> for
details."

This is the problem. Any idea?


  

DaveA




Giorgio



  
You still didn't provide the full context.  Are you trying to store 
binary data, or not?


Assuming you are, you could do the UUENCODE suggestion I made.  Or use 
base64:


base64.encodestring(s)   will turn binary data into (larger) text data, 
also stored as an 8 bit string.  The result is pure ASCII, so it's 
irrelevant whether it's considered utf-8 or otherwise.  You store the 
resulting string in your database, and use  base64.decodestring(s) to 
reconstruct your original.
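
A round trip, as a sketch. It uses base64.b64encode() and base64.b64decode(), which exist in both Python 2 and Python 3, rather than encodestring()/decodestring(), which were removed in Python 3.9. The blob here is made-up test data, not a real image.

```python
import base64

blob = bytes(bytearray(range(256)))      # every possible byte value once

encoded = base64.b64encode(blob)         # ASCII-only bytes, about 4/3 the size
text = encoded.decode("ascii")           # safe to store as UTF-8 text
restored = base64.b64decode(text.encode("ascii"))
```

Since the encoded form is plain ASCII, it is also valid UTF-8, which is all the database asks for.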


There's 50 other ways, some more efficient, but this may be the simplest.

DaveA



Re: [Tutor] Communicate between a thread and the main program

2010-03-08 Thread Dave Angel


Plato P.B. wrote:

Hi all,
I have created a script in which i need to implement the communication
between the main program and a thread.
The thread looks for any newly created files in a particular directory. It
will be stored in a variable in the thread function. I want to get that name
from the main program.
How can i do it?

Thanks in Advance. :D
  
Don't store it in "a variable in the thread function," but in an 
instance attribute of the thread object.


Then the main program simply checks the object's attribute.  Since it 
launched the thread(s), it should know its (their) instances.  This way, 
the solution scales up as you add more threads, with different 
functionality.
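
A minimal sketch of that arrangement. DirWatcher and its attribute names (new_files, files_lock, stop_event) are invented for the example, and polling with os.listdir() is just one simple way to notice new files; only threading.Thread, Lock and Event are real library pieces.

```python
import os
import tempfile
import threading
import time

class DirWatcher(threading.Thread):
    """Poll a directory and record new file names on the thread object."""

    def __init__(self, directory, interval=0.05):
        threading.Thread.__init__(self)
        self.daemon = True
        self.directory = directory
        self.interval = interval
        self.new_files = []                     # the main program reads this
        self.files_lock = threading.Lock()      # guards new_files
        self.stop_event = threading.Event()
        self.seen = set(os.listdir(directory))

    def run(self):
        while not self.stop_event.is_set():
            current = set(os.listdir(self.directory))
            added = sorted(current - self.seen)
            if added:
                with self.files_lock:
                    self.new_files.extend(added)
                self.seen = current
            self.stop_event.wait(self.interval)

    def stop(self):
        self.stop_event.set()

# Main program: start the thread, create a file, then read the attribute.
directory = tempfile.mkdtemp()
watcher = DirWatcher(directory)
watcher.start()
open(os.path.join(directory, "report.txt"), "w").close()

deadline = time.time() + 5
while time.time() < deadline:
    with watcher.files_lock:
        if watcher.new_files:
            break
    time.sleep(0.05)

watcher.stop()
watcher.join()
with watcher.files_lock:
    found = list(watcher.new_files)
```

With several watchers, the main program just keeps a list of the thread objects and checks each one's new_files.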


DaveA


Re: [Tutor] sorting algorithm

2010-03-11 Thread Dave Angel


C.T. Matsumoto wrote:

Dave Angel wrote:
(You forgot to do a Reply-All, so your message went to just me, 
rather than to me and the list )



C.T. Matsumoto wrote:

Dave Angel wrote:

C.T. Matsumoto wrote:

Hello,

This is follow up on a question I had about algorithms. In the 
thread it was suggested I make my own sorting algorithm.


Here are my results.

#!/usr/bin/python

def sort_(list_):
    for item1 in list_:
        pos1 = list_.index(item1)
        pos2 = pos1 + 1
        try:
            item2 = list_[pos2]
        except IndexError:
            pass

        if item1 >= item2:
            try:
                list_.pop(pos2)
                list_.insert(pos1, item2)
                return True
            except IndexError:
                pass

def mysorter(list_):
    while sort_(list_) is True:
        sort_(list_)

I found this to be a great exercise. In doing the exercise, I got 
pretty stuck. I consulted another programmer (my dad) who 
described how to go about sorting. As it turned out the 
description he described was the Bubble sort algorithm. Since 
coding the solution I know the Bubble sort is inefficient because 
of repeated iterations over the entire list. This shed light on 
the quick sort algorithm which I'd like to have a go at.


Something I haven't tried is sticking in really large lists. I was 
told that with really large lists you break down the input list 
into smaller lists. Sort each list, then go back and use the same 
swapping procedure for each of the different lists. My question 
is, at what point do you start breaking things up? Is that based 
on list elements or is it based on memory(?) resources python is 
using?


One thing I'm not pleased about is the while loop and I'd like to 
replace it with a for loop.


Thanks,

T


There are lots of references on the web about Quicksort, including 
a video at:

http://www.youtube.com/watch?v=y_G9BkAm6B8

which I think illustrates it pretty well.  It would be a great 
learning exercise to implement Python code directly from that 
description, without using the sample C++ code available.


(Incidentally, there are lots of variants of Quicksort, so I'm not 
going to quibble about whether this is the "right" one to be called 
that.)


I don't know what your earlier thread was, since you don't mention 
the subject line, but there are a number of possible reasons you 
might not have wanted to use the built-in sort.  The best one is 
for educational purposes.  I've done my own sort for various 
reasons in the past, even though I had a library function, since 
the library function had some limits.  One time I recall, the 
situation was that the library sort was limited to 64k of total 
data, and I had to work with much larger arrays (this was in 16bit 
C++, in "large" model).  I solved the size problem by using the  
C++ sort library on 16k subsets (because a pointer was 2*2 bytes).  
Then I merged the results of the sorts.  At the time, and in the 
circumstances involved, there were seldom more than a dozen or so 
sublists to merge, so this approach worked well enough.


Generally, it's better for both your development time and the 
efficiency and reliability of the end code, to base a new sort 
mechanism on the existing one.  In my case above, I was replacing 
what amounted to an insertion sort, and achieved a 50x improvement 
for a real customer.  It was fast enough that other factors 
completely dominated his running time.


But for learning purposes?  Great plan.  So now I'll respond to 
your other questions, and comment on your present algorithm.


It would be useful to understand algorithmic complexity, the 
so-called order function (big-O notation).  In a bubble sort, if you 
double the size of the array, you quadruple the number of comparisons 
and swaps.  It's order N-squared or O(n*n).   So what works well for 
an array of size 10 might take a very long time for an array of size 
10000 (like a million times as long).  You can do much better by 
sorting smaller lists, and then combining them together.  Such an 
algorithm can be O(n*log(n)).



You ask at what point you consider sublists?  In a language like C, 
the answer is when the list is size 3 or more.  For anything larger 
than 2, you divide into sublists, and work on them.


Now, if I may comment on your code.  You're modifying a list while 
you're iterating through it in a for loop.  In the most general 
case, that's undefined.  I think it's safe in this case, but I 
would avoid it anyway, by just using xrange(len(list_)-1) to 
iterate through it.  You use the index function to find something 
you would already know -- the index function is slow.  And the 
first try/except isn't needed if you use a -1 in the xrange 
argument, as I do above.


You use pop() and insert() to exchange two adjacent items in the 
list.  Both operations copy the remainder of the list, so they're 
rath

Re: [Tutor] sorting algorithm

2010-03-12 Thread Dave Angel


C.T. Matsumoto wrote:

I've changed the code and I think I have what you were talking about.

def mysort(list_):
    for i in xrange(0, len(list_)):
        pos = i
        for j in xrange(pos+1, len(list_)):
            if list_[i] > list_[j]:
                pos = j
                list_[i], list_[j] = list_[j], list_[i]

I finally started to think that the while couldn't remain. But if I 
look at this the thing that I don't get is the 'xrange(pos+1, 
len(list_))' snippet. What confused me was how a new position got 
passed to xrange(), when I do not see where that was happening. Is 
'pos' a reference to the original pos in the xrange snippet?


T

That loop is not what I was describing, but I think it's nearly 
equivalent in performance.  My loop was always swapping adjacent items, 
and it adjusted the ending limit as the data gets closer to sorted.  
This one adjusts the beginning value (pos) of the inner loop, as the 
data gets more sorted.  For some orderings, such as if the data is 
already fully sorted, my approach would  be much faster.


Your outer loop basically moves the smallest remaining item into 
position i on each pass.  Note that the line pos=j has no effect on the 
inner loop: xrange(pos+1, len(list_)) is evaluated once, when the inner 
loop starts and pos is still equal to i, so the inner loop always runs 
from i+1 to the end of the list.  That also answers your question: no 
new position is ever passed to xrange().
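
A quick experiment shows how the bounds of a for loop behave when the variable used to build them is rebound inside the loop. This sketch uses range() so it runs on Python 3 (xrange in the thread's Python 2 behaves the same way), and it also includes a version of mysort() without the pos variable for comparison.

```python
pos = 0
visited = []
for j in range(pos + 1, 5):      # the bounds are computed once, right here
    pos = 3                      # rebinding pos does not touch the running loop
    visited.append(j)

def mysort(list_):
    """Exchange sort: after pass i, list_[i] holds the i-th smallest item."""
    for i in range(len(list_)):
        for j in range(i + 1, len(list_)):
            if list_[i] > list_[j]:
                list_[i], list_[j] = list_[j], list_[i]
    return list_

result = mysort([4, 1, 3, 2])
```

The loop still visits 1 through 4, so assigning to pos inside it changes nothing about which j values come next.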


I'm going to be quite busy for the next couple of days.  So if I don't 
respond to your next post quickly, please be patient.


DaveA


Re: [Tutor] Efficiency and speed

2010-03-20 Thread Dave Angel

(Please don't top-post.  It ruins the context for anyone else trying to 
follow it.  Post your remarks at the end, or immediately after whatever 
you're commenting on.)


James Reynolds wrote:

Here's another idea I had. I thought this would be slower than the
previous algorithm because it has another for loop and another while loop. I
read that the overhead of such loops is high, so I have been trying to avoid
using them where possible.

def mcrange_gen(self, sample):
    nx2 = self.nx1
    for q in sample:
        for a in nx2:
            while a > q:
                pass
            yield a
            break


On Fri, Mar 19, 2010 at 3:15 PM, Alan Gauld wrote:

  

While loops and for loops are not slow, it's the algorithm that you're 
using that's slow. If a while loop is the best way to do the best 
algorithm, then it's fast.  Anyway, in addition to for and while, other 
"slow" approaches are find() and "in".


But slowest of all is a loop that never terminates, like the while loop 
in this example.  And once you fix that, the break is another problem, 
since it means you'll never do more than one value from sample.



In your original example, you seemed to be calling a bunch of methods 
that are each probably a single python statement.  I didn't respond to 
those, because I couldn't really figure what you were trying to do with 
them.  But now I'll comment in general terms.


Perhaps you should do something like:

zip together the original list with a range list, so you now have a list 
of tuples.  Then sort that new list.  Now loop through that sorted list 
of tuples, and look up your bucket for each item.  That should be fast 
because they're in order, and you have the index to the original 
value, so you can store the bucket number somewhere useful.
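
A minimal sketch of that idea, in Python 3 syntax. I'm assuming the buckets are described by a sorted list of upper boundaries and that a value belongs to the first bucket whose boundary is not smaller than it; the names assign_buckets and boundaries are made up for the illustration, since I don't know what your real data looks like.

```python
def assign_buckets(sample, boundaries):
    """Return a list giving, for each item of sample, its bucket number.

    boundaries must be sorted in ascending order.
    """
    # Pair each value with its original position, then sort by value.
    indexed = sorted(zip(sample, range(len(sample))))
    result = [None] * len(sample)
    bucket = 0
    for value, original_index in indexed:
        # Values arrive in ascending order, so the bucket pointer only moves forward.
        while bucket < len(boundaries) and boundaries[bucket] < value:
            bucket += 1
        result[original_index] = bucket
    return result
```

Because both the values and the boundaries are walked once in order, the lookup part is a single pass after the sort, instead of a search through the boundaries for every sample value.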


HTH,
DaveA


Re: [Tutor] python magazine

2010-03-27 Thread Dave Angel




Lowell Tackett wrote:
>From the virtual desk of Lowell Tackett  



--- On Fri, 3/26/10, Benno Lang  wrote:

From: Benno Lang 
Subject: Re: [Tutor] python magazine
To: "Lowell Tackett" 
Cc: tutor@python.org, "Bala subramanian" 
Date: Friday, March 26, 2010, 8:38 PM

On 27 March 2010 00:33, Lowell Tackett  wrote:
  

The Python Magazine people have now got a Twitter site--which includes a 
perhaps [telling] misspelling.


Obviously that's why they're looking for a chief editor - maybe it's
even a deliberate ploy.

I'm not sure if this affects others, but to me your replies appear
inside the quoted section of your mail, rather than beneath it. Would
you mind writing plain text emails to avoid this issue?

Thanks,
benno

Like this...?


  
No, there's still a problem.  You'll notice in this message that there 
are ">" symbols in front of your lines and benno's, and ">>" symbols in 
front of Lowell's.  (Some email readers will turn the > into vertical 
bar, but the effect is the same).  Your email program should be adding 
those upon a reply, so that your own message has one less > than the one 
to which you're replying.  Then everyone reading can see who wrote what, 
based on how many ">" or bars precede the respective lines.  Quotes from 
older messages have more of them.


Are you using "Reply-All" in your email program?  Or are you 
constructing a new message with copy/paste?


What email are you using?  Maybe it's a configuration setting somebody 
could help with.


DaveA


Re: [Tutor] inter-module global variable

2010-03-28 Thread Dave Angel


spir # wrote:

Hello,

I have a main module importing other modules and defining a top-level variable, 
call it 'w' [1]. I naively thought that the code from an imported module, when 
called from main, would know about w, but I have name errors. The initial trial 
looks as follows (this is just a sketch, the original is too big and 
complicated):

# imported "code" module
__all__ = ["NameLookup", "Literal", "Assignment", ...]

# main module
from parser import parser
from code import *
from scope import Scope, World
w = World()

This pattern failed as said above. So, I tried to "export" w:

# imported "code" module
__all__ = ["NameLookup", "Literal", "Assignment", ...]

# main module
from parser import parser
from scope import Scope, World
w = World()
import code #new
code.w = w  ### "export"
from code import *

And this works. I had the impression that the alteration of the "code" module object would not 
propagate to objects imported from "code". But it works. But I find this terribly unclear, fragile, 
and dangerous, for any reason. (I find this "dark", in fact ;-)
Would someone try to explain what actually happens in such case?
Also, why is a global variable not actually global, but in fact only "locally" 
global (at the module level)?
It's the first time I meet such an issue. What's wrong in my design to raise 
such a problem, if any?

My view is as follows: From the transparency point of view (like for function transparency), the 
classes in "code" should _receive_ as general parameter a pointer to 'w', before they do 
anything. In other words, the whole "code" module is like a python code chunk 
parameterized with w. If it were a program, it would get w as command-line parameter, or from 
the user, or from a config file.
Then, all instantiations should be done using this pointer to w. Meaning, as a 
consequence, all code objects should hold a reference to 'w'. This could be 
made as follows:

# main module
import code
code.Code.w = w
from code import *

# "code" module
class Code(object):
    w = None ### to be exported from importing module
    def __init__(self, w=Code.w):
        # the param allows having a different w eg for testing
        self.w = w
# for each kind of code things
class CodeThing(Code):
    def __init__(self, args):
        Code.__init__(self)
        ... use args ...
    def do(self, args):
        ... use args and self.w ...

But the '###' line looks like an ugly trick to me. (Not the fact that it's a 
class attribute; on the contrary, I often use them eg for config, and find them a 
nice tool for clarity.) The issue is that Code.w has to be exported.
Also, this scheme is heavy (all these pointers in every living object.) 
Actually, code objects could read Code.w directly but this does not change much 
(and I lose transparency).
It's hard for me to be lucid on this topic. Is there a pythonic way?


Denis

[1] The app is a kind of interpreter for a custom language. Imported modules 
define classes for objects representing elements of code (literal, assignment, 
...). Such objects are instantiated from parse tree nodes (conceptually, they 
*are* code nodes). 'w' is a kind of global scope -- say the state of the 
running program. Indeed, most code objects need to read/write in w.
Any comments on this model welcome. I have little knowledge of the implementation of 
languages.


vit esse estrany ☣

spir.wikidot.com

  
The word 'global' is indeed unfortunate for those coming to python from 
other languages. In Python, it does just mean global to a single module. 
If code in other modules needs to access your 'global variable', they 
normally need it to be passed to them.


If you really need a program-global value, then create a new module just 
for the purpose, and define it there. Your main program can initialize 
it, other modules can access it in the usual way, and everybody's happy. 
In general, you want import and initialization to happen in a 
non-recursive way. So an imported module should not look back at you for 
values. If you want it to know about a value, pass it, or assign it for 
them.


But Python does not have pointers. And you're using pointer terminology. 
Without specifying the type of w, you give us no clue whether you're 
setting yourself up for failure. For example, the first time somebody 
does a w= newvalue they have broken the connection with other module's w 
variable. If the object is mutable (such as a list), and somebody 
changes it by using w.append() or w[4] = newvalue, then no problem.
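
To make that difference concrete, here is a small single-file sketch in Python 3 syntax. The name other_w is made up; it stands in for the w that another module ended up with after something like code.w = w.

```python
w = [1, 2, 3]
other_w = w              # a second name bound to the same list object

w.append(4)              # mutation: the one shared list changes
still_shared = other_w is w

w = [9, 9, 9]            # rebinding: w now names a brand-new list
# other_w is untouched by the assignment and still refers to the old list
```

After this runs, other_w is [1, 2, 3, 4] while w is [9, 9, 9]. The append was visible through both names, the assignment was not.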


You have defined a class attribute w, and an instance attribute w, and a 
module variable w in your main script. Do these values all want to stay 
in synch as you change values? Or is it a constant that's just set up 
once? Or some combination, where existing objects want the original 
value, but new ones created after you change it will themselves get the 
value at the time of creation? You can get any of these b

Re: [Tutor] python magazine

2010-03-28 Thread Dave Angel

Lowell Tackett wrote:
>From the virtual desk of Lowell Tackett  

--- On Sat, 3/27/10, Dave Angel  wrote:

From: Dave Angel 
Subject: Re: [Tutor] python magazine
To: "Lowell Tackett" 
Cc: "Benno Lang" , tutor@python.org
Date: Saturday, March 27, 2010, 6:12 AM

Lowell Tackett wrote:

>From the virtual desk of Lowell Tackett  

--- On Fri, 3/26/10, Benno Lang wrote:

From: Benno Lang
Subject: Re: [Tutor] python magazine
To: "Lowell Tackett"
Cc: tutor@python.org, "Bala subramanian"
Date: Friday, March 26, 2010, 8:38 PM

On 27 March 2010 00:33, Lowell Tackett wrote:

The Python Magazine people have now got a Twitter site--which includes a
perhaps [telling] misspelling.

Obviously that's why they're looking for a chief editor - maybe it's
even a deliberate ploy.

I'm not sure if this affects others, but to me your replies appear
inside the quoted section of your mail, rather than beneath it. Would
you mind writing plain text emails to avoid this issue?

Thanks,
benno

Like this...?

No, there's still a problem.  You'll notice in this
message that there are ">" symbols in front of your lines
and benno's, and ">>" symbols in front of
Lowell's.  (Some email readers will turn the > into
vertical bar, but the effect is the same).  Your email
program should be adding those upon a reply, so that your
own message has one less > than the one to which you're
replying.  Then everyone reading can see who wrote
what, based on how many ">" or bars precede the
respective lines.  Quotes from older messages have more
of them.

Are you using "Reply-All" in your email program?  Or
are you constructing a new message with copy/paste?

What email are you using?  Maybe it's a configuration
setting somebody could help with.

DaveA

Don't really know what I'm doing wrong (or right).  Just using the [email] tools that 
have been made available to me thru Yahoo mail and Firefox.  I began this text below your 
submission and "signature", and I'm using plain text, as suggested by a 
previous comment.  Don't know what else I could embellish this effort with.

This time it worked great.  You can see my comments at outermost level, 
with yours indented by one, and my previous one indented two, etc.

DaveA

Re: [Tutor] inter-module global variable

2010-03-28 Thread Dave Angel


spir # wrote:

On Sun, 28 Mar 2010 21:50:46 +1100
Steven D'Aprano  wrote:

  

On Sun, 28 Mar 2010 08:31:57 pm spir ☣ wrote:


I'm going to assume you really want a single global value, and that you 
won't regret that assumption later.


We talked at length about how to access that global from everywhere that 
cares, and my favorite way is with a globals module. And it should be 
assigned something like:


globals.py:
class SomeClass (object):
    def

def init(parameters):
    global world
    world = SomeClass(parameters, moreparameters)

Then main can do the following:
import globals
globals.init(argv-stuff)

And other modules can then do
from globals import world    # valid only after globals.init() has run

And they'll all see the same world variable. Nobody should have their 
own, but just import it if needed.
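
Here is a runnable single-file sketch of the shared-module pattern, in Python 3 syntax. Everything named here is hypothetical: I call the module app_globals (so it doesn't hide the builtin globals() function), build it in memory with types.ModuleType so the example fits in one file, and use a toy World class. In a real program app_globals would be its own .py file, and the two functions below would be separate modules.

```python
import sys
import types

# Stand-in for a separate file named app_globals.py.
app_globals = types.ModuleType("app_globals")
app_globals.world = None
sys.modules["app_globals"] = app_globals


class World(object):
    """Toy replacement for the real global-state class."""
    def __init__(self, name):
        self.name = name
        self.variables = {}


def init(name):
    # Bind the attribute on the shared module object itself.
    app_globals.world = World(name)


app_globals.init = init


def main_module():
    import app_globals
    app_globals.init("demo")          # main initializes the shared value once


def some_other_module():
    import app_globals
    app_globals.world.variables["x"] = 42   # looked up through the module at use time
    return app_globals.world


main_module()
seen_elsewhere = some_other_module()
```

Looking the value up as app_globals.world each time keeps working even if init() runs after the import, or runs again later. A name copied with a from-import would keep whatever object it got at import time.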




Re: [Tutor] commands

2010-03-28 Thread Dave Angel


Shurui Liu (Aaron Liu) wrote:

# Translate wrong British words

#Create an empty file
print "\nReading characters from the file."
raw_input("Press enter then we can move on:")
text_file = open("storyBrit.txt", "r+")
whole_thing = text_file.read()
print whole_thing
raw_input("Press enter then we can move on:")
print "\nWe are gonna find out the wrong British words."
corrections = {'colour':'color', 'analyse':'analyze',
'memorise':'memorize', 'centre':'center', 'recognise':'recognize',
'honour':'honor'}
texto = whole_thing
for a in corrections:
    texto = texto.replace(a, corrections[a])
print texto

# Press enter and change the wrong words
if "colour" in whole_thing:
    print "The wrong word is 'colour' and the right word is 'color'"
if "analyse" in whole_thing:
    print "the wrong word is 'analyse' and the right word is 'analyze'"
if "memorise" in whole_thing:
    print "the wrong word is 'memorise' and the right word is 'memorize'"
if "centre" in whole_thing:
    print "the wrong word is 'centre' and the right word is 'center'"
if "recognise" in whole_thing:
    print "the wrong word is 'recognise' and the right word is 'recognize'"
if "honour" in whole_thing:
    print "the wrong word is 'honour' and the right word is 'honor'"

# We are gonna save the right answer to storyAmer.txt
w = open('storyAmer.txt', 'w')
w.write('I am really glad that I took CSET 1100.')
w.write('\n')
w.write('We get to analyse all sorts of real-world problems.\n')
w.write('\n')
w.write('We also have to memorize some programming language syntax.')
w.write('\n')
w.write('But, the center of our focus is game programming and it is fun.')
w.write('\n')
w.write('Our instructor adds color to his lectures that make them interesting.')
w.write('\n')
w.write('It is an honor to be part of this class!')
w = open("assign19/storyAmer.txt", "w")

w.close()



This is what I have done, I don't understand why this program cannot
fix "analyse".

  
You do some work in texto, and never write it to the output file.  
Instead you write stuff you hard-coded in literal strings in your program.


And you never fixed the mode field of the first open() function, as 
someone hinted at you.   And you don't specify the output file location, 
but just assume it's in the current directory.  For that matter, your 
open of the input file assumes it's in the current directory as well.  
But your assignment specified where both files would/should be.
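
As a rough sketch of what I mean by writing the corrected text rather than literal strings (Python 3 syntax): the function name translate_file is mine, and the scratch directory is only there so the example can run anywhere. Your assignment's real paths go in its place.

```python
import os
import tempfile

CORRECTIONS = {
    "colour": "color", "analyse": "analyze", "memorise": "memorize",
    "centre": "center", "recognise": "recognize", "honour": "honor",
}


def translate_file(in_path, out_path, corrections=CORRECTIONS):
    """Read in_path, apply every replacement, and write the result to out_path."""
    with open(in_path, "r") as source:      # read-only is enough for the input
        text = source.read()
    for wrong, right in corrections.items():
        text = text.replace(wrong, right)
    with open(out_path, "w") as target:
        target.write(text)                  # the corrected text, not hard-coded strings
    return text


# Small demonstration in a scratch directory (these paths are made up).
work_dir = tempfile.mkdtemp()
brit_path = os.path.join(work_dir, "storyBrit.txt")
amer_path = os.path.join(work_dir, "storyAmer.txt")
with open(brit_path, "w") as handle:
    handle.write("We get to analyse all sorts of real-world problems.\n")
translate_file(brit_path, amer_path)
with open(amer_path) as handle:
    result = handle.read()
```

Passing both paths in explicitly also takes care of the "current directory" assumption, since the caller decides where each file lives.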



DaveA


Re: [Tutor] what's wrong in my command?

2010-04-01 Thread Dave Angel


Shurui Liu (Aaron Liu) wrote:

# geek_translator3.py

# Pickle
import pickle

  
This is where you told it to load import.py.   Normally, that just 
quietly loads the standard module included with your system.



When I run it, the system gave me the feedback below:
Traceback (most recent call last):
  File "geek_translator3.py", line 4, in <module>
    import pickle
  File "/usr/local/lib/python2.5/pickle.py", line 13, in <module>

AttributeError: 'module' object has no attribute 'dump'

I don't understand, I don't write anything about pickle.py, why it mentioned?
what's wrong with "import pickle"? I read many examples online whose
has "import pickle", they all run very well.
Thank you!

  
I don't have 2.5 any more, so I can't look at the same file you 
presumably have.  And line numbers will most likely be different in 
2.6.  In particular, there are lots of module comments at the beginning 
of my version of pickle.py.  You should take a look at yours, and see 
what's in line 13.   My guess it's a reference to the dump() function 
which may be defined in the same file.  Perhaps in 2.5 it was defined 
elsewhere.


Most common cause for something like this would be that pickle imports 
some module, and you have a module by that name in your current 
directory (or elsewhere on the sys.path).  So pickle gets an error after 
importing it, trying to use a global attribute that's not there.
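
One way to check for that kind of name clash is sketched below, in Python 3 syntax. The helper names module_location and shadowed_in are made up for this message. importlib.import_module() and the __file__ attribute are standard, although built-in modules have no __file__ at all.

```python
import importlib
import os


def module_location(name):
    """Return the file a module was loaded from, or None for built-in modules."""
    module = importlib.import_module(name)
    return getattr(module, "__file__", None)


def shadowed_in(directory, names):
    """List the names that have a same-named .py file sitting in directory."""
    return [name for name in names
            if os.path.isfile(os.path.join(directory, name + ".py"))]
```

For example, print(module_location("pickle")) shows which pickle actually got imported, and shadowed_in(os.getcwd(), ["pickle", "marshal"]) lists any of your own files that would hide a standard module of the same name.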


Wild guess - do you have a file called marshal.py in your own code?

DaveA


Re: [Tutor] what's wrong in my command?

2010-04-01 Thread Dave Angel


Shurui Liu (Aaron Liu) wrote:

OK, can you tell me import.py is empty or not? If it's not an empty
document, what's its content?

  
(Please don't top-post,  Add your comments after what you're quoting, or 
at the end)


That was a typo in my message.  I should have said  pickle.py,  not 
import.py.  When you import pickle, you're telling it to find and load 
pickle.py.  That's python source code, and it will generally import 
other modules.  I was suspecting module.py.  But you should start by 
looking at line 13 of pickle.py





Re: [Tutor] constructor

2010-04-04 Thread Dave Angel




Shurui Liu (Aaron Liu) wrote:

I am studying about how to create a constructor in a Python program, I
don't really understand why the program print out "A new critter has
been born!" and "Hi.  I'm an instance of class Critter." twice. I
guess is because "crit1 = Critter() crit2 = Critter()"  But I
don't understand how did computer understand the difference between
crit1 and crit2? cause both of them are equal to Critter(). Thank you!

# Constructor Critter
# Demonstrates constructors

class Critter(object):
    """A virtual pet"""
    def __init__(self):
        print "A new critter has been born!"

    def talk(self):
        print "\nHi.  I'm an instance of class Critter."

# main
crit1 = Critter()
crit2 = Critter()

crit1.talk()
crit2.talk()

raw_input("\n\nPress the enter key to exit.")


  

Critter is a class, not a function.  So the syntax
  crit1 = Critter()

is not calling a "Critter" function but constructing an instance of the 
Critter class.  You can tell that by doing something like

 print crit1
 print crit2

Notice that although both objects have the same type (or class), they 
have different ID values.


Since you supply an __init__() method in the class, that's called during 
construction of each object.  So you see that it executes twice.


Classes start to get interesting once you have instance attributes, so 
that each instance has its own "personality."  You can add attributes 
after the fact, or you can define them in __init__().  Simplest example 
could be:


crit1.name = "Spot"
crit2.name = "Fido"

Then you can do something like
 print crit1.name
 print crit2.name

and you'll see they really are different.
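
The other option mentioned above, defining the attribute in __init__(), could look like this sketch (Python 3 syntax). The name parameter is my addition to your class, not something from the original program.

```python
class Critter(object):
    """A virtual pet with its own name."""

    def __init__(self, name):
        self.name = name          # stored on this particular instance

    def talk(self):
        return "Hi. I'm %s, an instance of class Critter." % self.name


crit1 = Critter("Spot")
crit2 = Critter("Fido")

print(crit1.talk())
print(crit2.talk())
print(crit1 is crit2)             # two separate objects of the same class
```

Each call to Critter(...) runs __init__() once with a fresh object as self, which is why the two instances can carry different names.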

DaveA




Re: [Tutor] Matching zipcode in address file

2010-04-04 Thread Dave Angel


Alan Gauld wrote:


"TGW"  wrote


I got the program functioning with
lines = [line for line in infile if line[149:154] not in match_zips]

But this matches records that do NOT match zipcodes. How do I get 
this  running so that it matches zips?



Take out the word 'not' from the comprehension?

That's one change.  But more fundamental is to change the file I/O.  
match_zips appears to be an open file, so every "in" test reads through 
it, and since there's no seek() operation, the next test continues 
wherever the previous one left off.


I'd suggest reading the data from the match_zips into a list, and if the 
format isn't correct, doing some post-processing on it.  But there's no 
way to advise on that since we weren't given the format of either file.


zipdata = match_zips.readlines()
Then you can do an  if XXX in zipdata with assurance.
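
A sketch of that, with the post-processing included (Python 3 syntax). I'm assuming the zip file has one zip code per line, which we haven't actually been told; the function names are made up, and the 149:154 slice is taken from your comprehension.

```python
def load_zip_codes(lines):
    """Turn raw lines into a set of zip codes, dropping newlines and blank lines."""
    return set(line.strip() for line in lines if line.strip())


def matching_records(records, zip_codes, start=149, end=154):
    """Keep only the records whose zip-code field is in zip_codes."""
    return [record for record in records if record[start:end] in zip_codes]
```

With real files, that would be used along the lines of zip_codes = load_zip_codes(open("zips.txt")) followed by lines = matching_records(infile, zip_codes), where "zips.txt" is a placeholder name. The strip() matters because readlines() keeps the trailing newline, so a raw line would never equal a five-character slice.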

DaveA


Re: [Tutor] Extracting lines in a file

2010-04-06 Thread Dave Angel


ranjan das wrote:

Hi,


I am new to python, and especially to file handling.

I need to write a program which reads a unique string in a file and
corresponding to the unique string, extracts/reads the n-th line (from the
line in which the unique string occurs).

I say 'n-th line' as I seek a generalized way of doing it.

For instance lets say the unique string is "documentation" (and
"documentation" occurs more than once in the file). Now, on each instance
that the string "documentation" occurs in the file,  I want to read the 25th
line (from the line in which the string "documentation" occurs)

Is there a goto kind of function in python?

Kindly help

  
You can randomly access within an open file with the seek() function.  
However, if the lines are variable length, somebody would have to keep 
track of where each one begins, which is rather a pain.  Possibly worse, 
on Windows, if you've opened the file in text mode, you can't just count 
the characters you get, since 0d0a is converted to 0a before you get 
it.  You can still do it with a combination of seek() and tell(), however.


Three easier possibilities, if any of them applies:

1) If the lines are fixed in size, then just randomly access using 
seek() before the read.


2) If the file isn't terribly big, read it into a list with readlines(), 
and randomly access the list.


3) If the file is organized in "records" (groups of lines), then read 
and process a record at a time, rather than a line at a time.  A record 
might be 30 lines, and if you found something on the first line of the 
record, you want to modify the 26th line (that's your +25).  Anyway, 
it's possible to make a wrapper around file so that you can iterate 
through records, rather than lines.
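
For possibility 2, a minimal sketch in Python 3 syntax. The function name and its parameters are made up for the example, and it assumes the file is small enough to hold in memory as a list of lines.

```python
def lines_after_matches(lines, needle, offset):
    """For every line containing needle, collect the line offset lines later."""
    results = []
    for index, line in enumerate(lines):
        if needle in line:
            target = index + offset
            if target < len(lines):       # skip matches too close to the end
                results.append(lines[target])
    return results
```

With a real file this would be called as something like lines_after_matches(open("myfile.txt").readlines(), "documentation", 25), where "myfile.txt" is a placeholder name.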


HTH
DaveA


Re: [Tutor] Sequences of letter

2010-04-12 Thread Dave Angel


Or more readably:

from string import lowercase as letters
for c1 in letters:
    for c2 in letters:
        for c3 in letters:
            print c1+c2+c3
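
If the number of letters should be adjustable, itertools.product can stand in for the nested loops. This is a sketch in Python 3 syntax (where the constant is string.ascii_lowercase), and the function name letter_sequences is made up.

```python
from itertools import product
from string import ascii_lowercase


def letter_sequences(length=3, alphabet=ascii_lowercase):
    """Yield every string of the given length, from 'aaa' through 'zzz'."""
    for combo in product(alphabet, repeat=length):
        yield "".join(combo)
```

Because it is a generator, the strings are produced one at a time, so "for s in letter_sequences(): print(s)" never holds the whole sequence in memory.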


Yashwin Kanchan wrote:

Hi Juan

Hope you have got the correct picture now...

I just wanted to show you another way of doing the above thing in just 4
lines.

for i in range(65,91):
    for j in range(65,91):
        for k in range(65,91):
            print chr(i)+chr(j)+chr(k),


On 12 April 2010 06:12, Juan Jose Del Toro  wrote:

  

Dear List;

I have embarked myself into learning Python, I have no programming
background other than some Shell scripts and modifying some programs in
Basic and PHP, but now I want to be able to program.

I have been reading Alan Gauld's Tutor which has been very useful and I've
also been watching Bucky Roberts (thenewboston) videos on youtube (I get
lost there quite often but have also been helpful).

So I started with an exercise to do sequences of letters, I want to write a
program that could print out the sequence of letters from "aaa" all the way
to "zzz"  like this:
aaa
aab
aac
...
zzx
zzy
zzz

So far this is what I have:
letras =
["a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","x","y","z"]
letra1 = 0
letra2 = 0
letra3 = 0
for i in letras:
    for j in letras:
        for k in letras:
            print letras[letra1]+letras[letra2]+letras[letra3]
            letra3=letra3+1
        letra2=letra2+1
    letra1=letra1+1

It goes all the way to aaz and then it gives me this error
Traceback (most recent call last):
 File "/home/administrador/programacion/python/letras2.py", line 8, in <module>
   print letras[letra1]+letras[letra2]+letras[letra3]
IndexError: list index out of range
Script terminated.

Am I even in the right path?
I guess I should look over creating a function or something like that
because when I run it I can't even use my computer no memory left

--
¡Saludos! / Greetings!
Juan José Del Toro M.
jdeltoro1...@gmail.com
Guadalajara, Jalisco MEXICO



Re: [Tutor] Move all files to top-level directory

2010-04-12 Thread Dave Angel


Dotan Cohen wrote:

On 12 April 2010 20:12, Sander Sweers  wrote:
  

On 12 April 2010 18:28, Dotan Cohen  wrote:


However, it fails like this:
$ ./moveUp.py
Traceback (most recent call last):
 File "./moveUp.py", line 8, in <module>
   os.rename(f, currentDir)
TypeError: coercing to Unicode: need string or buffer, tuple found
  

os.rename needs the oldname and the new name of the file. os.walk
returns a tuple with 3 values and it errors out.




I see, thanks. So I was sending it four values apparently. I did not
understand the error message.

  
No, you're sending it two values:  a tuple, and a string.  It wants two 
strings.  Thus the error. If you had sent it four values, you'd have 
gotten a different error.


Actually, I will add a check that cwd != $HOME || $HOME/.bin as those
are the only likely places it might run by accident. Or maybe I'll
wrap it in Qt and add a confirm button.


  

os.walk returns you a tuple with the following values:
(the root folder, the folders in the root, the files in the root folder).

You can use tuple unpacking to split each one in separate values for
your loop. Like:

for root, folder, files in os.walk('your path'):
    #do stuff




I did see that while googling, but did not understand it. Nice!


  

Judging from your next message, you still don't understand it.

It might be wise to only have this module print what it would do
instead of doing the actual move/rename so you can work out the bugs
first before it destroys your data.




I am testing on fake data, naturally.

  
Is your entire file system fake?  Perhaps you're running in a VM, and 
don't mind trashing it.


While debugging, you're much better off using prints than really moving 
files around.  You might be amazed how much damage a couple of minor 
bugs could cause.
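
A dry-run version could look something like this sketch (Python 3 syntax). The function name plan_moves is made up. It only builds the list of (source, destination) pairs, so nothing on disk is touched until you've looked at the list and are happy with it.

```python
import os


def plan_moves(top):
    """Return (source, destination) pairs without touching the file system."""
    moves = []
    for root, folders, files in os.walk(top):   # each item is a 3-tuple
        if root == top:
            continue                            # files already at the top stay put
        for name in files:
            moves.append((os.path.join(root, name), os.path.join(top, name)))
    return moves
```

Printing the result, for instance with "for source, destination in plan_moves(os.getcwd()): print(source, '->', destination)", shows what a later os.rename(source, destination) would do.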


DaveA


Re: [Tutor] Move all files to top-level directory

2010-04-12 Thread Dave Angel




Dotan Cohen wrote:

All right, I have gotten quite a bit closer, but Python is now
complaining about the directory not being empty:

✈dcl:test$ cat moveUp.py
#!/usr/bin/python
# -*- coding: utf-8 -*-
import os
currentDir = os.getcwd()

filesList = os.walk(currentDir)
for root, folder, file in filesList:
  

Why is the print below commented out?

    for f in file:
        toMove = root + "/" + f
        #print toMove
        os.rename(toMove, currentDir)

✈dcl:test$ ./moveUp.py
Traceback (most recent call last):
  File "./moveUp.py", line 11, in <module>
    os.rename(toMove, currentDir)
OSError: [Errno 39] Directory not empty


I am aware that the directory is not empty, nor should it be! How can
I override this?

Thanks!

  
Have you looked at the value of "currentDir" ? Is it in a form that's 
acceptable to os.rename() ? And how about toMove? Perhaps it has two 
slashes in a row in it. When combining directory paths, it's generally 
safer to use


os.path.join()

Next, you make no check whether "root" is the same as "currentDir". So 
if there are any files already in the top-level directory, you're trying 
to rename them to themselves.
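
To illustrate the os.path.join() point, here is a tiny sketch in Python 3 syntax. The directory and file names are made up, and I use posixpath (what os.path is on Linux) so the result reads the same on any platform.

```python
import posixpath

root = "/home/user/test/"      # a directory name that happens to end in a slash
name = "photo.jpg"

by_hand = root + "/" + name            # '/home/user/test//photo.jpg'
joined = posixpath.join(root, name)    # '/home/user/test/photo.jpg'
```

join() only adds a separator when one is needed, so it doesn't matter whether the directory string ends with a slash or not.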


I would also point out that your variable names are very confusing. 
"file" is a list of files, so why isn't it plural? Likewise "folder".


DaveA



Re: [Tutor] Move all files to top-level directory

2010-04-13 Thread Dave Angel


Dotan Cohen wrote:

Here is the revised version:

#!/usr/bin/python
# -*- coding: utf-8 -*-
import os
currentDir = os.getcwd()
i = 1
filesList = os.walk(currentDir)
for rootDirs, folders, files in filesList:
  
Actually, the first item in the tuple (returned by os.walk) is singular (a 
string), so I might call it rootDir.  Only the other two needed to be 
changed to plural to indicate that they were lists.

    for f in files:
        if (rootDirs!=currentDir):
            toMove  = os.path.join(rootDirs, f)
            print "--- "+str(i)
            print toMove
            newFilename = os.path.join(currentDir,f)
            renameNumber = 1
            while(os.path.exists(newFilename)):
                print "- "+newFilename
                newFilename = os.path.join(currentDir,f)+"_"+str(renameNumber)
                renameNumber = renameNumber+1
            print newFilename
            i=i+1
            os.rename(toMove, newFilename)

Now, features to add:
1) Remove empty directories. I think that os.removedirs will work here.
2) Prevent race conditions by performing the filename check during
write. For that I need to find a function that fails to write when the
file exists.
3) Confirmation button to prevent accidental runs in $HOME for
instance. Maybe add some other sanity checks. If anybody is still
reading, I would love to know what sanity checks would be wise to
perform.

Again, thanks to all who have helped.


  
Note that it's not just race conditions that can cause collisions.  You 
might have the same name in two distinct subdirectories, so they'll end 
up in the same place.  Which one wins depends on the OS you're running, 
I believe.
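
The name-collision handling in your loop can be pulled out into a small helper, roughly like this sketch (Python 3 syntax). The function name unique_destination and the exists parameter are my own additions; the "_1", "_2" suffix scheme is the one from your script.

```python
import os


def unique_destination(directory, filename, exists=os.path.exists):
    """Return a path in directory that does not exist yet, adding _1, _2, ... if needed."""
    candidate = os.path.join(directory, filename)
    counter = 1
    while exists(candidate):
        candidate = os.path.join(directory, "%s_%d" % (filename, counter))
        counter += 1
    return candidate
```

Passing exists in as a parameter makes the function easy to try out without creating real files. It does not close the race window you mentioned, though, since another process could still create the file between this check and the os.rename() call.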


DaveA

Re: [Tutor] Loop comparison

2010-04-16 Thread Dave Angel

Christian Witts wrote:

Ark wrote:

Hi everyone.
A friend of mine suggested that I try the following experiment in Python and 
Java.

It's a simple program to sum all the numbers from 0 to 10.

result = i = 0
while i < 10:
    result += i
    i += 1
print result

The time for this calculations was huge.  It took a long time to give
the result.  But, the corresponding program in Java takes less than 1
second to end.  And if in Java, we make a simple type check per cycle,
it does not take more than 10 seconds in the same machine.  I was not
expecting Python to be faster than Java, but it's too slow.  Maybe
Java optimizes this case and Python doesn't.  Not sure about this.

Thanks
ark

Different methods and their relative benchmarks.  The last two 
functions are shortcuts for what you are trying to do, the last 
function 't5' corrects the mis-calculation 't4' has with odd numbers.
Remember, if you know a better way to do something you can always 
optimize yourself ;)

>>> def t1(upper_bounds):
...   start = time.time()
...   total = sum((x for x in xrange(upper_bounds)))
...   end = time.time()
...   print 'Time taken: %s' % (end - start)
...   print total
...
>>> t1(10)
Time taken: 213.830082178
45
>>> def t2(upper_bounds):
...   total = 0
...   start = time.time()
...   for x in xrange(upper_bounds):
...     total += x
...   end = time.time()
...   print 'Time taken: %s' % (end - start)
...   print total
...
>>> t2(10)
Time taken: 171.760597944
45
>>> def t3(upper_bounds):
...   start = time.time()
...   total = sum(xrange(upper_bounds))
...   end = time.time()
...   print 'Time taken: %s' % (end - start)
...   print total
...
>>> t3(10)
Time taken: 133.12481904
45
>>> def t4(upper_bounds):
...   start = time.time()
...   mid = upper_bounds / 2
...   total = mid * upper_bounds - mid
...   end = time.time()
...   print 'Time taken: %s' % (end - start)
...   print total
...
>>> t4(10)
Time taken: 1.4066696167e-05
45
>>> def t5(upper_bounds):
...   start = time.time()
...   mid = upper_bounds / 2
...   if upper_bounds % 2:
...     total = mid * upper_bounds
...   else:
...     total = mid * upper_bounds - mid
...   end = time.time()
...   print 'Time taken: %s' % (end - start)
...   print total
...
>>> t5(10)
Time taken: 7.15255737305e-06
45
>>> t3(1999)
Time taken: 0.003816121
1997001
>>> t4(1999)
Time taken: 3.09944152832e-06
1996002
>>> t5(1999)
Time taken: 3.09944152832e-06
1997001

A simpler formula is simply
   upper_bounds * (upper_bounds-1) / 2

No check needed for even/odd.
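
A quick way to convince yourself that the closed form agrees with the loop for even and odd bounds alike. This is only a sketch in Python 3 syntax, and the two function names are mine, not from the thread:

```python
def loop_sum(upper_bounds):
    """Sum 0 .. upper_bounds-1 the slow way."""
    return sum(range(upper_bounds))


def formula_sum(upper_bounds):
    """Closed form n*(n-1)/2, with integer division so the result stays an int."""
    return upper_bounds * (upper_bounds - 1) // 2


# One of n and n-1 is always even, so the division is exact for every n.
for n in (0, 1, 10, 1999, 2000):
    assert loop_sum(n) == formula_sum(n)

print(formula_sum(1999))   # 1997001, the same value t5(1999) printed above
```

The formula does a fixed amount of work no matter how large the bound is, which is why t4 and t5 finish in microseconds.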

DaveA

Re: [Tutor] QUESTION REGARDING STATUS OF MY SUBSCRIPTION FW: Auto-response for your message to the "Tutor" mailing list

2010-04-16 Thread Dave Angel


Peter Meagher wrote:

GREETINGS,

THIS EMAIL WOULD INDICATE THAT I AM ON THE SUBSCRIPTION
LIST.

HOWEVER, I GOT ANOTHER EMAIL, THAT CAME IN AT PRECISELY THE
SAME TIME AS THE ORIGINAL MESSAGE THAT I AM FORWARDING YOU.
THAT INDICATES THAT THERE WAS AN ISSUE ADDING ME TO THE
LIST. I'VE PASTED IT IN THE BLOCK OF TEXT BELOW, BUT ABOVE
THE EMAIL THAT I AM FORWARDING YOU.

THANK YOU FOR YOUR ATTENTION.
Peter Meagher



If you do not wish to be subscribed to this list, please simply
disregard this message.  If you think you are being maliciously
subscribed to the list, or have any other questions, send them to
tutor-ow...@python.org.


  
That explains where you go with subscription questions.  The address is 
NOT the same as the one used for posting on the list.  I suspect you 
didn't correctly reply to the original message.


Your other message is an independent point.  It has nothing to do with 
whether you're subscribed or not, but simply is an acknowledgement that 
you're a new poster to the list, and includes some suggestions.  In 
fact, I get that message sometimes, even though I was 3rd highest poster 
here last year.  It's perfectly legal to post without being a 
subscriber, as you could be browsing the messages online.


BTW, all upper-case is considered shouting.  It makes a message much 
harder to read, and more likely to be ignored.


DaveA

Re: [Tutor] Loop comparison

2010-04-16 Thread Dave Angel


ALAN GAULD wrote:
  
The precalculation optimisations are 
taking place.  If you pass it an argument to use for the upper limit of the 
sequence the calculation time shoots up.



I'm still confused about when the addition takes place. 
Surely the compiler has to do the addition, so it should be slower?

I assume you have to run the posted code through cython
prior to running it in Python?

You can probably tell that I've never used Cython! :-)

Alan G.

  
I've never used Cython either, but I'd guess that it's the C compiler 
doing the extreme optimizing.  If all the code, including the loop 
parameters, is local, non-volatile, and known at compile time, the 
compiler could do the arithmetic at compile time, and just store a result 
like    res = 42;


Or it could notice that there's no I/O done, so that the program has 
null effect.  And optimize the whole thing into a "sys.exit()"


I don't know if any compiler does that level of optimizing, but it's 
certainly a possibility.  And such optimizations might not be legitimate 
in stock Python (without type declarations and other assumptions), 
because of the possibility of other code changing the type of globals, 
or overriding various special functions.
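
To see why stock Python can't safely fold the whole loop into a constant, here's a small sketch (Python 3 syntax; the names total and Loud are made up for illustration). The same function does quite different work depending on what it's handed, and that isn't known until run time:

```python
def total(start, n):
    """Add 0 .. n-1 onto start, just like the loop in the original post."""
    result = start
    for i in range(n):
        result += i
    return result


class Loud(object):
    """Hypothetical type that overrides addition and records every call."""

    def __init__(self):
        self.calls = 0

    def __add__(self, other):
        self.calls += 1
        return self


print(total(0, 10))          # plain ints: ordinary arithmetic
loud = total(Loud(), 10)     # same function, but now __add__ runs on each pass
print(loud.calls)
```

A C compiler that knows every variable is a local int has no such worries, which is presumably what lets it precalculate.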


DaveA


Re: [Tutor] Loop comparison

2010-04-17 Thread Dave Angel


Alan Gauld wrote:

"Lie Ryan"  wrote

A friend of mine suggested me to do the next experiment in python 
and Java.

It's a simple program to sum all the numbers from 0 to 10.

result = i = 0
while i < 10:
    result += i
    i += 1
print result



Are you sure you're not causing Java to overflow here? In Java,
Arithmetic Overflow do not cause an Exception, your int will simply wrap
to the negative side.


Thats why I asked if he got a float number back.
I never thought of it just wrapping, I assumed it would convert to 
floats.


Now that would be truly amusing.
If Java gives you the wrong answer much faster than Python gives the 
right one, which is best in that scenario?! :-)


Alan G.

It's been years, but I believe Java ints are 32 bits and longs are 64 bits, 
regardless of whether the implementation is 32-bit or 64-bit.  Just like 
Java strings are all unicode.
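
For what it's worth, the wraparound Lie Ryan describes is easy to imitate in Python, whose own integers never overflow. This is only a sketch: wrap_int32 is a made-up helper that mimics two's-complement 32-bit arithmetic, which is how a Java int behaves.

```python
def wrap_int32(value):
    """Reduce an arbitrary Python int to the range of a signed 32-bit integer."""
    value &= 0xFFFFFFFF                 # keep only the low 32 bits
    if value >= 0x80000000:             # top bit set means a negative number
        value -= 0x100000000
    return value


big = 2 ** 31 - 1                       # largest signed 32-bit value
print(wrap_int32(big))                  # still fits
print(wrap_int32(big + 1))              # wraps around to the most negative value
```

So a sum that silently passes the limit comes back negative (or just wrong), with no exception to warn you.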


DaveA


Re: [Tutor] Python root.

2010-04-18 Thread Dave Angel


Aidas wrote:

Hello.
In here 
http://mail.python.org/pipermail/tutor/2001-February/003385.html You 
had written how to get a root in python. The way is: "from math import 
sqrt" and then "print sqrt( 49 )".


I noticed that if I write just "print sqrt(49)" I get nothing. 

I don't get "nothing," I get an error message.  In particular I get:

Traceback (most recent call last):
File "", line 1, in 
NameError: name 'sqrt' is not defined

So why I need to write "from math import sqrt" instead of write just 
"print sqrt( 49 )"?


P.S. Sorry about english-I'm lithuanian. :)

As the message says, "sqrt" is not defined in the language.  It's 
included in one of the library modules.  Whenever you need code from an 
external module, whether that module is part of the standard Python 
library or something you wrote, or even a third-party library, you have 
to import it before you can use it.  The default method of importing is:


import math
print math.sqrt(49)

Where the prefix qualifier on sqrt means to run the sqrt() specifically 
from the math module.


When a single function from a particular library module is needed many 
times, it's frequently useful to use the alternate import form:


from math import sqrt

which is roughly equivalent to these two lines (except that the name 
math itself is not left defined in your namespace):

import math
sqrt = math.sqrt

The second line basically gives you an alias, or short name, for the 
function from that module.
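
Both forms side by side, as a short sketch. It's written in Python 3 syntax, so print is a function here rather than the statement used above:

```python
import math
from math import sqrt

print(math.sqrt(49))        # qualified name
print(sqrt(49))             # short name bound by the "from" form
print(sqrt is math.sqrt)    # both names refer to the very same function object
```

Leave out both import lines and either call fails with the NameError shown above.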


HTH
DaveA


Re: [Tutor] the binary math "wall"

2010-04-20 Thread Dave Angel

Lowell Tackett wrote:

I'm running headlong into the dilemma of binary math representation, with
game-ending consequences, e.g.:

0.15

0.14999

Obviously, any attempts to manipulate this value, under the misguided assumption that it
is truly "0.15" are ill-advised, with inevitable bad results.

the particular problem I'm attempting to corral is thus:

math.modf(18.15)

(0.14858, 18.0)

with some intermediate scrunching, the above snippet morphs to:

(math.modf(math.modf(18.15)[0]*100)[0])/.6

1.4298

The last line should be zero, and needs to be for me to continue this algorithm.

Any of Python's help-aids that I apply to sort things out, such as formatting (%), or modules like
"decimal" do nothing more than "powder up" the display for visual consumption (turning it
into a string). The underlying float value remains "corrupted", and any attempt to continue with
the math adapts and re-incorporates the corruption.

What I'm shooting for, by the way, is an algorithm that converts a deg/min/sec
formatted number to decimal degrees. It [mostly] worked, until I stumbled upon
the peculiar cases of 15 minutes and/or 45 minutes, which exposed the flaw.

What to do? I dunno. I'm throwing up my hands, and appealing to the "Council".

(As an [unconnected] aside, I have submitted this query as best I know how, using plain text and
the "tu...@..." address. There is something that either I, or my yahoo.com mailer *or
both* doesn't quite "get" about these mailings. But, I simply do my best, following
advice I've been offered via this forum. Hope this --mostly-- works.)

From the virtual desk of Lowell Tackett

One of the cases you mention is 1.666...  The decimal package won't
help that at all. What the decimal package does for you is two-fold:

1) it means that what displays is exactly what's there
2) it means that errors happen in the same places where someone
doing it "by hand" will encounter.

But if you literally have to support arbitrary rational values
(denominators other than 2 or 5), you would need to do fractions, either
by explicitly keeping sets of ints, or by using a fractions library.
And if you have to support arbitrary arithmetic, there's no answer other
than hard analysis.
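
A small sketch of the difference, using only the standard decimal and fractions modules (Python 3 syntax assumed; the particular values are just illustrations):

```python
from decimal import Decimal
from fractions import Fraction

# Binary floats: 0.1, 0.2 and 0.3 are all slightly off, so this is False.
print(0.1 + 0.2 == 0.3)

# Decimal behaves like pencil-and-paper decimal arithmetic, so this is True...
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))

# ...but one third still can't be written exactly in decimal digits (False).
third = Decimal(1) / Decimal(3)
print(third * 3 == Decimal(1))

# Fraction keeps numerator and denominator as ints, so thirds are exact (True).
print(Fraction(1, 3) * 3 == 1)
```

Decimal moves the rounding to where a person doing it by hand would expect it, and Fraction avoids it entirely for plain rational arithmetic.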

This is not a Python-specific problem. Floating point has had such
issues in every language I've dealt with since 1967, when I first
learned Fortran. If you compare two values, the simplest mechanism is

abs(a-b) < delta

where you have to be clever about what small value to use for delta.
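
As a sketch (Python 3 syntax; the helper name nearly_equal and the default of 1e-9 are arbitrary choices of mine, not a recommendation):

```python
def nearly_equal(a, b, delta=1e-9):
    """True if a and b differ by less than delta (an absolute tolerance)."""
    return abs(a - b) < delta


x = 0.1 + 0.2
print(x == 0.3)                 # exact comparison is too strict: False
print(nearly_equal(x, 0.3))     # tolerant comparison: True
```

A fixed delta only makes sense when you know the rough size of the values. For numbers that may be very large or very small, a tolerance relative to their magnitude is usually the better choice.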

If all values are made up of degrees/minutes/seconds, and seconds is a
whole number, then store values as num-seconds, and do all arithmetic on
those values. Only convert them back to deg/min/sec upon output.
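
A minimal sketch of that idea, assuming non-negative angles and whole seconds (Python 3 syntax; both function names are made up):

```python
def dms_to_seconds(degrees, minutes, seconds):
    """Collapse whole degrees/minutes/seconds into one integer count of seconds."""
    return (degrees * 60 + minutes) * 60 + seconds


def seconds_to_dms(total_seconds):
    """Split an integer count of seconds back into (degrees, minutes, seconds)."""
    minutes, seconds = divmod(total_seconds, 60)
    degrees, minutes = divmod(minutes, 60)
    return degrees, minutes, seconds


a = dms_to_seconds(18, 15, 0)
b = dms_to_seconds(0, 45, 0)
print(seconds_to_dms(a + b))    # integer arithmetic, so no rounding anywhere
print(a / 3600.0)               # decimal degrees, computed once at the very end
```

The troublesome 15 and 45 minute cases are plain integers here, so nothing is lost until the final division.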

DaveA


Re: [Tutor] the binary math "wall"

2010-04-21 Thread Dave Angel




Lowell Tackett wrote:

--- On Tue, 4/20/10, Steven D'Aprano  wrote:

  

From: Steven D'Aprano 




The simplest, roughest way to fix these sorts of problems
(at the risk 
of creating *other* problems!) is to hit them with a

hammer:



round(18.15*100) == 1815
  

True



Interestingly, this is the result I got when I entered the same snippet:

Python 2.5.1 (r251:54863, Oct 14 2007, 12:51:35)
[GCC 3.4.1 (Mandrakelinux 10.1 3.4.1-4mdk)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
  

round(18.15)*100 == 1815


False
  



But you typed it differently than Steven.  He had   round(18.15*100), 
and you used round(18.15)*100


Very different.   His point boils down to comparing integers, and when 
you have dubious values, round them to an integer before comparing.  I 
have my doubts, since in this case it would lead to bigger sloppiness 
than necessary.


round(18.154 *100) == 1815

probably isn't what you'd want.

So let me ask again, are all angles a whole number of seconds?  Or can 
you make some assumption about how accurate they need to be when first 
input (like tenths of a second, or whatever)?  If so use an integer as 
follows:


val = round(((((degrees*60)+minutes)*60) + seconds)*10)

The 10 above is assuming that tenths of a second are your quantization.
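
Wrapped in a function, that might look like the sketch below (Python 3 syntax; the name dms_to_tenths is mine, and the sample angle is the bearing that comes up later in this thread):

```python
def dms_to_tenths(degrees, minutes, seconds):
    """Angle as an integer number of tenths of an arc-second."""
    return int(round((((degrees * 60) + minutes) * 60 + seconds) * 10))


val = dms_to_tenths(3, 22, 49.6)
print(val)
print(val == dms_to_tenths(3, 22, 49.6000001))   # noise below a tenth vanishes
```

Once the value is an integer, equality tests and sums behave exactly as they would on paper.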

HTH
DaveA


Re: [Tutor] the binary math "wall"

2010-04-21 Thread Dave Angel


Lowell Tackett wrote:
From the virtual desk of Lowell Tackett  




--- On Wed, 4/21/10, Dave Angel  wrote:

  

From: Dave Angel 
Subject: Re: [Tutor] the binary math "wall"
To: "Lowell Tackett" 
Cc: tutor@python.org, "Steven D'Aprano" 
Date: Wednesday, April 21, 2010, 6:46 AM


Lowell Tackett wrote:


--- On Tue, 4/20/10, Steven D'Aprano 
  

wrote:

  
  

From: Steven D'Aprano 


The simplest, roughest way...hit them with a
hammer:



round(18.15*100) == 1815
  
  
   


True



...when I tried...:

Python 2.5.1 (r251:54863, Oct 14 2007, 12:51:35)
[GCC 3.4.1 (Mandrakelinux 10.1 3.4.1-4mdk)] on linux2
Type "help", "copyright", "credits" or "license" for
  

more information.

  
  

round(18.15)*100 == 1815



False
  

  

But you typed it differently than Steven.  He
had   round(18.15*100), and you used
round(18.15)*100



As soon as I'd posted my answer I realized this mistake.

  

Very different.   His point boils down to
comparing integers, and when you have dubious values, round
them to an integer before comparing.  I have my doubts,
since in this case it would lead to bigger sloppiness than
necessary.

round(18.154 *100) == 1815

probably isn't what you'd want.

So let me ask again, are all angles a whole number of
seconds?  Or can you make some assumption about how
accurate they need to be when first input (like tenths of a
second, or whatever)?  If so use an integer as
follows:

val = round(((((degrees*60)+minutes)*60) + seconds)*10)

The 10 above is assuming that tenths of a second are your
quantization.

HTH
DaveA





I recall (from a brief foray into college Chem.) that a result could not be 
displayed with precision greater than that of the least precise component that bore 
[the result].  So, yes, I could accept my input as the arbiter of accuracy.

A scenario:

Calculating the coordinates of a forward station from a given base station 
would require [perhaps] the bearing (an angle from north, say) and distance 
from hither to there.  Calculating the north coordinate would set up this 
relationship, e.g.:

cos(3° 22' 49.6") x 415.9207'(Hyp) = adjacent side(North)

My first requirement, and this is the struggle I (we) are now engaged in, is to 
convert my bearing angle (3° 22' 49.6") to decimal degrees, such that I can 
assign its proper cosine value.  Now, I am multiplying these two very refined 
values (yes, the distance really is honed down to 10,000ths of a foot; that's normal 
in surveying data); within the bowels of the computer's blackboard scratch-pad, I 
cannot allow errors to evolve and emerge.

Were I to accumulate many of these "legs" into perhaps a 15 mile 
traverse-accumulating little computer errors along the way-the end result could be 
catastrophically wrong.

(Recall that in the great India Survey in the 1800's, Waugh got the elevation of Mt. 
Everest wrong by almost 30 feet for just this exact same reason.)  In surveying, we have 
a saying, "Measure with a micrometer, mark with chalk, cut with an axe".  
Accuracy [in math] is a sacred tenet.

So, I am setting myself very high standards of accuracy, simply because those 
are the standards imposed by the project I am adapting, and I can require 
nothing less of my finished project.

  
If you're trying to be accurate when calling cos, why are you using 
degrees?  The cosine function takes an angle in radians.  So what you 
need is a method to convert from deg/min/sec to radians.  And once you 
have to call trig, you can throw out all the other nonsense about 
getting exact values.  Trig functions don't take arbitrary number 
units.  They don't take decimals, and they don't take fractions.  They 
take double-precision floats.
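
Something along these lines, as a sketch only (Python 3 syntax; dms_to_radians is a made-up helper, and the bearing and distance are the ones from the example above):

```python
import math


def dms_to_radians(degrees, minutes, seconds):
    """Convert a deg/min/sec angle to radians by way of decimal degrees."""
    decimal_degrees = degrees + minutes / 60.0 + seconds / 3600.0
    return math.radians(decimal_degrees)


bearing = dms_to_radians(3, 22, 49.6)
distance = 415.9207                      # feet
north = math.cos(bearing) * distance     # adjacent side
print(north)
```

Everything after the conversion is ordinary double-precision arithmetic, so there's no point insisting on exact decimal values before it.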


Perhaps you don't realize the amount of this quantization error we've 
been talking about.  The double type is 64bits in size, and contains the 
equivalent of about 18 decimal digits of precision.  (Assuming common 
modern architectures, of course)



Your angle is specified to about 5 digits of precision, and the distance 
to 7.  So it would take a VERY large number of typical calculations for 
errors in the 18th place to accumulate far enough to affect those.


The real problem, and one that we can't solve for you, and neither can 
Python, is that it's easy to do calculations starting with 8 digits of 
accuracy, and the result be only useful to 3 or 4.  For example, simply 
subtract two very close numbers, and use the result as though it were 
meaningful.
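
A worked illustration of that loss, using two invented measurements that are each good to 8 digits (Python 3 syntax; the values and the uncertainty are assumptions chosen for the example):

```python
a = 415.92071            # two hypothetical measurements, each good to 8 digits
b = 415.92034
uncertainty = 0.000005   # half a unit in the last recorded place

rel_before = uncertainty / a                 # relative error of one input
diff = a - b
rel_after = (2 * uncertainty) / abs(diff)    # worst case: both errors add up

print(rel_before)
print(diff)
print(rel_after)
```

The inputs are good to about one part in a hundred million, but the difference is only good to a few percent. None of that loss comes from the floating point; it's in the data.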


I once had a real customer send us a letter asking about the math 
precision of a calculation he was doing.  I had written the math 
microcode of the machine he was using (from add and subtract, up to 
trigs and logs, I

Re: [Tutor] the binary math "wall"

2010-04-21 Thread Dave Angel


Steven D'Aprano wrote:

On Thu, 22 Apr 2010 01:37:35 am Lowell Tackett wrote:



Were I to accumulate many of these "legs" into perhaps a 15 mile
traverse-accumulating little computer errors along the way-the end
result could be catastrophically wrong.



YES!!! 

And just by being aware of this potential problem, you are better off 
than 90% of programmers who are blithely unaware that floats are not 
real numbers.



  
Absolutely.  But "catastrophically wrong" has to be defined, and 
analyzed.  If each of these measurements is of 100 feet, measured to an 
accuracy of .0001 feet, and you add up the measurements in Python 
floats, you'll be adding 750 measurements, and your human error could 
accumulate to as much as .07 feet.


The same 750 floating point adds, each to 15 digits of quantization 
accuracy (thanks for the correction, it isn't 18) will give a maximum 
"computer error" of  maybe .1 feet.  The human error is much 
larger than the computer error.
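
The computer's share can be measured rather than estimated. A sketch in Python 3 syntax, with a made-up leg length of 100.0001 feet standing in for real data, and exact rational arithmetic from the fractions module as the yardstick:

```python
from fractions import Fraction

leg = 100.0001                       # one hypothetical 100-foot leg
legs = [leg] * 750

float_total = 0.0
for length in legs:                  # ordinary floating-point accumulation
    float_total += length

exact_total = Fraction(leg) * 750    # exact arithmetic on the very same inputs
computer_error = abs(Fraction(float_total) - exact_total)

print(float(computer_error))         # accumulated rounding error, in feet
print(750 * 0.0001)                  # worst-case accumulated measuring error
```

The first number comes out many orders of magnitude smaller than the second, which is the point of the comparison.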


No results can be counted on without some analysis of both sources of 
error.  Occasionally, the "computer error" will exceed the human, and 
that depends on the calculations you do on your measurements.


HTH,
DaveA

Re: [Tutor] sys.path and the path order

2010-04-23 Thread Dave Angel


Garry Willgoose wrote:
My 
question is so simple I'm surprised I can't find an answer somewhere. 
I'm interested if I can rely on the order of the directories in the 
sys.path list. When I'm running a file from the command line like


python tellusim.py

The string in entry sys.path[0] appears to be the full path to the 
location of the file I'm running in this case tellusim ... i.e. it 
looks like '/Volumes/scone2/codes/tellusim0006'. This is good because 
for my code I need to create a search path for modules that is 
relative to the location of this file irrespective of the location I'm 
in when I invoke the script file (i.e. I could be in /Volumes/scone2 
and invoke it by 'python codes/tellusim0006/tellusim.py').


The question is can I rely on entry [0] in sys.path always being the 
directory in which the original file resides (& across linux, OSX and 
Windows)? If not what is the reliable way of getting that information?



As Steven says, that's how it's documented.

There is another way, one that I like better.  Each module, including 
the startup script, has an attribute called __file__, which is the path 
to the source file of that module.


Then I'd use os.path.abspath(), and os.path.dirname() to turn that into 
an absolute path to the directory.
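
Roughly like this. It's a sketch in Python 3 syntax with a made-up helper name; a literal relative path stands in for __file__ so the snippet runs anywhere, and in the real script you'd pass __file__ instead:

```python
import os


def directory_of(source_file):
    """Absolute path of the directory that contains the given source file."""
    return os.path.dirname(os.path.abspath(source_file))


# Inside tellusim.py itself this would be directory_of(__file__).
script_dir = directory_of("codes/tellusim0006/tellusim.py")
print(script_dir)
```

os.path.abspath() resolves a relative path against the current directory, so the result doesn't depend on how the script was invoked, provided you take it before anything changes the working directory.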


The only exception I know of to __file__ usefulness is modules that are 
loaded from zip files.  I don't know if the initial script can come from 
a zip file, but if it does, the question changes.


DaveA


Re: [Tutor] Hi everybody stuck on some error need help please thank you!!

2010-04-24 Thread Dave Angel

(Don't top-post.  Either put your remarks immediately after the part 
they reference, or at the end of the message.  Otherwise, everything's 
thoroughly out of order.)


Marco Rompré wrote:

I tried to enter model = Modele (nom_fichier) but it still does not work.
  
You didn't define the global nom_fichier till after that line.  In 
general, while you're learning, please avoid using the same names for 
global values, class attributes, instance attributes, function parameter 
names, and local variables.   The rules for what a name means change 
depending on where the name is used.


On Fri, Apr 23, 2010 at 11:22 PM, Steven D'Aprano wrote:

  

On Sat, 24 Apr 2010 01:07:11 pm Marco Rompré wrote:



Here's my code:
  

[...]


class Modele:
    """
    Definition of a model with its stores.
    """
    def __init__(self, nom_fichier, magasins =[]):
        self.nom_fichier = nom_fichier
        self.magasins = magasins
  

[...]


if __name__ == '__main__':
    modele = Modele()
  
This is where you got the error, because there's a required argument, 
for parameter nom_fichier.  So you could use

   modele = Modele("thefile.txt")

    nom_fichier = "magasinmodele.txt"
  
I'd call this something else, like  g_nom_fichier.  While you're 
learning, you don't want to get confused between the multiple names that 
look the same.

    modele.charger(nom_fichier)
    if modele.vide():
        modele.initialiser(nom_fichier)
    modele.afficher()

And here's my error :

Traceback (most recent call last):
  File "F:\School\University\Session 4\Programmation
SIO\magasingolfmodele.py", line 187, in 
modele = Modele()
TypeError: __init__() takes at least 2 arguments (1 given)
  

You define Modele to require a nom_fichier argument, but then you try to
call it with no nom_fichier.
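
Put together, the failing call and the fix look something like this sketch (Python 3 syntax; the class is cut down to just the constructor, and g_nom_fichier follows the naming suggestion above):

```python
class Modele(object):
    """Cut-down stand-in for the class in the post; only __init__ matters here."""

    def __init__(self, nom_fichier):
        self.nom_fichier = nom_fichier


try:
    Modele()                              # no argument: the mistake from the post
except TypeError as err:
    print("TypeError:", err)

g_nom_fichier = "magasinmodele.txt"       # define the name first...
modele = Modele(g_nom_fichier)            # ...then pass it to the constructor
print(modele.nom_fichier)
```

The order matters: the name has to exist before the line that uses it runs.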





Re: [Tutor] Binary search question

2010-04-25 Thread Dave Angel


Lie Ryan wrote:

On 04/24/10 23:39, Robert Berman wrote:
  

-Original Message-
From: tutor-bounces+bermanrl=cfl.rr@python.org [mailto:tutor-
bounces+bermanrl=cfl.rr@python.org] On Behalf Of Alan Gauld
Sent: Friday, April 23, 2010 7:41 PM
To: tutor@python.org
Subject: Re: [Tutor] Binary search question

"Emile van Sebille"  wrote

  

   BIG SNIP


And even at 1000 entries, the list creation slowed right
down - about 10 seconds, but the searches even for "-5" were
still around a second.

So 'in' looks pretty effective to me!
  

Now that is most impressive.




But that is with the assumption that comparison is very cheap. If you're
searching inside an object with more complex comparison, say, 0.01
second per comparison, then with a list of 10 000 000 items, with 'in'
you will need on *average* 5 000 000 comparisons which is 50 000 seconds
compared to *worst-case* 24 comparisons with bisect which is 0.24 seconds.

Now, I say that's 208333 times difference, most impressive indeed.


  


The ratio doesn't change with a slow comparison, only the magnitude.

And if you have ten million objects that are complex enough to take .01 
secs per comparison, chances are it took a day or two to load them up 
into your list.  Most likely you won't be using a list anyway, but a 
database, so you don't have to reload them each time you start the program.



It's easy to come up with situations in which each of these solutions is 
better than the other.
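
If you'd rather count comparisons than time them, here's one way to sketch it (Python 3 syntax; Counted is a hypothetical wrapper class, and 1000 items keeps the run short):

```python
import bisect


class Counted(object):
    """Hypothetical wrapper that counts how often its values get compared."""
    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        Counted.comparisons += 1
        return self.value == other.value

    def __lt__(self, other):
        Counted.comparisons += 1
        return self.value < other.value


items = [Counted(n) for n in range(1000)]    # already sorted
target = Counted(999)                        # worst case for a linear scan

Counted.comparisons = 0
found = target in items                      # linear search, uses ==
linear_count = Counted.comparisons

Counted.comparisons = 0
index = bisect.bisect_left(items, target)    # binary search, uses <
bisect_count = Counted.comparisons

print(linear_count, bisect_count)
```

The counts stay the same however expensive each comparison is. Whether the gap matters depends on the cost of one comparison and on how the list got built and sorted in the first place.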


DaveA

Re: [Tutor] For loop breaking string methods

2010-04-26 Thread Dave Angel


C M Caine wrote:

Thank you for the clarification, bob.

For any future readers of this thread I include this link[1] to effbot's
guide on lists, which I probably should have already read.

My intention now is to modify list contents in the following fashion:

for index, value in enumerate(L):
L[0] = some_func(value)

Is this the standard method?

[1]: http://effbot.org/zone/python-list.htm

Colin Caine

  

Almost.   You should have said

   L[index] = some_func(value)

The way you had it, it would only replace the zeroth item of the list.

Note also that if you insert or delete from the list while you're 
looping, you can get undefined results.  That's one reason it's common 
to build a new list, and just assign it back when done.  Example would 
be the list comprehension you showed earlier.
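
The two approaches side by side, as a sketch (Python 3 syntax; some_func is just a stand-in that doubles its argument):

```python
def some_func(value):
    """Stand-in for whatever transformation is wanted."""
    return value * 2


L = [1, 2, 3]
for index, value in enumerate(L):
    L[index] = some_func(value)          # replace each item in place
print(L)

M = [1, 2, 3]
M = [some_func(value) for value in M]    # build a new list, then rebind the name
print(M)
```

The in-place version changes the list that every other reference sees. The comprehension leaves the old list alone and only rebinds M.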


DaveA


Re: [Tutor] Is the difference in outputs with different size input lists due to limits on memory with PYTHON?

2010-05-06 Thread Dave Angel


Art Kendall wrote:
I am running Windows 7 64bit Home premium. with quad cpus and 8G 
memory.   I am using Python 2.6.2.


I have all the Federalist Papers concatenated into one .txt file.
Which is how big?  Currently you (unnecessarily) load the entire thing 
into memory with readlines().  And then you do confusing work to split 
it apart again, into one list element per paper.   And for a while 
there, you have three copies of the entire text.  You're keeping two 
copies, in the form of alltext and papers. 

You print out the len(papers).  What do you see there?  Is it correctly 
87 ?  If it's not, you have to fix the problem here, before even going on.


  I want to prepare a file with a row for each paper and a column for 
each term. The cells would contain the count of a term in that paper.  
In the original application in the 1950's 30 single word terms were 
used. I can now use NoteTab to get a list of all the 8708 separate 
words in allWords.txt. I can then use that data in statistical 
exploration of the set of texts.


I have the python program(?) syntax(?) script(?) below that I am using 
to learn PYTHON. The comments starting with "later" are things I will 
try to do to make this more useful. I am getting one step at a time 
to work


It works when the number of terms in the term list is small e.g., 10.  
I get a file with the correct number of rows (87) and count columns 
(10) in termcounts.txt. The termcounts.txt file is not correct when I 
have a larger number of terms, e.g., 100. I get a file with only 40 
rows and the correct number of columns.  With 8700 terms I get only 40 
rows I need to be able to have about 8700 terms. (If this were FORTRAN 
I would say that the subscript indices were getting scrambled.)  (As I 
develop this I would like to be open-ended with the numbers of input 
papers and open ended with the number of words/terms.)




# word counts: Federalist papers

import re, textwrap
# read the combined file and split into individual papers
# later create a new version that deals with all files in a folder rather than having papers concatenated

alltext = file("C:/Users/Art/Desktop/fed/feder16v3.txt").readlines()
papers= re.split(r'FEDERALIST No\.'," ".join(alltext))
print len(papers)

countsfile = file("C:/Users/Art/desktop/fed/TermCounts.txt", "w")
syntaxfile = file("C:/Users/Art/desktop/fed/TermCounts.sps", "w")
# later create a python program that extracts all words instead of using NoteTab

termfile   = open("C:/Users/Art/Desktop/fed/allWords.txt")
termlist = termfile.readlines()
termlist = [item.rstrip("\n") for item in termlist]
print len(termlist)
# check for SPSS reserved words
varnames = textwrap.wrap(" ".join([v.lower() in ['and', 'or', 'not', 'eq', 'ge',
    'gt', 'le', 'lt', 'ne', 'all', 'by', 'to', 'with'] and (v+"_r") or v
    for v in termlist]))
syntaxfile.write("data list file= 'c:/users/Art/desktop/fed/termcounts.txt' free/docnumber\n")

syntaxfile.writelines([v + "\n" for v in varnames])
syntaxfile.write(".\n")
# before using the syntax manually replace spaces internal to a string to underscore // replace (ltrim(rtrim(varname))," ","_")
# replace any special characters with @ in variable names



for p in range(len(papers)):

range(len()) is un-pythonic.  Simply do
for paper in papers:

and of course use paper below instead of papers[p]

   counts = []
   for t in termlist:
      counts.append(len(re.findall(r"\b" + t + r"\b", papers[p], re.IGNORECASE)))

   if sum(counts) > 0:
      papernum = re.search("[0-9]+", papers[p]).group(0)
      countsfile.write(str(papernum) + " " + " ".join([str(s) for s in counts]) + "\n")



Art

If you're memory limited, you really should sequence through the files, 
only loading one at a time, rather than all at once.  It's no harder.  
Use dirlist() to make a list of files, then your loop becomes something 
like:


for  infile in filelist:
 paper = " ".join(open(infile, "r").readlines())

Naturally, to do it right, you should use  with...  Or at least close 
each file when done.
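
A fuller sketch of the one-file-at-a-time loop, in Python 3 syntax. The function names are made up, and os.listdir() is assumed as the call that produces the list of files, since I don't know of a dirlist() in the standard library. re.escape() is added so that a term containing punctuation can't be misread as a regex:

```python
import os
import re


def count_terms(text, terms):
    """Count whole-word, case-insensitive occurrences of each term in text."""
    return [len(re.findall(r"\b" + re.escape(term) + r"\b", text, re.IGNORECASE))
            for term in terms]


def counts_per_file(folder, terms):
    """Yield (filename, counts), holding only one file's text in memory at a time."""
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if not os.path.isfile(path):
            continue
        with open(path, "r") as infile:     # closed automatically on exit
            paper = infile.read()
        yield name, count_terms(paper, terms)
```

Each row of the counts file can then be written as soon as its paper has been counted, so memory use stays flat however many papers or terms there are.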


DaveA


Re: [Tutor] Is the difference in outputs with different size input lists due to limits on memory with PYTHON?

2010-05-06 Thread Dave Angel


Art Kendall wrote:



On 5/6/2010 11:14 AM, Dave Angel wrote:

Art Kendall wrote:
I am running Windows 7 64bit Home premium. with quad cpus and 8G 
memory.   I am using Python 2.6.2.


I have all the Federalist Papers concatenated into one .txt file.
Which is how big?  Currently you (unnecessarily) load the entire 
thing into memory with readlines().  And then you do confusing work 
to split it apart again, into one list element per paper.   And for a 
while there, you have three copies of the entire text.  You're 
keeping two copies, in the form of alltext and papers.
You print out the len(papers).  What do you see there?  Is it 
correctly 87 ?  If it's not, you have to fix the problem here, before 
even going on.


  I want to prepare a file with a row for each paper and a column 
for each term. The cells would contain the count of a term in that 
paper.  In the original application in the 1950's 30 single word 
terms were used. I can now use NoteTab to get a list of all the 8708 
separate words in allWords.txt. I can then use that data in 
statistical exploration of the set of texts.


I have the python program(?) syntax(?) script(?) below that I am 
using to learn PYTHON. The comments starting with "later" are things 
I will try to do to make this more useful. I am getting one step at 
a time to work


It works when the number of terms in the term list is small e.g., 
10.  I get a file with the correct number of rows (87) and count 
columns (10) in termcounts.txt. The termcounts.txt file is not 
correct when I have a larger number of terms, e.g., 100. I get a 
file with only 40 rows and the correct number of columns.  With 8700 
terms I get only 40 rows I need to be able to have about 8700 terms. 
(If this were FORTRAN I would say that the subscript indices were 
getting scrambled.)  (As I develop this I would like to be 
open-ended with the numbers of input papers and open ended with the 
number of words/terms.)




# word counts: Federalist papers

import re, textwrap
# read the combined file and split into individual papers
# later create a new version that deals with all files in a folder 
rather than having papers concatenated

alltext = file("C:/Users/Art/Desktop/fed/feder16v3.txt").readlines()
papers= re.split(r'FEDERALIST No\.'," ".join(alltext))
print len(papers)

countsfile = file("C:/Users/Art/desktop/fed/TermCounts.txt", "w")
syntaxfile = file("C:/Users/Art/desktop/fed/TermCounts.sps", "w")
# later create a python program that extracts all words instead of 
using NoteTab

termfile   = open("C:/Users/Art/Desktop/fed/allWords.txt")
termlist = termfile.readlines()
termlist = [item.rstrip("\n") for item in termlist]
print len(termlist)
# check for SPSS reserved words
varnames = textwrap.wrap(" ".join([v.lower() in ['and', 'or', 'not', 
'eq', 'ge',
'gt', 'le', 'lt', 'ne', 'all', 'by', 'to','with'] and (v+"_r") or v 
for v in termlist]))
syntaxfile.write("data list file= 
'c:/users/Art/desktop/fed/termcounts.txt' free/docnumber\n")

syntaxfile.writelines([v + "\n" for v in varnames])
syntaxfile.write(".\n")
# before using the syntax manually replace spaces internal to a 
string to underscore // replace (ltrtim(rtrim(varname))," ","_")   
replace any special characters with @ in variable names



for p in range(len(papers)):

range(len()) is un-pythonic.  Simply do
for paper in papers:

and of course use paper below instead of papers[p]

   counts = []
   for t in termlist:
  counts.append(len(re.findall(r"\b" + t + r"\b", papers[p], 
re.IGNORECASE)))

   if sum(counts) > 0:
  papernum = re.search("[0-9]+", papers[p]).group(0)
  countsfile.write(str(papernum) + " " + " ".join([str(s) for s 
in counts]) + "\n")



Art

If you're memory limited, you really should sequence through the 
files, only loading one at a time, rather than all at once.  It's no 
harder.  Use os.listdir() to make a list of files, then your loop
becomes something like:


for  infile in filelist:
 paper = " ".join(open(infile, "r").readlines())

Naturally, to do it right, you should use with...  Or at least
close each file when done.
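
A minimal sketch of that one-file-at-a-time loop, assuming each paper sits in its own .txt file inside a single folder. The function name iter_papers and the folder layout are illustrative only and do not come from the original script. It is written for Python 3, but the with statement behaves the same way on 2.6.

```python
import os

def iter_papers(folder):
    """Yield (filename, text) for each .txt file in folder, one at a time."""
    for name in sorted(os.listdir(folder)):
        if not name.lower().endswith(".txt"):
            continue
        path = os.path.join(folder, name)
        # 'with' closes the file as soon as the block ends
        with open(path, "r") as infile:
            yield name, infile.read()
```

Because it is a generator, only one paper's text is held in memory at any moment; the counting loop would simply iterate over iter_papers(some_folder).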


DaveA




Thank you for getting back to me. I am trying to generalize a process 
that 50 years ago used 30 terms on the whole file and I am using the 
task of generalizing the process to learn python.   In the post I sent 
there were comments to myself about things that I would want to learn 
about.  One of the first is to learn about processing all files in a 
folder, so your reply will be very helpful.  It seems that dirlist() 
should allow me to includ

Re: [Tutor] Is the difference in outputs with different size input lists due to limits on memory with PYTHON?

2010-05-06 Thread Dave Angel


Art Kendall wrote:



On 5/6/2010 1:51 PM, Dave Angel wrote:

Art Kendall wrote:



On 5/6/2010 11:14 AM, Dave Angel wrote:

Art Kendall wrote:
I am running Windows 7 64bit Home premium. with quad cpus and 8G 
memory.   I am using Python 2.6.2.


I have all the Federalist Papers concatenated into one .txt file.
Which is how big?  Currently you (unnecessarily) load the entire 
thing into memory with readlines().  And then you do confusing work 
to split it apart again, into one list element per paper.   And for 
a while there, you have three copies of the entire text.  You're 
keeping two copies, in the form of alltext and papers.
You print out the len(papers).  What do you see there?  Is it 
correctly 87 ?  If it's not, you have to fix the problem here, 
before even going on.


  I want to prepare a file with a row for each paper and a column 
for each term. The cells would contain the count of a term in that 
paper.  In the original application in the 1950's 30 single word 
terms were used. I can now use NoteTab to get a list of all the 
8708 separate words in allWords.txt. I can then use that data in 
statistical exploration of the set of texts.


I have the python program(?) syntax(?) script(?) below that I am 
using to learn PYTHON. The comments starting with "later" are 
things I will try to do to make this more useful. I am getting one 
step at a time to work.


It works when the number of terms in the term list is small e.g., 
10.  I get a file with the correct number of rows (87) and count 
columns (10) in termcounts.txt. The termcounts.txt file is not 
correct when I have a larger number of terms, e.g., 100. I get a 
file with only 40 rows and the correct number of columns.  With
8700 terms I also get only 40 rows.  I need to be able to have about
8700 terms. (If this were FORTRAN I would say that the subscript
indices were getting scrambled.)  (As I develop this I would like 
to be open-ended with the numbers of input papers and open ended 
with the number of words/terms.)




# word counts: Federalist papers

import re, textwrap
# read the combined file and split into individual papers
# later create a new version that deals with all files in a folder
# rather than having papers concatenated

alltext = file("C:/Users/Art/Desktop/fed/feder16v3.txt").readlines()
papers= re.split(r'FEDERALIST No\.'," ".join(alltext))
print len(papers)

countsfile = file("C:/Users/Art/desktop/fed/TermCounts.txt", "w")
syntaxfile = file("C:/Users/Art/desktop/fed/TermCounts.sps", "w")
# later create a python program that extracts all words instead of
# using NoteTab

termfile   = open("C:/Users/Art/Desktop/fed/allWords.txt")
termlist = termfile.readlines()
termlist = [item.rstrip("\n") for item in termlist]
print len(termlist)
# check for SPSS reserved words
varnames = textwrap.wrap(" ".join([v.lower() in ['and', 'or', 'not',
    'eq', 'ge', 'gt', 'le', 'lt', 'ne', 'all', 'by', 'to', 'with']
    and (v+"_r") or v for v in termlist]))
syntaxfile.write("data list file= "
    "'c:/users/Art/desktop/fed/termcounts.txt' free/docnumber\n")

syntaxfile.writelines([v + "\n" for v in varnames])
syntaxfile.write(".\n")
# before using the syntax manually replace spaces internal to a
# string to underscore // replace (ltrim(rtrim(varname))," ","_")
# replace any special characters with @ in variable names



for p in range(len(papers)):

range(len()) is un-pythonic.  Simply do
for paper in papers:

and of course use paper below instead of papers[p]

    counts = []
    for t in termlist:
        counts.append(len(re.findall(r"\b" + t + r"\b", papers[p],
                                     re.IGNORECASE)))

    if sum(counts) > 0:
        papernum = re.search("[0-9]+", papers[p]).group(0)
        countsfile.write(str(papernum) + " " +
                         " ".join([str(s) for s in counts]) + "\n")



Art

If you're memory limited, you really should sequence through the 
files, only loading one at a time, rather than all at once.  It's 
no harder.  Use os.listdir() to make a list of files, then your loop
becomes something like:


for  infile in filelist:
 paper = " ".join(open(infile, "r").readlines())

Naturally, to do it right, you should use with...  Or at least
close each file when done.


DaveA




Thank you for getting back to me. I am trying to generalize a 
process that 50 years ago used 30 terms on the whole file and I am 
using the task of generalizing the process to learn python.   In the 
post I sent there were comments to myself about things that I would 
want to learn about.  One of the first is to learn about processing 
all files in a folder, so your reply will be very h

Re: [Tutor] Is the difference in outputs with different size input lists due to limits on memory with PYTHON?

2010-05-07 Thread Dave Angel


Art Kendall wrote:



On 5/6/2010 8:52 PM, Dave Angel wrote:




I got my own copy of the papers, at 
http://thomas.loc.gov/home/histdox/fedpaper.txt


I copied your code, and added logic to it to initialize termlist from 
the actual file.  And it does complete the output file at 83 lines, 
approx 17000 columns per line (because most counts are one digit).  
It takes quite a while, and perhaps you weren't waiting for it to 
complete.  I'd suggest either adding a print to the loop, showing the 
count, and/or adding a line that prints "done" after the loop 
terminates normally.


I watched memory usage, and as expected, it didn't get very high.  
There are things you need to redesign, however.  One is that all the 
punctuation and digits and such need to be converted to spaces.
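
One possible way to do that conversion, as a sketch only: keep letters and turn every run of anything else into a single space. The helper name letters_only is made up for illustration, and treating apostrophes and hyphens as separators is an assumption that may not suit every corpus.

```python
import re

def letters_only(text):
    """Lower-case text and replace every run of non-letters with one space."""
    return re.sub(r"[^a-z]+", " ", text.lower()).strip()
```

Splitting the result with .split() then gives clean words to count.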



DaveA




Thank you for going the extra mile.

I obtained my copy before I retired in 2001 and there are some 
differences.  In the current copy from the LOC papers 7, 63, and 81 
start with "FEDERALIST." (an extra period).  That explains why you 
have 83. There are also some comments, such as the attributed author.  After
the weekend, I'll do a file compare and see differences in more detail.


Please email me your version of the code.  I'll try it as is.  Then 
I'll put in a counter, have it print the count and paper number, and a 
'done' message.


As a check after reading in the counts, I'll include the counts from 
NoteTab and see if these counts sum to those from NoteTab.


I'll use SPSS to create a version of the .txt file with punctuation 
and numerals changed to spaces and try using that as the corpus.   
Then I'll try to create a similar file with Python.


Art

As long as you realize this is very rough.  I just wanted to prove there 
wasn't anything fundamentally wrong with your approach.  But there's 
still lots to do, especially with regards to cleaning up the text before 
and between the papers.  Anyway, here it is.


#!/usr/bin/env python

sourcedir = "data/"
outputdir = "results/"

# word counts: Federalist papers
import sys, os
import re, textwrap

# Create the output directory if it doesn't exist
if not os.path.exists(outputdir):
    os.makedirs(outputdir)

# read the combined file and split into individual papers
# later create a new version that deals with all files in a folder
# rather than having papers concatenated
alltext = file(sourcedir + "feder16.txt").readlines()

filtered = " ".join(alltext).lower()
for ch in ('" ' + ". , ' * - ( ) = @ [ ] ; . ` 1 2 3 4 5 6 7 8 9 0 > : / ?").split():
    filtered = filtered.replace(ch, " ")
# todo:   make a better filter, such as keeping only letters, rather than
#         replacing specific characters

words = filtered.split()
print "raw word count is", len(words)

wordset = set(words)
print "wordset reduces it from/to", len(words), len(wordset)
# eliminate words shorter than 4 characters
words = sorted([word for word in wordset if len(word) > 3])
del wordset    # free space of wordset
print "Eliminating words under 4 characters reduces it to", len(words)

# print the first 50
for word in words[:50]:
    print word

print "alltext is size", len(alltext)
papers = re.split(r'FEDERALIST No\.', " ".join(alltext))
print "Number of detected papers is ", len(papers)

# print first 50 characters of each, so we can see why some of them are
# missed by our regex above
for index, paper in enumerate(papers):
    print index, "***", paper[:50]

countsfile = file(outputdir + "TermCounts.txt", "w")
syntaxfile = file(outputdir + "TermCounts.sps", "w")
# later create a python program that extracts all words instead of using
# NoteTab
#termfile   = open("allWords.txt")
#termlist = termfile.readlines()
#termlist = [item.rstrip("\n") for item in termlist]
#print "termlist is ", len(termlist)

termlist = words

# check for SPSS reserved words
varnames = textwrap.wrap(" ".join([v.lower() in ['and', 'or', 'not',
    'eq', 'ge', 'gt', 'le', 'lt', 'ne', 'all', 'by', 'to', 'with']
    and (v+"_r") or v for v in termlist]))
syntaxfile.write("data list file= "
    "'c:/users/Art/desktop/fed/termcounts.txt' free/docnumber\n")

syntaxfile.writelines([v + "\n" for v in varnames])
syntaxfile.write(".\n")
# before using the syntax manually replace spaces internal to a string
# to underscore // replace (ltrtim(rtrim(varname))," ","_")   replace any
# special characters with @ in variable names

for p, paper in enumerate(papers):
    counts = []
    for t in termlist:
        counts.append(len(re.findall(r"\b" + t + r"\b", paper,
                                     re.IGNORECASE)))

    print p, counts[:5]
    if sum(counts) > 0:
        papernum = re.search("[0-9]+", papers[p]).group(0)
        countsfile.write(str(papernum) + " " +
                         " ".join([str(s) for s in counts]) + "\n")


DaveA
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor

Re: [Tutor] (no subject)

2010-05-11 Thread Dave Angel


Sivapathasuntha Aruliah wrote:

Hi
I am learning Python. When I try to run any of the programs, for example
csv2html1_ans.py, it displays the following message. The error occurs
with both Python24 & Python 31, that is, whichever of the following
commands I give.


COMMAND GIVEN
1.C:\python24\python.exe C:\py3eg\quadratic.py
2.C:\python31\python.exe C:\py3eg\quadratic.py

The message below appears with the program name. Please advise me how to
get past this issue.

ERROR MESSAGE
command  C:\py3eg\csv2html1_ans.py is not a valid Win32 application

Regards,
Siva
Test Equipment Engineering
Amkor Technology (S) Pte Ltd
1 Kaki Bukit View
#03-28 TechView Building
Singapore 415941
Tel: (65) 6347 1131
Fax: (65) 6746 4815
  
Please copy and paste the actual contents of your DOS box, rather than 
paraphrasing.  COMMAND hasn't been the normal shell name since Win95 
days.  You can't use numbers in front of commands in any shell I've 
used.  The error message refers to a different file than anything you 
specified in your commands.


What OS are you using?

DaveA


Re: [Tutor] Help required to count no of lines that are until 1000 characters

2010-05-11 Thread Dave Angel


ramya natarajan wrote:

Hello,

I am a complete beginner at programming. I was given the task to write a
loop that reads each line of a file and counts the number of lines read
until the total length of the lines is 1,000 characters. I have to read
lines from the file up to exactly 1000 characters.

Here is my code:
I created the file /tmp/new.txt, which has 100 lines and 2700 characters.
I wrote code that should read exactly 1000 characters and count the lines
up to that point. But the problem is that it reads entire lines and does
not stop at exactly 1000 characters. Can someone tell me what mistake I am
making here?


   log = open('/tmp/new.txt','r')
   lines,char = 0,0
   for line in log.readlines():
       while char < 1000 :
           for ch in line :
               char += len(ch)
           lines += 1
   print char , lines

Output:
   1026 , 38

It is counting the entire line instead of characters up to 1000.

Can someone point out what mistake I am making here, and why it is not
stopping at 1000? I am reading only char by char.

My new.txt contains content like
this is my new number\n

Can someone please help? I spent hours and hours trying to find the issue
but I am not able to figure it out. Any help would be greatly appreciated.
Thank you
Ramya

  
The problem is ill-specified (contradictory).  It'd probably be better 
to give the exact wording of the assignment.


If you read each line of the file, then it would only be a coincidence 
if you read exactly 1000 characters, as most likely one of those lines 
will overlap the 1000 byte boundary.



But you have a serious bug in your code, that nobody in the first five 
responses has addressed.  That while loop will loop over the first line 
repeatedly, till it reaches or exceeds 1000, regardless of the length of 
subsequent lines.  So it really just divides 1000 by the length of that 
first line.  Notice that the lines += 1 will execute multiple times for 
a single iteration of the for loop.


Second, once 1000 is reached, the for loop does not quit.  So it will 
read the rest of the file, regardless of how big the file is.  It just 
stops adding to lines or char, since char reached 1000 on the first line.


The simplest change to your code which might accomplish what you want is 
to put the whole thing inside a function, and return from the function 
when the goal is reached.  So instead of a while loop, you need some 
form of if test.  See if you can run with that.  Remember that return 
can return a tuple (pair of numbers).
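
As a rough sketch of that shape (one possible reading of the assignment, not the only one): a function with an if test inside the for loop that returns a pair as soon as the running total reaches the limit. The name count_until and the limit parameter are invented for this illustration, and it is written for Python 3.

```python
def count_until(lines, limit=1000):
    """Return (total_chars, line_count) once total_chars reaches limit."""
    total = 0
    count = 0
    for line in lines:
        total += len(line)
        count += 1
        if total >= limit:
            return total, count   # stop reading as soon as the goal is met
    return total, count           # input ended before the limit was reached
```

Since iterating over an open file yields its lines, the same function can be called as count_until(open(path)) without reading the whole file first.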


There are plenty of other optimizations and approaches, but you'll learn 
best by incrementally fixing what you already have.


DaveA






Re: [Tutor] (no subject)

2010-05-11 Thread Dave Angel

(1. Please don't top-post.  It gets everything out of sequence, and is 
the wrong convention for this forum
2. Be sure and do a reply-all, so that the message goes to the forum.  
I'm not here to give private advice.
3. Use your editor's reply-quoting so that we can tell who wrote which 
parts.  Normally, you'll see that as either a leading ">" character or 
as a "!" character.  And one can tell which parts were written by whom 
by counting the number of those at the beginning of each line)


For my real response, see the end of the message, where it belongs.

Sivapathasuntha Aruliah wrote:

Dave
Thank you very much for your response. I think I have a problem with both
Python23 & Python31. Please help.


Python23 : The program works, but the programs written by Mark Summerfield
in his book Programming in Python 3 do not work.

Python 31: When I run this program, the pop-up window says
"C:\py3eg\csv2html1_ans.py is not a valid Win32 application" and the
DOS box says "Access is denied."

Below is the dos box contents


C:\>cd python31

C:\Python31>python C:\py3eg\quadratic.py
Access is denied.

C:\Python31>python C:\py3eg\quadratic.py
Access is denied.

C:\Python31>python C:\py3eg\hello.py
Access is denied.

C:\Python31>python.exe C:\py3eg\hello.py
Access is denied.

C:\Python31>cd..

C:\>cd python23

C:\Python23>python.exe C:\py3eg\hello.py
('Hello', 'World!')

C:\Python23>python.exe C:\py3eg\print_unicode.py
Traceback (most recent call last):
  File "C:\py3eg\print_unicode.py", line 30, in ?
print_unicode_table(word)
NameError: name 'print_unicode_table' is not defined

C:\Python23>python.exe C:\py3eg\quadratic.py
  File "C:\py3eg\quadratic.py", line 14
except ValueError as err:
   ^
SyntaxError: invalid syntax




Regards,
Siva
Test Equipment Engineering
Amkor Technology (S) Pte Ltd
1 Kaki Bukit View
#03-28 TechView Building
Singapore 415941
Tel: (65) 6347 1131
Fax: (65) 6746 4815



Dave Angel 


05/12/2010 09:50 AM


To
Sivapathasuntha Aruliah/S1/a...@amkor
cc
tutor@python.org
Subject
Re: [Tutor] (no subject)








Sivapathasuntha Aruliah wrote:

Hi
I am learning Python. When I tried to run any of the program for example
csv2html1_ans.py it displays the following message. This error is coming
on both Python24 & Python 31. That is whether i give the any one of the
following command


COMMAND GIVEN
1.C:\python24\python.exe C:\py3eg\quadratic.py
2.C:\python31\python.exe C:\py3eg\quadratic.py

A message below appears with the program name. Please advice me how to
get over from this issue

ERROR MESSAGE
command  C:\py3eg\csv2html1_ans.py is not a valid Win32 application

Regards,
Siva
Test Equipment Engineering
Amkor Technology (S) Pte Ltd
1 Kaki Bukit View
#03-28 TechView Building
Singapore 415941
Tel: (65) 6347 1131
Fax: (65) 6746 4815


Please copy and paste the actual contents of your DOS box, rather than 
paraphrasing.  COMMAND hasn't been the normal shell name since Win95 
days.  You can't use numbers in front of commands in any shell I've 
used.  The error message refers to a different file than anything you 
specified in your commands.


What OS are you using?

DaveA



  

Again, what OS are you using?

I have no idea what the pop up comes from, but I suspect you have some 
non-trivial code in that python program, perhaps that creates a gui.  Is 
there any tkinter stuff in it?


As for "Access is Denied", it usually means you tried to access a 
non-existent drive, or one which isn't currently mounted.  For example, 
referencing your CD drive with no platter in it.


I don't know why "print_unicode_table" is undefined, but apparently 
you're missing some code.



And the except clause changed between 2.x and 3.x, so you need to change
the syntax to match the particular interpreter you're using.  They're
not compatible.  There's a utility (2to3) to convert from 2.x to 3.x, but
I don't think there's anything that reverses it.
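
To make the difference concrete, here is a small sketch; the function parse_number is invented for illustration. The "as" spelling is what Python 3 requires, and it is also accepted by 2.6 and later, while 2.5 and older (including the 2.3 used above) only understand the comma form and report exactly the SyntaxError shown in the DOS box.

```python
def parse_number(text):
    """Return float(text), or None if text is not a valid number."""
    try:
        return float(text)
    except ValueError as err:   # 2.5 and older need: except ValueError, err:
        print("bad input:", err)
        return None
```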


I'd suggest picking one version, and using only books and references 
that are compatible with it till you're comfortable with the language.


DaveA


Re: [Tutor] First steps for C++/Qt developers

2010-05-12 Thread Dave Angel




M. Bashir Al-Noimi wrote:




   
Hmm, you confused me. I'm still a newbie and I don't know anything
about the differences between C++ & Python, so I couldn't understand you.
How is C++ a static language?!


In C++, every variable is declared, and the type of that variable is 
static over its lifetime.  The only flexibility there is that a variable 
may also get a value of some derived type of its declared type.  In 
Python, variables have no fixed type at all, only the objects (that 
they're bound to) have type.  A variable can be an integer one time, a 
string the next, and an arbitrary object after that.
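
A tiny illustration of the Python side of that, with made-up values: the same name is rebound to objects of three different types, and it is the object, not the name, that reports its type.

```python
value = 42
kinds = [type(value).__name__]      # the name is bound to an int

value = "forty-two"
kinds.append(type(value).__name__)  # same name, now bound to a str

value = [4, 2]
kinds.append(type(value).__name__)  # and now to a list

print(kinds)
```

In C++ the equivalent would need three separately declared variables, or a compile error would result.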


DaveA


Re: [Tutor] Find Elements in List That Equal A Specific Value

2010-05-12 Thread Dave Angel


Su Chu wrote:

Hi there,

I am new to Python. I am attempting to either define a "which" statement or
to find a method that already exists to do this sort of operation.

My problem is as follows:
I have three lists, one with unique values (list 1), one a sequence of
values that are not necessarily unique (list2), and a shorter list with the
unique values of list 2 (list 3). List 1 and List 2 are of equal lengths.


An example:
list1 = [ 1, 2, 3, 4, 5, 6 ]
list2 = [ 2, 2, 2, 5, 6, 6 ]
list3 = [2, 5, 6]

What I would like to do is find and sum the elements of list 1 given its
corresponding element in list 2 is equal to some element in list 3.

For example,
the sum of the values in list1 given list2[i]==2
would be 1 + 2 + 3 = 6.
the sum of the values in list1 given list2[i]==5
would be 4
the sum of the values in list1 given list2[i]==6
would be 5 + 6 = 11

and so on. Obtaining these values, I'd like to store them in a vector.

This seems pretty simple if a 'which' statement exists e.g. (which values in
list 1 == list3[k], and looping through k), but I can't find one. To write
one seems to require a loop.

  

What's wrong with a loop?

I'm at a loss, if you could help I'd really appreciate it!

  
If this homework has a requirement that says don't use a loop, or don't 
use Python 3, or don't get the answer from the internet, how about 
saying so?


To see if something is in a list, use:
 if x in list3

To combine two lists, use zip()
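
Here is how those two pieces behave, shown on deliberately unrelated sample data (the names and teams lists are invented, so this is not the homework answer):

```python
names = ["ann", "bob", "cy", "dee"]
teams = ["red", "blue", "red", "green"]
wanted = ["red", "green"]

# zip() pairs up the items that share a position in the two lists
pairs = list(zip(names, teams))

# "in" tests membership, so this keeps the names whose team is wanted
picked = [name for name, team in pairs if team in wanted]
print(picked)
```

Note that the list comprehension is still a loop, just a compact one.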

To get more help, post some code that you've actually tried, and tell us 
why you think it's wrong.  Then we can help fix it, rather than just 
solving the homework directly.


DaveA

Re: [Tutor] raw_input a directory path

2010-05-12 Thread Dave Angel




Spencer Parker wrote:

Here is the code:
http://dpaste.com/hold/193862/

It still isn't working for me.  I don't see it hitting the first for loop
or even the second one.  It runs without any error at all.

I am inputting the directory as:
\\Documents\ and\ Settings\\user\\Desktop\\test
  

When using raw_input(), no characters are substituted and none need 
escaping.  It's not a literal to need double-backslashing, and it's not 
a Unix shell, to need escaping of the space character.  What you type is 
what you get, other than things like backspace and enter.


Prove it to yourself with print, and then type it straight.  You might 
also add an extra (untested) :


if  not (os.path.exists(directory) and os.path.isdir(directory)):
  print "Not a valid directory"
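
The same check can be wrapped in a small helper, sketched here for Python 3 (where raw_input() is spelled input()). The name clean_directory is hypothetical, and stripping surrounding blanks from what the user typed is an assumption about what is wanted.

```python
import os

def clean_directory(typed):
    """Return the typed path (minus surrounding blanks) if it is a directory, else None."""
    path = typed.strip()
    # isdir() is already False for paths that do not exist
    if os.path.isdir(path):
        return path
    return None
```

Calling it on exactly what the user typed, with single backslashes and unescaped spaces, is enough; no extra escaping step is needed.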


DaveA


Re: [Tutor] Design Question: File Object used everywhere

2010-05-14 Thread Dave Angel


Jan Jansen wrote:

Hi there,

I'm working on code to read and write large amounts of binary data
according to a given specification. The specification defines a lot of
"segments". The segments in turn have definitions of datatypes and what
they represent, how many of some of the data values are present in the
file, and sometimes the offset from the beginning of the file.


Now I wonder what would be a good way to model the code.

Currently I have one class, the "FileReader". This class holds the file
object, information about the endianness, and a method to read data
(using the struct module). Then I have more classes representing the
segments. In those classes I define data formats, call the read method of
the FileReader object, and hold the data. Currently I'm passing the
FileReader object as an argument.


Here some examples, first the "FileReader" class:

import struct

class JTFile():

    def __init__(self, file_obj):
        self.file_stream = file_obj
        self.version_string = ""
        self.endian_format_prefix = ""

    def read_data(self, fmt, pos = None):
        format_size = struct.calcsize(fmt)
        if pos is not None:
            self.file_stream.seek(pos)
        return struct.unpack_from(self.endian_format_prefix + fmt,
                                  self.file_stream.read(format_size))


and here an example for a segment class that uses a FileReader 
instance (file_stream):


class LSGSegement():

    def __init__(self, file_stream):
        self.file_stream = file_stream
        self.lsg_root_element = None
        self._read_lsg_root()

    def _read_lsg_root(self):
        fmt = "80Bi"
        raw_data = self.file_stream.read_data(fmt)
        self.lsg_root_element = LSGRootElement(raw_data[:79],
                                               raw_data[79])


So, now I wonder what would be a good pythonic way to model the
FileReader class. Maybe use global functions to avoid passing the
FileReader object around? Or something like a "Singleton", which I've
heard about but never used? Or keep it like that?


Cheers,

Jan



I agree with Luke's advice, but would add some comments.

As soon as you have a global (or a singleton) representing a file, 
you're making the explicit assumption that you'll never have two such 
files open.  So what happens if you need to merge two such files?  Start 
over?  You need to continue to pass something representing the file 
(JTFile object) into each constructor.
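
A stripped-down sketch of why passing the reader matters. Reader and Header are stand-ins invented for this example (they are not the poster's JTFile or segment classes), the single-integer "header" layout is made up, and little-endian byte order is assumed.

```python
import io
import struct

class Reader(object):
    """Stand-in for the poster's JTFile: wraps one binary stream."""
    def __init__(self, stream, prefix="<"):
        self.stream = stream
        self.prefix = prefix            # "<" = little-endian, assumed here

    def read_data(self, fmt):
        full = self.prefix + fmt
        return struct.unpack(full, self.stream.read(struct.calcsize(full)))

class Header(object):
    """Hypothetical segment: one 32-bit version number."""
    def __init__(self, reader):
        (self.version,) = reader.read_data("i")

# Two files open at once; each segment reads from the reader it was given.
first = Header(Reader(io.BytesIO(struct.pack("<i", 8))))
second = Header(Reader(io.BytesIO(struct.pack("<i", 9))))
print(first.version, second.version)
```

With a global or singleton reader, the second Header would have had no way to say which file it belongs to.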


The real question is one of state, which isn't clear from your example.  
The file_stream attribute of an object of class JTFile has a file 
position, which you are implicitly using.  But you said some segments are
at fixed positions in the file, and presumably some are serially related 
to other segments. Or perhaps some segments are really a section of the 
file containing smaller segments of different type(s).


Similarly, each object, after being created, probably has relationship 
to other objects.  Without knowing that, you can't design those object 
classes.


Finally, you need to decide early on what to do about data validation.
If the file happens to be busted, how are you going to notify the user?
If you read it in an ad-hoc, random order, you'll have a very hard time
telling the user anything useful about what's wrong with it, never
mind recovering from it.


It's really a problem in serialization, where you read a file by 
deserializing.  Consider whether the file is going to be always small 
enough to support simply interpreting the entire stream into a tree of 
objects, and then dealing with them.  Conceivably you can do that 
lazily, only deserializing objects as they are referenced.  But the 
possibility of doing that depends highly on whether there is what 
amounts to a "directory" in the file, or whether each object's position 
is determined by the length of the previous one.


In addition to deserializing in one pass, or lazily deserializing, 
consider deserializing with callbacks. In this approach you do not 
necessarily keep the intermediate objects, you just call a specified 
user routine, who should keep the objects if she cares about them, or 
process them or ignore them as needed.
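
The callback style might look roughly like this. The record layout (a 1-byte tag, a 2-byte little-endian length, then the payload) is invented purely for the illustration and has nothing to do with the real file specification; read_records and keep_text are likewise hypothetical names.

```python
import io
import struct

def read_records(stream, callback):
    """Read (tag, length, payload) records and pass each one to callback."""
    header = struct.Struct("<BH")           # 1-byte tag, 2-byte payload length
    while True:
        raw = stream.read(header.size)
        if len(raw) < header.size:
            break                           # end of stream
        tag, length = header.unpack(raw)
        callback(tag, stream.read(length))  # the caller decides what to keep

kept = []
def keep_text(tag, payload):
    if tag == 1:                            # ignore every other record type
        kept.append(payload)

data = struct.pack("<BH", 1, 2) + b"hi" + struct.pack("<BH", 7, 3) + b"xyz"
read_records(io.BytesIO(data), keep_text)
print(kept)
```

The reader itself stores nothing, so memory use depends only on what the callback chooses to keep.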


I've had to choose each of these approaches for different projects, and 
the choice depended in large part on the definition of the data file, 
and whether it could be randomly accessed.


DaveA



Re: [Tutor] Help

2010-05-15 Thread Dave Angel

she haohao wrote:
> Hi,
>
> I have some questions that I am unable to figure out. 
>
> Let say I have a file name peaks.txt.
>
> Chr17   9  4.5 5.5
> chr10   6   9  3.5 4.5
> chr1 10 6  2.5 4.4
>
> Question is how can i sort the file so that it looks like this:
>
>
>
> Chr17   9  4.5 5.5
> chr1 10 6  2.5 4.4
> chr10   6   9  3.5 4.5
>
> Next is how do I extract out the p-values(those highlighted in red)
>
> After I extracted out all the p-values. for example all the p-values from 
> chr1 is 6,7,9,10 and for chr10 are 6 and 9.
>
> So for example if the p-value is 7 from chr1, i would open out a file called 
> chr1.fa which look like this:
>
>
> >chr1
> ATTGTACT
> ATTTGTAT
> ATTCGTCA
>
> and I will extract out the subsequence TACTA. Basically p-value(in this case 
> its 7) position counting from second line of the chr1.fa file and print out 
> the subsequence from starting from position 7-d and 7+d, where d=2. Thus if 
> the p-values is taken from chr10 then we read from the a file with file name 
> chr10.fa which can look like like:
>
> chr10
> TTAGTACT
> GTACTAGT
> ACGTATTT
>
> So the question is how do I do this for all the p-values.(i.e all the 
> p-values from chr1 and all the p-values from chr10) if let say we dont know 
> peaks.txt files have how many lines.
>
> And how do i output it to a file such that it will have the following format:
>
> Chr1
>
> peak value 6: TTGTA
>
> peak value 7: TACTA
>
> etc etc for all the p-values of chr1
>
> chr10
>
> peak value 7: TTACT
>
> etc etc etc...
>
>
> thanks for the help,
> Angeline
>
>
>   
Red has no meaning in a text message, which is what this list is
comprised of.

What does your code look like now? Where are you stuck?

str.split() can be used to divide a line up by whitespace into "words".
So if you split a line (string), you get a list. You can use [] to
extract specific items from that list.

The first item in that list is your key, so you can then put it into a
dictionary. Don't forget that a dictionary doesn't allow dups, so when
you see the dictionary already has a match, append to it, rather than
replacing it.
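
To show the mechanics only, here is a sketch on made-up lines (the real column layout of peaks.txt is not clear from the post, so the sample data and the function name group_by_first_word are assumptions). It is written for Python 3.

```python
def group_by_first_word(lines):
    """Map the first word of each line to a list of the remaining words."""
    groups = {}
    for line in lines:
        words = line.split()
        if not words:
            continue                      # skip blank lines
        key = words[0].lower()            # treat "Chr1" and "chr1" alike (an assumption)
        # setdefault() creates the list the first time a key is seen
        groups.setdefault(key, []).append(words[1:])
    return groups

sample = ["chr1 7 4.5 5.5", "chr10 6 3.5 4.5", "Chr1 9 2.5 4.4"]
print(group_by_first_word(sample))
```

Each key ends up holding every row that started with it, which is the "append rather than replace" behaviour described above.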

DaveA

Re: [Tutor] Help

2010-05-15 Thread Dave Angel

(You forgot to post to the list. Normally, you can just do a reply-all
to get both the list, and the person who last responded. You also
top-posted, rather than putting your new message at the end. I'll now
continue at the end.)

she haohao wrote:
> I am stuck because i dont know how do i extract all the p values and how do i 
> sort the file and how i open the respective file. Thanks for helping
>
>   
>> Date: Sat, 15 May 2010 19:58:33 -0400
>> From: da...@ieee.org
>> To: einstein...@hotmail.com
>> CC: tutor@python.org
>> Subject: Re: [Tutor] Help
>>
>> she haohao wrote:
>> 
>>> Hi,
>>>
>>> I have some questions that I am unable to figure out. 
>>>
>>> Let say I have a file name peaks.txt.
>>>
>>> Chr17   9  4.5 5.5
>>> chr10   6   9  3.5 4.5
>>> chr1 10 6  2.5 4.4
>>>
>>> Question is how can i sort the file so that it looks like this:
>>>
>>>
>>>
>>> Chr17   9  4.5 5.5
>>> chr1 10 6  2.5 4.4
>>> chr10   6   9  3.5 4.5
>>>
>>> Next is how do I extract out the p-values(those highlighted in red)
>>>
>>> After I extracted out all the p-values. for example all the p-values from 
>>> chr1 is 6,7,9,10 and for chr10 are 6 and 9.
>>>
>>> So for example if the p-value is 7 from chr1, i would open out a file 
>>> called chr1.fa which look like this:
>>>
>>>
>>> >chr1
>>> ATTGTACT
>>> ATTTGTAT
>>> ATTCGTCA
>>>
>>> and I will extract out the subsequence TACTA. Basically p-value(in this 
>>> case its 7) position counting from second line of the chr1.fa file and 
>>> print out the subsequence from starting from position 7-d and 7+d, where 
>>> d=2. Thus if the p-values is taken from chr10 then we read from the a file 
>>> with file name chr10.fa which can look like like:
>>>
>>> chr10
>>> TTAGTACT
>>> GTACTAGT
>>> ACGTATTT
>>>
>>> So the question is how do I do this for all the p-values.(i.e all the 
>>> p-values from chr1 and all the p-values from chr10) if let say we dont know 
>>> peaks.txt files have how many lines.
>>>
>>> And how do i output it to a file such that it will have the following 
>>> format:
>>>
>>> Chr1
>>>
>>> peak value 6: TTGTA
>>>
>>> peak value 7: TACTA
>>>
>>> etc etc for all the p-values of chr1
>>>
>>> chr10
>>>
>>> peak value 7: TTACT
>>>
>>> etc etc etc...
>>>
>>>
>>> thanks for the help,
>>> Angeline
>>>
>>>
>>>   
>>>   
>> Red has no meaning in a text message, which is what this list is
>> comprised of.
>>
>> What does your code look like now? Where are you stuck?
>>
>> str.split() can be used to divide a line up by whitespace into "words".
>> So if you split a line (string), you get a list. You can use [] to
>> extract specific items from that list.
>>
>> The first item in that list is your key, so you can then put it into a
>> dictionary. Don't forget that a dictionary doesn't allow dups, so when
>> you see the dictionary already has a match, append to it, rather than
>> replacing it.
>>
>> DaveA
>> 

I didn't offer to write it for you, but to try to help you fix what
you've written. When you have written something that sort-of works,
please post it, along with a specific question about what's failing.

sort() will sort data, not files. If you read in the original data with
readlines(), and sort() that list, it'll be sorted by the first few
characters of each line. Note that may not be what you mean by sorted,
since Chr10 will come before Chr2. Still it'll put lines of identical
keys together. After you sort the lines, you can create another file
with open(..."w") and use writelines(). Don't forget to close() it.

But you probably don't want it sorted, you want a dictionary. Of course,
if it's an assignment, then it depends on the wording of the assignment.

open() and read() will read data from a file, any file.
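
A minimal sketch of both ideas, written for Python 3 and using made-up sample lines rather than the real peaks.txt (the names group_by_key and sample are hypothetical). It shows why a plain sort puts chr10 ahead of chr2, and how a dictionary can collect every line that shares the same first word:

```python
def group_by_key(lines):
    """Map the first word of each line to a list of the remaining fields."""
    groups = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        key = fields[0]
        # Append to an existing entry instead of replacing it.
        groups.setdefault(key, []).append(fields[1:])
    return groups


# Hypothetical sample data, not taken from the real file.
sample = [
    "chr2 8 1.5 2.5",
    "chr10 6 3.5 4.5",
    "chr1 7 2.5 4.4",
    "chr1 6 2.0 3.0",
]

sorted_lines = sorted(sample)   # plain string comparison, so chr10 precedes chr2
groups = group_by_key(sample)   # groups["chr1"] holds both chr1 lines
```

Whether the sorted list or the dictionary is the right tool depends on what the assignment actually asks for.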


DaveA

___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor

Re: [Tutor] PYTHON 3.1

2010-05-18 Thread Dave Angel

(Please don't top-post.  Add your comments to the end of the portion 
you're quoting.)


Dipo Elegbede wrote:

thanks a lot.

i was almost going to abandon this python again out of frustration. i have
done it before but with you guys around, it would never happen again.

i have a pdf version of python programming for absolute beginners, could
anyone please help me with its accompanying CD content?

thanks as i anticipate responses.

regards.
  
  

I don't know the version that your CD was written for.

If you're going to use a tutorial, it's smart to get a matching version 
of Python.  So if your tutorial is for 2.x, you should get Python 2.6 
(or soon, 2.7).  Otherwise, you'll be frequently frustrated by the 
differences.


They're not that bad, once you know the language.  But while you're 
learning, try to match your learning materials with your version.


DaveA


Re: [Tutor] Unit testing command-line options from argparse or optparse

2010-05-19 Thread Dave Angel


Serdar Tumgoren wrote:

Hello all,

Does anyone have advice for writing unit tests against variables set by
command-line options?

I have a program I'd like to run in either "debug" or "live" mode, with
various settings associated with each. Below is some pseudo-code that shows
what I'd like to do:

<>
mode = p.parse_args() #always set to either --debug or --live

if mode.live:
    recipients = ['jsm...@email.com', 'jane...@email.com']
    # set logging to a file
elif mode.debug:
    recipients = ['ad...@admin.com']
    # log to stdout

The "live" and "debug" attributes are set by command-line flags passed to
the argparse module. What I'd like to do is write tests that check whether
various settings (recipients, logging, etc.) are configured properly based
on the command-line options.

But if "mode" is not set until runtime, I clearly can't import it into my
suite of unit tests, right? Is there some standard testing approach to this
problem (perhaps mocking?) that you all can recommend?

I'd greatly appreciate it.
Serdar

  
I don't see the problem.  If 'mode' is a global in module  doit.py, then 
just use

   import doit
and later,
   if doit.mode.debug:

The only tricky thing is to make sure that the initialization code has 
been run before you actually use the latter code.


And, presuming mode is an object of a special class made for the 
purpose, you could go a bit further.  You could create the object (with 
default attributes) in the top-level code of doit.py.  That way it 
already exists when the import is finished.  And you could then use

   from doit import mode

Now, change the semantics of parse_args() so that it takes the mode 
object as a parameter, and modifies it according to the arguments.  As 
long as you don't reassign mode itself, the test code could continue to 
use its own "variable".  But once again, don't use the mode.debug until 
the initialization has been done.
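
Here is a rough sketch of that arrangement, assuming Python 3 and argparse. The Mode class, the recipients_for() helper and the example.com addresses are all hypothetical names made up for illustration; only the idea of a module-level object that parse_args() updates in place comes from the description above.

```python
import argparse


class Mode(object):
    """Hypothetical settings holder, created with defaults at import time."""

    def __init__(self):
        self.debug = False
        self.live = False


# Exists as soon as the module has been imported.
mode = Mode()


def parse_args(mode, argv):
    """Update the existing mode object in place; never rebind it."""
    parser = argparse.ArgumentParser()
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("--debug", action="store_true")
    group.add_argument("--live", action="store_true")
    options = parser.parse_args(argv)
    mode.debug = options.debug
    mode.live = options.live
    return mode


def recipients_for(mode):
    """Pick settings from the mode object (addresses are placeholders)."""
    if mode.live:
        return ["ops@example.com"]
    return ["admin@example.com"]
```

A test can then build its own Mode(), call parse_args(test_mode, ["--debug"]) with whatever argument list it likes, and check the resulting settings without touching sys.argv.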


DaveA


Re: [Tutor] 2d list index inverting?

2010-05-26 Thread Dave Angel


Hugo Arts wrote:

On Wed, May 26, 2010 at 3:13 AM, Alex Hall  wrote:
  

Hello all,
I have a 2d list being used for a battleship game. I have structured
the program so that it uses a grid class, which implements this array
along with a bunch of other methods and vars. For example, to get at
the top left square, you would say:
Grid.getSquareAt(0,0)
and inside getSquareAt is simply:
 def getSquareAt(self, x, y):
     return self.b[x][y] #b is the internal 2d list for the class

However, I am getting very confused with indexing. I keep getting
errors about list index out of range and I am not sure why. I have a
feeling that using 2d lists is supposed to go like a matrix
(row,column) and not like a coordinate plane (column, row).



A 2D list doesn't really exist. What you're using is just a list whose
elements are also lists. A nested data structure. And whether those
sub-lists should be the rows or the columns? It doesn't matter. A list
is just a list. Sequential data elements. It doesn't care whether it
represents a row or a column. What are 'row' and 'column' anyway? just
words designating some arbitrary notion. Conventions. You can swap one
for the other, and the data remains accessible. As long as you're
consistent, there's no problem.

The real problem is something else entirely. Somewhere in your code,
you are using an index that is greater than the size of the list.
Perhaps you're not consistent, somewhere. Mixing up your row/column
order. Perhaps something else is amiss. No way to tell from the
snippet.

Hugo

  
My question would be how are you creating these lists, and how you're 
updating them.  If you're doing a 20x20 board, are you actually creating 
20 lists, each of size 20, in instance attribute b ?  Do you do that in 
the __init__() constructor?  Or are you doing some form of "sparse 
array" where you only initialize the items that have ships in them?


As for updating, are you always doing the update by assigning to 
self.b[row][col]  ?


You could always add a try/except to the spot that crashes, and in the 
except clause, temporarily print out the subscripts you're actually 
seeing.  As Hugo says, you could simply have values out of range.
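
A small sketch of what I mean, written for Python 3. This Grid is a guess at the shape of the class (the rows and cols parameters are my assumption, not taken from the posted code); it builds every row up front and reports the offending subscripts before re-raising the error:

```python
class Grid(object):
    def __init__(self, rows=20, cols=20):
        # One independent list per row. Note that [[None] * cols] * rows
        # would instead repeat the *same* inner list rows times.
        self.b = [[None] * cols for _ in range(rows)]

    def getSquareAt(self, row, col):
        try:
            return self.b[row][col]
        except IndexError:
            # Temporary debugging aid: show what was actually asked for.
            print("bad subscripts: row=%r col=%r" % (row, col))
            raise
```

Keep in mind that negative subscripts don't raise IndexError until they run off the other end of the list, so a wrong-but-negative index will silently return some other square.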



DaveA


Re: [Tutor] parse text file

2010-06-03 Thread Dave Angel


Colin Talbert wrote:


You are so correct.  I'd been trying numerous things to read in this file 
and had deleted the code that I meant to put here and so wrote this from 
memory incorrectly.  The code that I wrote should have been:


import bz2
input_file = bz2.BZ2File(r'C:\temp\planet-latest.osm.bz2','rb')
str=input_file.read()
len(str)

Which indeed does return only 90.

Which is also the number returned when you sum the length of all the lines 
returned in a for line in file with:



import bz2
input_file = bz2.BZ2File(r'C:\temp\planet-latest.osm.bz2','rb')
lengthz = 0
for uline in input_file:
    lengthz = lengthz + len(uline)

print lengthz


  

Seems to me for such a large file you'd have to use 
bz2.BZ2Decompressor.  I have no experience with it, but its purpose is 
for sequential decompression -- decompression where not all the data is 
simultaneously available in memory.
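
As a sketch only (I haven't run this against a file of that size), sequential decompression might look like the following. It assumes Python 3.3 or later, where BZ2Decompressor has the eof and unused_data attributes; the function name and chunk size are arbitrary choices of mine. It also starts a fresh decompressor whenever one compressed stream ends, in case the file consists of several streams back to back.

```python
import bz2


def decompressed_size(path, chunk_size=1024 * 1024):
    """Count decompressed bytes without holding the whole file in memory."""
    total = 0
    decomp = bz2.BZ2Decompressor()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            while chunk:
                total += len(decomp.decompress(chunk))
                if decomp.eof:
                    # This stream is finished; leftover bytes may start another.
                    chunk = decomp.unused_data
                    decomp = bz2.BZ2Decompressor()
                else:
                    chunk = b""
    return total
```

To process the data rather than just measure it, replace the len() call with whatever work needs doing on each decompressed piece.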


DaveA


Re: [Tutor] Misc question about scoping

2010-06-04 Thread Dave Angel


Alan Gauld wrote:



flag = True if (someValue or another) else False

is different to

flag = someValue or another

Which was why I thought it worth pointing out that the if/else
could be used.



I'd prefer the form:

  flag = not not (someValue or another)

if I needed real True or False result.


DaveA


Re: [Tutor] Misc question about scoping

2010-06-04 Thread Dave Angel


Hugo Arts wrote:

On Fri, Jun 4, 2010 at 1:23 PM, Dave Angel  wrote:
  

I'd prefer the form:

 flag = not not (someValue or another)




That's a construct you might commonly find in languages like C, but I
don't think it's very pythonic. If you want to convert your result to
a bool, be explicit about it:

flag = bool(some_value or another)

I'd agree that the if/else construct is redundant if you just want a
True/False result, but the double not is a kind of implicit type
conversion that is easily avoided.

Hugo

  
I'd certainly agree that bool() is better than not not.  And I admit a C 
background.
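
For anyone following along, a quick illustration of the three spellings discussed in this thread. The values are made up for the example:

```python
some_value = ""          # falsy
another = [1, 2]         # truthy

raw = some_value or another                    # yields one of the operands
flag_not_not = not not (some_value or another)
flag_bool = bool(some_value or another)
flag_if_else = True if (some_value or another) else False
```

Here raw is the list itself, while the other three are the real True object, which is the difference Alan was pointing out.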


DaveA


Re: [Tutor] no. of references

2010-06-08 Thread Dave Angel


Payal wrote:

Hi,
If I have a list (or a dict), is there any way of knowing how many other
variables are referencing the same object?

With warm regards,
-Payal
  

Depends on what you mean by variables.

Try   sys.getrefcount(mylist)


Naturally, the count will be one higher than you expect.  And you should 
only use this for debugging purposes.
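
A short sketch for experimenting, assuming CPython (the exact numbers are an implementation detail and can differ between versions, so none are shown here):

```python
import sys

mylist = [1, 2, 3]
count_one_name = sys.getrefcount(mylist)

alias = mylist                        # a second name bound to the same object
count_two_names = sys.getrefcount(mylist)

container = {"key": mylist}           # a dict entry also holds a reference
count_with_dict = sys.getrefcount(mylist)
```

Comparing the counts before and after each step is more informative than any single value, since the call itself temporarily holds a reference to its argument.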


DaveA


Re: [Tutor] Looking for duplicates within a list

2010-06-11 Thread Dave Angel


Ken G. wrote:
I have been working on this problem for several days and I am not 
making any progress.  I have a group of 18 number, in ascending order, 
within a list.  They ranged from 1 to 39.  Some numbers are duplicated 
as much as three times or as few as none.


I started with one list containing the numbers.  For example, they are 
listed as like below:


a = [1, 2, 3, 3, 4]

I started off with using a loop:

   for j in range (0, 5):
       x = a[0] # for example, 1

How would I compare '1' with 2, 3, 3, 4?
Do I need another duplicated list such as b = a and compare a[0] with 
either b[0], b[1], b[2], b[3], b[4]?


Or do I compare a[0] with a[1], a[2], a[3], a[4]?

In any event, if a number is listed more than once, I would like to 
know how many times, such as 2 or 3 times.  For example, '3' is listed 
twice within a list.


TIA,

Ken

I'm a bit surprised nobody has mentioned the obvious solution -- another 
list of size 40, each element of which holds the number of times a 
particular number has appeared.


(Untested)


a = [1, 2, 3, 3, 4]
counts = [0] * 40
for item in a:
    counts[item] += 1

Now, if you want to know how many times 3 appears, simply
  print counts[3]

The only downside to this is if the range of possible values is large, 
or non-numeric.  In either of those cases, go back to the default 
dictionary.
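
For completeness, a sketch of the dictionary version, assuming Python 3 and collections.defaultdict from the standard library. The function name count_items is just something I made up for the example:

```python
from collections import defaultdict


def count_items(items):
    """Return a mapping of item -> number of occurrences."""
    counts = defaultdict(int)   # missing keys start at 0
    for item in items:
        counts[item] += 1
    return counts


a = [1, 2, 3, 3, 4]
counts = count_items(a)
# Keep only the values that appear more than once.
duplicates = dict((item, n) for item, n in counts.items() if n > 1)
```

This works for any hashable values, not just small integers, at the cost of a little more machinery than the fixed-size list.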


DaveA


Re: [Tutor] New to Programming

2010-06-12 Thread Dave Angel


Kaushal Shriyan wrote:

Hi,

I am absolutely new to programming language. Dont have any programming
experience. Can some one guide me please. is python a good start for
novice.

Thanks,

Kaushal

  

Like nearly all questions, the answer is "it depends."

Mainly, it depends on what your goal is.  In my case, I made my living 
with programming, for many years.  And in the process, learned and used 
about 35 languages, plus a few more for fun.  I wish I had discovered 
Python much earlier, though it couldn't have been my first, since it 
wasn't around.  But it'd have been much better than Fortran was, for 
learning.


So tell us about your goals.  Abstract knowledge, console utilities, gui 
development, games, web development, networking communication, ...


Next, you might want to evaluate what you already know.  There are a lot 
of non-programming things that a programmer needs to understand.  If you 
already know many of them, that's a big head start.  If you already know 
how to administer a Linux system, you're already a programmer and didn't 
know it.  If you write complex formulas for Excel, you're a programmer.  
If you already know modus ponens, and understand what a contrapositive 
is, you've got a head start towards logic (neither is a programming 
subject, just a start towards logical thinking).  If you've worked on a 
large document, and kept backups of  incremental versions, so you could 
rework the current version based on earlier ones, that's a plus.  If you 
know why a file's timestamp might change when you copy it from hard disk 
to a USB drive and back again, you've got a head start.  If you know why 
it might have a different timestamp when you look at it six months from 
now without changing it, you've got a head start.


If you're using Windows and never used a command prompt, you have a ways 
to go.  If you don't know what a file really is, or how directories are 
organized, you have a ways to go.  And if you think a computer is 
intelligent, you have a long way to go.


Python is a powerful tool.  But if you're totally new to programming, it 
can also be daunting.  And most people have no idea how easy some 
programs are, nor how hard some other programs are, to build.


In any case, some of the things recommending Python as a first language are:
  1) an interactive interpreter - you can experiment, trivially
  2) very fast turnaround, from the time you make a change, till you 
can see how it works.  This can be true even for large programs

  3) this mailing list

DaveA


Re: [Tutor] string encoding

2010-06-18 Thread Dave Angel

Rick Pasotto wrote:

I can print the string fine. It's f.write(string_with_unicode) that fails with:

UnicodeEncodeError: 'ascii' codec can't encode characters in position 31-32: 
ordinal not in range(128)

Shouldn't I be able to f.write() *any* 8bit byte(s)?

repr() gives: u"Realtors\\xc2\\xae"

BTW, I'm running python 2.5.5 on debian linux.

You can write any 8 bit string.  But you have a Unicode string, which is 
16 or 32 bits per character.  To write it to a file, it must be encoded, 
and the default encoder is ASCII.  The cure is to encode it yourself, 
using the encoding that your spec calls for.  I'll assume utf8 below:

>>> name = u"Realtors\xc2\xae"
>>> repr(name)
"u'Realtors\\xc2\\xae'"
>>> outfile = open("junk.txt", "w")
>>> outfile.write(name)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 8-9: ordinal not in range(128)
>>> outfile.write(name.encode("utf8"))
>>> outfile.close()
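
The same idea as a standalone script, as a sketch: encode first, then write the resulting bytes to a file opened in binary mode. This form should behave the same on Python 2.6+ and Python 3. The temporary directory is only there so the example doesn't clutter anything, and utf8 is still my assumption about the encoding you need.

```python
import os
import tempfile

name = u"Realtors\xc2\xae"          # the same text as in the session above
encoded = name.encode("utf8")       # now a byte string, safe to write

path = os.path.join(tempfile.mkdtemp(), "junk.txt")  # hypothetical location
outfile = open(path, "wb")          # binary mode: we supply the bytes ourselves
try:
    outfile.write(encoded)
finally:
    outfile.close()

infile = open(path, "rb")
try:
    round_trip = infile.read().decode("utf8")
finally:
    infile.close()
```

Reading the file back and decoding with the same encoding should give the original Unicode string again.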

DaveA


Re: [Tutor] Question

2010-06-19 Thread Dave Angel


Alan Gauld wrote:
"Independent Learner"  wrote
~I was wondering if I should try to learn 2 programming languages at 
once, Python and C++. 


No, no no! If it had been a different pair I might have said try it. 
But C++ is one of the most difficult and complex 
programming languages out there. It is full of subtle things that can 
trip you up and cause very weird and subtle bugs that are difficult 
to find. And it has similar concepts to Python but implemented so 
entirely differently that studying the two together will be an 
exercise in frustration.


Part of the reason why C++ is so difficult is because it is so 
powerful. You have full access to the machine through the C language 
elements, plus a full OOP environment, plus a powerful generic type 
system. Plus it combines static and dynamic variables with a reference 
model all with slightly different syntax and semantic behaviours.


At work I hardly ever recommend that people go on language training 
courses, C++ is the exception! You can learn C++ by yourself but you 
will need a good book and a lot of time and patience.


Obviously I am working on learning python right now, I have gotten up 
to Classes


Stick with Python and get comfortable with that.


I concur 100% with Alan's advice to learn one language thoroughly first, 
and I think Python is the one that would have best suited me as a first 
language, if it had been available.  Python was approximately number 30 
for me, not counting the languages I learned entirely for personal 
reasons.  Currently on the job I mostly use C++, Python, and Perl.



C++ was indeed the hardest language for me to learn, but was the most 
rewarding for its time.  It's the only language I went to classes for, 
and one of the lectures was taught by Bjarne Stroustrup.  Fortunately, I 
had a coworker on the ANSI language standardization committee, and 
access to lots of other people who knew it rather well.


Once you have one or two languages very comfortably under your belt, 
then go ahead and learn anything else you like. But even then, if you 
have to learn two new ones at the same time, I'd recommend they be very 
unlike.  So you could learn Lisp or Forth at the same time as you were 
learning Ruby, but I'd not try to learn Perl and Python at the same 
time.  (Actually, Perl is driving me crazy at the moment, and I'm only 
using it because we have a series of large scripts written in it.)


I drive a motorcycle for my main transportation, and my wife's van 
otherwise.  And I never get confused between the controls on one and on 
the other.  But at one time I had a Harley with the brake on the left, 
and gearshift on the right (ie. normal at the time), and switching back 
and forth between that motorcycle and others almost caused an accident.  
When learning to ride, I can't imagine using two different motorcycles 
with swapped controls.


Good luck, and post questions here when you have them.

DaveA

Re: [Tutor] Time

2010-06-22 Thread Dave Angel


Ahmed AL-Masri wrote:
Hi, 
I would calculate the running time of my simulation code.

any one know how to do that?

example

def demo():
    ### the starting point of time should be 0
    f.simulate(data)
    ### the end of the class so need to find the time in Sec.?
    ### print time in sec.
if __name__ == '__main__':
    demo()

look forward to seeing the answer,

Thanks a lot,
A. Naufal
  


If you're really looking to measure performance, you should use the 
timeit module.  But for simply deciding how much time has elapsed 
between two points in your code, you can use the time.time() function.


import time

start = time.time()
# ...  do some work
elapsed = time.time() - start   # elapsed time in seconds, as a float
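
A fuller sketch showing both approaches side by side, assuming Python 3. The simulate() function here is a stand-in I invented for your real f.simulate(data), and the repeat counts are arbitrary:

```python
import time
import timeit


def simulate(n=10000):
    """Placeholder workload standing in for the real simulation."""
    return sum(i * i for i in range(n))


# Simple wall-clock measurement of one run.
start = time.time()
simulate()
elapsed = time.time() - start

# timeit: run the function 10 times per trial, 3 trials, keep the best.
trials = timeit.repeat(simulate, number=10, repeat=3)
best_per_call = min(trials) / 10

print("one run took %.6f seconds" % elapsed)
print("best average per call: %.6f seconds" % best_per_call)
```

Taking the minimum over several trials is the usual way to reduce noise from whatever else the machine happens to be doing.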


DaveA

Re: [Tutor] Repeat function until...

2010-06-23 Thread Dave Angel




Steven D'Aprano wrote:

On Wed, 23 Jun 2010 10:29:11 pm Nethirlon . wrote:
  

Hello everyone,

I'm new at programming with python and have a question about how I
can solve my problem the correct way. Please forgive my grammar,
English is not my primary language.

I'm looking for a way to repeat my function every 30 seconds.




The easiest way is to just run forever, and stop when the user 
interrupts it with ctrl-C:


# untested
def call_again(n, func, *args):
    """call func(*args) every n seconds until ctrl-C"""
    import time
    try:
        while 1:
            start = time.time()
            func(*args)
            # never pass a negative delay to sleep()
            time.sleep(max(0, n - (time.time()-start)))
    except KeyboardInterrupt:
        pass

Of course, that wastes a lot of time sleeping.

  
But "wasting time" was the stated goal.  If there's nothing else the 
application needs to do, sleep() is perfect.  I'm sure you know, but 
maybe some others don't:  sleep() uses essentially no CPU time, so the 
other applications on the system get all the performance.
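
If the loop should end on a condition rather than on a keypress, a variation along these lines might do. It is a sketch for Python 3; repeat_until and its done and sleep parameters are names I made up, and passing the sleep function in is only there to make the loop easy to test.

```python
import time


def repeat_until(func, done, interval=30, sleep=time.sleep):
    """Call func() every `interval` seconds until done(result) is true.

    Returns the final result and the number of calls made.
    """
    calls = 0
    while True:
        start = time.time()
        result = func()
        calls += 1
        if done(result):
            return result, calls
        # Sleep for whatever is left of the interval, never a negative time.
        sleep(max(0, interval - (time.time() - start)))
```

For instance, repeat_until(check_server, lambda ok: ok) would poll a hypothetical check_server() every 30 seconds until it returns something true.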


As an alternative, you need to look at threads. That's a big topic, you 
probably need to read a How To or three on threads.


  

Only if there's other things that need doing in the same application.


DaveA

Re: [Tutor] Use flag to exit? (OT - PEP 8 Gripe)

2010-06-25 Thread Dave Angel


Richard D. Moores wrote:

On Fri, Jun 25, 2010 at 09:48, Emile van Sebille  wrote:
  

On 6/25/2010 9:08 AM Steve Willoughby said...


On 25-Jun-10 08:23, Emile van Sebille wrote:
  

On 6/25/2010 1:33 AM ALAN GAULD said...


Copy and pasting is a PITA.


Why would you want to copy and paste?
  

Because it makes it easy to work on code. My preferred editor (TextPad)
allows block selection of indented text such that I can copy and paste
functions and methods into the python CLI and work with them there. I


If what you're trying to do is a PITA, that should raise two questions:

  

Points taken, but understand that the OP termed it a PITA, from which Alan
asked why cut 'n paste.  I don't find it a PITA at all and am quite
comfortable with my approach (editor choice being a religious issue after
all).  My gripe is only that complying with PEP 8 is not possible.



I'm the OP. What I called a PITA is copy and pasting using the Windows
command line.

Dick

  
Do you activate Quick-Edit mode in your DOS box?  Once you have that on, 
it's not much of a pain to copy and paste, as long as what you're 
copying fits a rectangle.


DaveA

Re: [Tutor] Use flag to exit? (OT - PEP 8 Gripe)

2010-06-25 Thread Dave Angel


Richard D. Moores wrote:

On Fri, Jun 25, 2010 at 23:08, Dave Angel  wrote:
  

Do you activate Quick-Edit mode in your DOS box?  Once you have that on,
it's not much of a pain to copy and paste, as long as what you're copying
fits a rectangle.



Yes, it's on. I agree that copying is easy, but if I can't use Ctrl+V,
pasting is a pain. Tell me your secret.

Dick

  
If you right-click on a DOS box without anything selected, then it 
pastes.  (Assuming quickedit mode)


For example, you run a command that displays a string, and you want to 
use that string as your next command:


left-drag over the desired text.
Right click to copy to clipboard
Right click again to paste into current cursor position

DaveA


Re: [Tutor] Use flag to exit? (OT - PEP 8 Gripe)

2010-06-26 Thread Dave Angel


Richard D. Moores wrote:

On Fri, Jun 25, 2010 at 23:42, Dave Angel  wrote:
  

Richard D. Moores wrote:


On Fri, Jun 25, 2010 at 23:08, Dave Angel  wrote:

  

Do you activate Quick-Edit mode in your DOS box?  Once you have that on,
it's not much of a pain to copy and paste, as long as what you're copying
fits a rectangle.



Yes, it's on. I agree that copying is easy, but if I can't use Ctrl+V,
pasting is a pain. Tell me your secret.

Dick


  

If you right-click on a DOS box without anything selected, then it pastes.
 (Assuming quickedit mode)

For example, you run a command that displays a string, and you want to use
that string as your next command:

left-drag over the desired text.
Right click to copy to clipboard
Right click again to paste into current cursor position



Hey, terrific! Thanks, Dave! I didn't see any help available. Where'd
you learn this?

Dick

  
I really don't recall where I learned it, but I do remember 
approximately when.  It was about 1994, with Windows NT version 3.5



DaveA


Re: [Tutor] the ball needs a kick...

2010-07-06 Thread Dave Angel


Schoap D wrote:

Hi,

I'm doing the exercises here: chapter 8
http://www.openbookproject.net/thinkCSpy/ch08.html

Now I have added another paddle to the pong game. So far so good, but the
ball isn't moving anymore and I am not able to fix it...
Any comments, tips, feedback?

Thanks in advance,

http://paste.pocoo.org/show/233739/


Dirk

  
First thing to do is a file diff with the last version that worked 
properly.  You do that by going into your version control system.  
Chances are you made one more change than you realized you were making.


DaveA

Re: [Tutor] extract a submatrix

2010-07-12 Thread Dave Angel




Bala subramanian wrote:

Friends,
Excuse me if this question is not appropriate for this forum. I have a
matrix of size 550,550. I want to extract only part of this matrix say first
330 elements, i dnt need the last 220 elements in the matrix. is there any
function in numpy that can do this kind of extraction. I am quite new to
numpy. How can do the same ?

Thank you,
Bala

  
I don't know numpy, and it probably would be better to use that forum.  
But there are several people here who do, and one of them will probably 
help.


However, I would point out that if you fetch the first 330 elements of a 
550x550 matrix, you'll have 302170 elements left.
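
To make the arithmetic concrete, and to show one plausible reading of the request, here is a sketch using plain nested lists in Python 3. I'm assuming "first 330" really means the leading 330 rows and 330 columns; the fill values are arbitrary.

```python
n = 550
keep = 330

total_elements = n * n                 # 302500
left_over = total_elements - keep      # what remains after taking 330 elements

# A 550x550 "matrix" as a list of rows, filled with arbitrary values.
matrix = [[r * n + c for c in range(n)] for r in range(n)]

# Leading 330x330 block: first 330 rows, and the first 330 entries of each.
sub = [row[:keep] for row in matrix[:keep]]
```

As far as I know the NumPy spelling of the same block is a slice like a[:330, :330], but the numpy folks can confirm.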


DaveA


Re: [Tutor] Help

2010-07-13 Thread Dave Angel


Dipo Elegbede wrote:

I was trying to write a code that prints prime numbers between 1 and 20.

I have by myself seen that something is wrong with my code and also my
brain.

Could anyone be kind enough to tell me what to do

Where I am confused is how to test for other numbers without one and the
number itself. It turns out that all numbers pass the condition I set so
that would presuppose that all numbers are prime which is not.

How exactly can I get it to run checks with other numbers such that it
doesn't include the number itself and 1.

The code is as follows:

for i in range(1,20):
    if float(i) % 1 == 0 and float(i) % i == 0:
        print i, 'is a prime number'


  
Break the problem down.  Instead of solving "print all the primes 
from 1 to 20", first solve "is a given number prime?".


Then, once you have a solution to that one, write a loop that calls it 20 
times, printing its conclusions.


So suppose you have the number 12.  How would you manually decide if 
it's prime?  You'd find the remainder for all the numbers between 2 and 
11, inclusive, and if *any* of those came out zero, you'd say it's not 
prime.


Write a function isprime() that expresses exactly that, returning False 
if any of the modulos came out zero, and True if they're all okay.  The 
function will have a loop, and inside the loop have a single if statement.


Test the function by calling it explicitly with various values.  Then 
when you're comfortable with that, solve the bigger problem as stated.
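
Once you've had a go yourself, here is one possible shape for it to compare against, as a sketch in Python 3. Treating anything below 2 as not prime is my addition; the rest follows the description above: one loop, one if statement.

```python
def isprime(n):
    """Return True if n is prime, using trial division by 2 .. n-1."""
    if n < 2:
        return False        # 0, 1 and negatives are not prime
    for divisor in range(2, n):
        if n % divisor == 0:
            return False    # found an exact divisor, so not prime
    return True


# The bigger problem: call it for each number from 1 to 20.
primes = [i for i in range(1, 21) if isprime(i)]
print(primes)
```

Stopping the inner loop at the square root of n would be faster, but that's an optimization for later.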


DaveA


Re: [Tutor] Global name not found, though clearly in use

2010-07-14 Thread Dave Angel


Corey Richardson wrote:
The entirety of 
my (incomplete and buggy) code is now available here: 
http://pastebin.com/QTNmKYC6  ..

Hmm..If I add a few debugging lines like that into my code, I get this:

Starting program
In class Hangman
done defs in class
eWordEntryBox defined
Exception in Tkinter callback
Traceback (most recent call last):
 File "C:\Python31\lib\tkinter\__init__.py", line 1399, in __call__
   return self.func(*args)
 File "C:/Users/Corey/Desktop/HangmanApp.py", line 48, in getLetter
   self.guess = eWordEntryBox.get()
NameError: global name 'eWordEntryBox' is not defined


  
Why do you indent the main code in this file?  In particular, you've 
defined the lines starting at


 top = tk.Tk()
 F = tk.Frame(top)
 F.pack()

as part of the class, rather than at top-level.  Therefore the 
eWordEntryBox is a class attribute, rather than a global symbol.


I think you need to unindent those lines, so they'll be module-level code.
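
A stripped-down illustration of the same effect, without Tkinter. This is a sketch for Python 3, and entry_box and the method names are made up for the example rather than taken from your file:

```python
class Hangman(object):
    # Indented inside the class, so this is a class attribute, not a global.
    entry_box = "entry box"

    def get_letter_broken(self):
        # Looks the name up as a local, then a global: raises NameError.
        return entry_box

    def get_letter(self):
        # Reaching it through the instance (or the class) works.
        return self.entry_box
```

Calling Hangman().get_letter_broken() fails with a NameError much like the one in your traceback, while Hangman().get_letter() finds the attribute.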

There are a bunch more problems with the code, starting with the fact 
that you never instantiate a Hangman instance, and continuing to missing 
self parameters on some of the methods.


DaveA




Re: [Tutor] A file containing a string of 1 billion random digits.

2010-07-17 Thread Dave Angel


Richard D. Moores wrote:

That's the goal of the latest version of my script at
. The best I've been able to do
so far is a file with 800 million digits.

But it seems the writing of 800 million digits is the limit for the
amount of memory my laptop has (4 GB). So my question is, how can I do
this differently? I'm pretty brand new to opening and writing files.
Here, I can't write many shorter lines, because the end result I seek
is one long string. But am I correct?

I'd appreciate any advice.

BTW line 29 was added after getting the outputs noted at the bottom of
the script. Using close() does seem to shorten the time it takes for
my laptop to become usable again, but it's not decisive. Sometimes I
have to reboot in order to get healthy again. (64-bit Vista).

BTW2 It's probably not obvious why the list comprehension (line 19)
has random.choice(d) where d is '0123456789'. Without that, the random
ints of 1000 digits would never begin with a '0'. So I give them a
chance to by prefixing one random digit using choice(d), and cutting
the length of the rest from 1000 to 999.

Dick

  
Your code is both far more complex than it need be, and inaccurate in 
the stated goal of producing random digits.


There's no need to try to have all the "digits" in memory at the same 
time.  As others have pointed out, since you're using write() method, 
there's no newline automatically added, so there's no problem in writing 
a few bytes at a time.


Your concern over a leading zero applies equally to two leading zeroes, 
or three, or whatever.  So instead of using str() to convert an int to a 
string, use string formatting.  Either the % operator or the format() 
method.  Just use a format that guarantees zero-filling to the same 
width field as the size limit you supplied.


So generate a number between 0 and 99, or whatever, and convert 
that to a fixed width string possibly containing leading zeros.  Write 
the string out, and loop back around.
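
Something like the following sketch, in Python 3. The function name, the default width of 6 digits per write, and the file name in the usage note are arbitrary choices of mine, not taken from your script.

```python
import random


def write_random_digits(path, total_digits, width=6):
    """Write total_digits random decimal digits to path, a few at a time."""
    written = 0
    with open(path, "w") as f:
        while written < total_digits:
            n = min(width, total_digits - written)
            number = random.randrange(10 ** n)      # 0 .. 10**n - 1
            # "%0*d" zero-fills to n characters, so leading zeros survive.
            f.write("%0*d" % (n, number))
            written += n
```

For example, write_random_digits("digits.txt", 1000000000) never holds more than a handful of digits in memory at once, although a larger width would cut down the number of write() calls.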


DaveA

