from:"Danny Yoo"

Re: [Tutor] gzip (fwd)

2004-12-02 Thread Danny Yoo

Hi Ramkumar,

I'm forwarding your message to Python-tutor; in your replies, please make
sure that you are using the "reply-to-all" feature in your email client.
This will allow your response to reach the others on the tutor list.
Don't rely on me alone: give the community the chance to help you.

I don't have enough information to pinpoint what the problem is yet.  I'll
have to ask more questions, and others on the Tutor list will probably
also ask a few questions.  Please try to answer them, because that will
help us give a better idea of what the problem is.


It looks like you are trying to zip up whole directories.  Does the
program work if you zip up single files?

It also appears that you're using the '-q' and '-r' options of the 'zip'
command line utility.  '-q' stands for 'quiet' mode, and although that's
nice when the command is working properly, it's not helpful when you're
debugging a situation.  Try turning quiet mode off, so that you have a
better chance of getting good error output from the zip command.  Even
better, try enabling verbose mode, so you can see better what 'zip' is
attempting to do.

Do you see anything else when you execute the program?  Does anything else
come out of standard error?
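
If it helps, here is one way to look at the exit status and standard error from inside Python, rather than only comparing os.system()'s result against 0.  This is just a sketch and not part of your program: 'run_and_report' is a made-up helper name, and it relies on subprocess.run(), which is newer than the Python this thread was written against.

```python
import subprocess
import sys


def run_and_report(argv):
    """Run a command and return (exit status, stdout text, stderr text).

    Hypothetical debugging helper; argv is a list such as
    ['zip', '-r', 'backup.zip', 'some_dir'].
    """
    completed = subprocess.run(argv, capture_output=True, text=True)
    return completed.returncode, completed.stdout, completed.stderr


if __name__ == '__main__':
    # Stand-in command so the sketch runs anywhere: a child Python that
    # complains on standard error and exits with a non-zero status.
    demo = [sys.executable, '-c',
            "import sys; sys.stderr.write('something went wrong\\n'); sys.exit(3)"]
    status, out, err = run_and_report(demo)
    print('exit status:', status)
    print('standard error:', err.strip())
```

With the real program, the zip command would be passed as a list without '-q', and whatever zip complains about should show up in the third value.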


Good luck to you.


-- Forwarded message --
Date: Fri, 3 Dec 2004 10:24:15 +0600
From: Ramkumar Parimal Alagan <[EMAIL PROTECTED]>
To: Danny Yoo <[EMAIL PROTECTED]>
Subject: Re: [Tutor] gzip

This is what i intend to do:

1. The files and directories to be backed up are given in a list.
2. The backup must be stored in a main backup directory.
3. The files are backed up into a zip file.
4. The name of the zip archive is the current date and time.

the coding:

__

import os
import time

source = ['D:\\down', 'D:\\Pics']

target_dir = 'D:\\backup'

target = target_dir + time.strftime('%Y%m%d%H%M%S') + '.zip'

zip_command = "zip -qr '%s' %s" % (target, ' '.join(source))

if os.system(zip_command) == 0:
    print 'Successful backup to', target
else:
    print 'Backup FAILED'

_

result : Backup FAILED

whats wrong ?

___
Tutor maillist  -  [EMAIL PROTECTED]
http://mail.python.org/mailman/listinfo/tutor

Re: [Tutor] comapring lists

2004-12-03 Thread Danny Yoo

On Thu, 2 Dec 2004, Jacob S. wrote:

> If you or anybody else is interested, I've written a script for codes like
> kids in junior high use to write notes to each other with...

Hi Jacob,

Cool!  Do you mind if I make some comments on the code?  If you do mind...
um... skip this message.  *grin*

The main body of the program feels a bit too long: it screams to be broken
into a few helper functions.  I see that there are two critical variables
that are used to figure out which part of the program comes next:

###
unordo = raw_input('Are we going to decipher or cipher? ')
type = raw_input('Which type of code would you like? ').lower()

if type == 'mixed letters':
    if unordo == 'cipher':
        # ... do mixed letter ciphering
    if unordo == 'decipher':
        # ... do mixed letter deciphering
if type == 'insideout':
    if unordo == 'cipher':
        # ... do insideout ciphering
    if unordo == 'decipher':
        # ... do insideout deciphering
# ... rest of the program follows similar structure
###

In a case like this, we can break the program into separate functions,
like this:

###
def dispatchOnTypeAndUnordo(type, unordo):
    if type == 'mixed letters':
        if unordo == 'cipher':
            mixedLetterCipher()
        if unordo == 'decipher':
            mixedLetterDecipher()
    if type == 'insideout':
        if unordo == 'cipher':
            insideOutCipher()
        if unordo == 'decipher':
            insideOutDecipher()
    # ... rest of the program follows similar structure
###

We make each 'body' of the inner "if"'s into their own functions, like
'mixedLetterCipher()'.  This restructuring doesn't improve the program's
performance at all, but it does help readability: the main improvement is
to make the overall shape of the program all visible at once.

This structural change also paves the way for a "table-driven" way to
implement a decision tree.  Experienced programmers really try to avoid
code that looks like "if/if/if/if/if..." because that's probably some kind
of repeating structure that we can take advantage of.

The logic in the function dispatchOnTypeAndUnordo() has an internal rhythm
that we can capture as a data structure.  Here's a dictionary that tries
to capture the essentials of the beat:

###
dispatchTable = { 'mixed letters': (mixedLetterCipher,
                                    mixedLetterDecipher),
                  'insideout': (insideOutCipher,
                                insideOutDecipher),
                  ## ... rest of the dictionary follows similar structure
                }
###

[Note: the values in this dictionary --- the function names --- are
intentionally without parentheses.  We don't want to "call" the functions
just yet, but just want to store them off.]

If we have a dispatch table like this, then the body of
dispatchOnTypeAndUnordo() magically dissolves:

###
def dispatchOnTypeAndUnordo(type, unordo):
    (cipherFunction, decipherFunction) = dispatchTable[type]
    if unordo == 'cipher':
        cipherFunction()
    elif unordo == 'decipher':
        decipherFunction()
###

This is a "table-driven" or "data-driven" approach.  The choices available
to the program have been migrated away from the explicit, separate 'if'
statements, and now really live as part of the 'dispatchTable' dictionary.
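
To make the idea concrete, here is a small self-contained sketch of the table-driven version.  The four function bodies are placeholders I've made up (the real work lives in Jacob's script), the parameter is called 'codeType' only to avoid shadowing the builtin 'type', and the error handling for unknown choices is my own addition.

```python
# Placeholder bodies: stand-ins for the real cipher routines.
def mixedLetterCipher():
    return 'mixed letters: cipher'


def mixedLetterDecipher():
    return 'mixed letters: decipher'


def insideOutCipher():
    return 'insideout: cipher'


def insideOutDecipher():
    return 'insideout: decipher'


# Each key maps to a (cipher, decipher) pair of function objects.
# Note: no parentheses, so nothing is called while the table is built.
dispatchTable = {
    'mixed letters': (mixedLetterCipher, mixedLetterDecipher),
    'insideout': (insideOutCipher, insideOutDecipher),
}


def dispatchOnTypeAndUnordo(codeType, unordo):
    """Look up the pair for codeType, then pick one by unordo."""
    if codeType not in dispatchTable:
        raise ValueError('unknown code type: %r' % (codeType,))
    cipherFunction, decipherFunction = dispatchTable[codeType]
    if unordo == 'cipher':
        return cipherFunction()
    elif unordo == 'decipher':
        return decipherFunction()
    raise ValueError('expected cipher or decipher, got %r' % (unordo,))


if __name__ == '__main__':
    print(dispatchOnTypeAndUnordo('insideout', 'decipher'))
```

Adding a new kind of code then means adding one entry to the dictionary; the dispatch function itself doesn't have to change.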

Does this make sense so far?  Please feel free to ask questions.


Re: [Tutor] Help

2004-12-03 Thread Danny Yoo

On Thu, 2 Dec 2004, Daniel wrote:

> I need to know how to run another module through one

Hi Daniel,

Have you had a chance to look at one of the tutorials on:

http://www.python.org/moin/BeginnersGuide_2fNonProgrammers

I'm guessing that you're trying to use modules; most of the tutorials show
examples of modules in action.  Here's a link to one of them:

http://www.freenetpages.co.uk/hp/alan.gauld/tutfunc.htm

I could be completely misunderstanding your question though; if I am,
please be patient with me; I need to sleep more.  *grin*

> what is the command do I have to include a path name
> and if there is any special way I have to save the file

I'm not quite sure I understand your other questions yet.  I'm guessing
that this is related to your first question, but I'm still slightly stuck.
Can you tell us what you are trying to do?

Good luck!


Re: [Tutor] unittest.makeSuite question

2004-12-03 Thread Danny Yoo

On Thu, 2 Dec 2004, Kent Johnson wrote:

> makeSuite() creates a TestSuite object that consolidates all the test
> methods of the class named by the first argument. Test methods are
> identified by having names starting with the second argument.

[text cut]

The unit testing framework is closely related to a similar system written
in Java called "JUnit", and the JUnit folks have written a nice tutorial
on the concepts behind their framework:

   http://junit.sourceforge.net/doc/testinfected/testing.htm

I highly recommend the article... even if the programming language syntax
seems a bit stuffy.  *grin*

The authors do a good job of explaining the core ideas behind unit testing.
One is that a "suite" is a collection of tests.  With a suite, we can glue
a bunch of tests together.

Being able to aggregate tests doesn't sound too sexy, but it's actually
very powerful.  As a concrete example, we can write a module that
dynamically aggregates all unit tests in a directory into a single suite:

###
"""test_all.py: calls all the 'test_*' modules in the current
directory.

Danny Yoo ([EMAIL PROTECTED])
"""

import unittest
from glob import glob
from unittest import defaultTestLoader

def suite():
    s = unittest.TestSuite()
    for moduleName in glob("test_*.py"):
        if moduleName == 'test_all.py': continue
        testModule = __import__(moduleName[:-3])
        s.addTest(defaultTestLoader.loadTestsFromModule(testModule))
    return s

if __name__ == '__main__':
    unittest.TextTestRunner().run(suite())
###

Here, we search for all the other test classes in our relative vicinity,
and start adding them to our test suite.  We then run them en-masse.  And
now we have a "regression test" that exercises all the test classes in a
package.
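
For anyone who wants to see the "glue" step without a directory full of test modules, here is a smaller sketch that builds a suite from two test classes defined in the same file.  The two classes and their tests are made up purely for illustration.

```python
import unittest


class TestAddition(unittest.TestCase):
    def test_small_numbers(self):
        self.assertEqual(1 + 1, 2)


class TestStrings(unittest.TestCase):
    def test_upper(self):
        self.assertEqual('tutor'.upper(), 'TUTOR')


def suite():
    """Glue the tests from both classes into a single suite."""
    loader = unittest.defaultTestLoader
    s = unittest.TestSuite()
    for testClass in (TestAddition, TestStrings):
        s.addTest(loader.loadTestsFromTestCase(testClass))
    return s


if __name__ == '__main__':
    unittest.TextTestRunner().run(suite())
```

Since a suite can itself be added to another suite, the same trick scales from classes to modules to whole packages.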

Hope this helps!


Re: [Tutor] Python 2.4 IDLE Windows 2000

2004-12-03 Thread Danny Yoo



> >Yowza; that's some bug.  Danny, do you happen to know the bug number?
> >I can't find it on sourceforge.
>
> It's been like that since 2.3 as far as I know. It generates a
> connection to localhost to run code in a separate environment.

Hi Mike,

I wish I knew what the problem was in better detail.  IDLE is a
part of the Standard Library, so it's actually possible to try turning on
individual pieces of it, one after the other.  Maybe that will help us
debug what's going on.

Start up your console version of Python, and try:

###
>>> import idlelib.PyShell
>>> idlelib.PyShell.main()
###

That should start IDLE up manually, and if anything bad happens, at least
we should see some errors pop up that will help us debug the situation.
Let's make sure this doesn't fail quietly.  *grin*


Good luck to you!


Re: [Tutor] Python 2.4 IDLE Windows 2000

2004-12-03 Thread Danny Yoo

On Fri, 3 Dec 2004, Mike Hansen wrote:

> That rooted out the problem. A while ago, I changed the colors to kind
> of match my VIM color theme(ps_color). When I did
> idlelib.PyShell.main(), IDLE came up with my custom color theme.
> However, there was a bunch of warnings about my theme. From IDLE, I
> deleted the theme. Now IDLE will launch normally. I'll set up the color
> theme later. Maybe older color themes aren't compatible with the newer
> IDLE? The color theme must have been laying around. I didn't brute force
> it in or anything like that.

Hi Mike,

Ah, whew, I'm glad that actually worked.  *grin*

The information about the color theme problem is valuable to know: can you
send a message to the IDLE developers with a summary of the situation?
It's possible that a lot of other folks might be running into a similar
startup problem.  Let's make sure that no one else has to go through hoops
to get IDLE working again.  The IDLE development list is:

http://mail.python.org/mailman/listinfo/idle-dev

Good luck to you!


Re: [Tutor] Upgrade to 2.4

2004-12-05 Thread Danny Yoo

On Sat, 4 Dec 2004, William Allison wrote:

>  I compiled Python 2.3.4 from source, but now I would like to upgrade to
> 2.4.  There doesn't seem to be a "make uninstall" target for 2.3.4.
> Will compiling 2.4 overwrite the older version, or will I have two
> versions of Python on my system?

Hi Will,

According to the README, you can install Python 2.4 in a way that doesn't
overwrite your older version of Python.  Here's a snippet from the README:

"""
If you have a previous installation of Python that you don't
want to replace yet, use

make altinstall

This installs the same set of files as "make install" except it
doesn't create the hard link to "python<version>" named "python" and
it doesn't install the manual page at all.
"""

This should install '/usr/local/bin/python2.4', but otherwise, it should
leave the rest of your Python 2.3.4 installation intact.

Hope this helps!


Re: [Tutor] eval and exec

2004-12-05 Thread Danny Yoo


> I don't want to introduce insecurity.  But also I want to really
> understand what the problem is -- especially because I teach python.

Hi Marilyn,


Here is an example of a string that can cause a stack overflow (Python
reports it as a "maximum recursion depth exceeded" error):

###
s = "(lambda loop: loop(loop)) (lambda self: self(self))"
eval(s)
###

The string 's' here looks funky, but in effect, its definition is an
infinite loop in heavy lambda disguise.  (Well, it would have been
infinite if Python had tail call optimization... *grin*)


The problem about eval() is that it's deceptively powerful: a single
expression in a language might seem like a small thing.  But as soon as we
allow someone the ability to evaluate a single arbitrary expression, we've
basically given them the ability to do practically anything in Python.
eval() is THAT POWERFUL.


Here's another example:

###
def myint(x):
    """Makes an integer out of x."""
    return eval(x)

print myint("41") + 1
print myint("42 and __import__('os').system('tail /etc/passwd')")
###



> And I can't see the security problem, unless there's a security problem
> already, like if I allowed incoming email to dictate the parameters that
> I send through the socket.  The email provides data for argv[1:] but
> argv[0] is hard-coded.

The problem is one of capability.  At worst, a function like:

###
import sys

def myint(x):
    return int(x)

if __name__ == '__main__':
    print myint(sys.argv[1]) + 1
###

can raise an exception if given weird command line arguments, but it at
least doesn't give the caller the ability to run an arbitrary shell
command.  Contrast this situation to the version of myint() that uses
eval().


Does this make sense?  Please ask more questions on this if you have any:
using eval() is almost certainly not a good idea unless you really know
what you're doing.
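
Here is a small sketch of the contrast, written so that it runs on a current Python.  Both parsers below refuse the hostile string instead of running it.  'parseLiteral' is a made-up helper name, and ast.literal_eval() is a later standard-library addition that was not available when this thread was written; the hostile string uses a harmless 'echo' as a stand-in.

```python
import ast


def myint(x):
    """Makes an integer out of x without evaluating it as code."""
    return int(x)


def parseLiteral(text):
    """Hypothetical helper: accepts only Python literals (numbers,
    strings, lists, dicts, ...), never function calls or names."""
    return ast.literal_eval(text)


if __name__ == '__main__':
    print(myint("41") + 1)
    hostile = "42 and __import__('os').system('echo gotcha')"
    for parser in (myint, parseLiteral):
        try:
            parser(hostile)
        except ValueError as e:
            print(parser.__name__, 'refused the input:', e)
```

The point is the same as above: neither function gives the caller a way to run a shell command, no matter what string arrives.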


Re: [Tutor] eval and exec

2004-12-05 Thread Danny Yoo

On Sun, 5 Dec 2004, Alan Gauld wrote:

> > And I can't see the security problem, unless there's a security
> > problem already, like if I allowed incoming email to dictate the
> > parameters that I send through the socket.  The email provides data
> > for argv[1:] but argv[0] is hard-coded.
> >
> > And I don't see how web traffic can get there at all.
>
> You can generate a CGI call by typing the GET string straight into the
> address bar of the browser. If a smart user realises that some of the
> parameters are being interpreted they can (and often will) try to fake
> what the page generates, this could involve putting python commands,
> such as 'import os;os.system("rm -f /")' into the escape string...
>
> Equally if you embed Python in a program and allow users to type strings
> which are then exec() or eval()'d they could type a similar os.system()
> command. Or they might use print and dir to find variable names and
> manipulate those.

Hi Marilyn,

It pays to see a concrete example of an exploit that has occurred because
of exec/eval misuse.  For example, here's an old one from July 2002:

http://www.securityfocus.com/bid/5255/discussion/

Note that this one was in the Standard Library!  We'd expect that the
folks who implement the Standard Library should know what they are doing.
And if the Python implementors can have trouble using eval() safely, then
how much more should we be wary!

If you or your students are interested in security stuff, you may find
David Wheeler's guide on "Secure Programming for Linux and Unix HOWTO" a
good start:

  http://www.dwheeler.com/secure-programs/Secure-Programs-HOWTO/index.html

It contains a small section specifically for Python:

  http://www.dwheeler.com/secure-programs/Secure-Programs-HOWTO/python.html

I don't think that we should go completely crazy over security issues:
these issues are often so subtle that even experts get caught.  But even
so, I think we still have a responsibility to make sure the most
egregious security problems never come to fruition.  So that's why most
of us here will say eval() and exec() are evil.  *grin*

I hope this helps!


Re: [Tutor] Removing a row from a tab delimitted text

2004-12-06 Thread Danny Yoo

On Mon, 6 Dec 2004, kumar s wrote:

> Here is how my file looks:
>
> Name=3492_at
> Cell1=481 13 (The space between (481 and 13 is tab)
> Cell1=481 13
> Cell1=481 13
> Name=1001_at
> Cell1=481 13
> Cell2=481 12
> Cell1=481 13
> Cell1=481 13
> Cell2=481 12
> Name=1002_at
> Cell3=482 12
> Cell1=481 13
> Cell1=481 13
> Cell2=481 12
> Cell3=482 12
> Cell4=482 13
> Cell1=481 13
>
> My question:
>
> 1. How can I remove the line where Name identifier
> exists and get two columns of data.

Hi Kumar,

You may want to separate that question into two parts:

1. For a given file, how can I remove the lines where the Name identifier
   exists?

2. For a given file, how can I get two columns of data?

This separation means that you don't have to solve the whole thing at once
to see progress.

If you do problem 2 first, then you can "hardcode" the input to something
that problem 2 can deal with.  That is, you can take a smaller version of
your input file, and manually remove the 'name' lines.  That way, you can
still do problem 2 without getting stuck on problem 1.  And when you do get
problem 1 done, then you can just drop the 'hardcoded' test data.
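
As a rough sketch of what that separation might look like: the sample lines below are hardcoded from your message (with a real tab between the fields), and the two function names are made up.  I'm assuming "two columns" means splitting each line at the tab; adjust if you meant something else.

```python
# Hardcoded stand-in for the file: a few lines copied from the sample,
# with a real tab between the two fields.
sampleLines = [
    "Name=3492_at",
    "Cell1=481\t13",
    "Cell1=481\t13",
    "Name=1001_at",
    "Cell2=481\t12",
]


def dropNameLines(lines):
    """Problem 1: keep only the lines that do not start with 'Name='."""
    return [line for line in lines if not line.startswith("Name=")]


def splitColumns(lines):
    """Problem 2: split each line at the tab into its columns."""
    return [line.split("\t") for line in lines]


if __name__ == '__main__':
    for columns in splitColumns(dropNameLines(sampleLines)):
        print(columns)
```

Each function can be tried on its own against the hardcoded list before any real file is involved.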

What parts are you stuck on, and what have you tried so far?  Do you know
about using 'if' statements yet?  What do you know about list manipulation
so far?  What about string manipulation?

Please feel free to ask more questions.  Good luck!


Re: [Tutor] Python 2.3.5 out in January??

2004-12-07 Thread Danny Yoo

On Tue, 7 Dec 2004, Kent Johnson wrote:

> I think the idea is to back-port bugfixes from 2.4 to 2.3. Then that is
> the end of the line for 2.3.

Yes.  The policy for bugfix releases is in PEP 6:

http://www.python.org/peps/pep-0006.html

Here's what they say about bugfix releases:

"""
Bug fix releases are expected to occur at an interval of roughly
six months. This is only a guideline, however - obviously, if a
major bug is found, a bugfix release may be appropriate sooner. In
general, only the N-1 release will be under active maintenance at
any time. That is, during Python 2.4's development, Python 2.3 gets
bugfix releases. If, however, someone qualified wishes to continue
the work to maintain an older release, they should be encouraged.
"""

So that should help to explain Python 2.3.5 --- the Python developers just
want to make sure that the people who still use Python 2.3 are happy
campers.  *grin*


Re: [Tutor] embedding python in a C app on a linux box...

2004-12-07 Thread Danny Yoo

On Tue, 7 Dec 2004, Jason Child wrote:

> Ok, so I have a decent grasp of python and have coded quite a few
> scripts. I must say that the language rocks. I would like to embed
> python into a C app to provide some scripting support.

Hi Jason,

We have to first mention that most of us here are beginning Python
programmers; few of us have done Python/C integration, so we're probably
not the best people to ask for help.  You may want to ask your
extension-building questions on comp.lang.python; I'm sure the experts
there will be happy to help you.  That being said, we'll do what we can.

> It would seem that my problem lies with not understanding the
> environment (linux) well enough. First off, I must include the Python.h
> header explicitly (via #include "/usr/include/python2.3/Python.h"). How
> do I add the correct dir to the search path for the <> format?

This is controlled by adding a '-I/usr/include/python2.3' flag argument to
gcc, so that gcc adds that as part of its include path search.

> Second, when I call Py_Initialize() I get:
>
> /tmp/ccXZxNHZ.o(.text+0x11): In function `main':
> : undefined reference to `Py_Initialize'
> collect2: ld returned 1 exit status
>
> Now, I think it is because the linker isnt finding the right lib to
> attach. Is there a switch to use for gcc for make it? -L /path/to/libs
> perhaps?

You'll probably need '-lpython' (the library name may carry a version,
such as '-lpython2.3', depending on how Python is installed) so that it
links Python to your executable.  The uppercase '-L' flag is something
else: it controls where gcc looks for additional libraries, and I think it
automatically includes '/usr/lib' by default.

You may find the stuff near the bottom of:

http://docs.python.org/ext/building.html

useful: it shows an example 'gcc' call that has all the flags that one
needs to get an extension built in Linux.

There's also an example of an embedded application that's in the Python
source tree.  It's under the Demo/embed directory, and may be a good
starting-off point.

But again, try asking your question on comp.lang.python.  I have to admit
that I haven't done embedding much, so there may be a better way to infer
those 'gcc' flags without hardcoding them in some kind of Makefile.
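
One possibility, offered only as a sketch: ask the Python installation itself where its headers and library live.  The 'embedFlags' helper below is made up, it uses the 'sysconfig' module (which arrived in later Python releases than the one in this thread), and I can't promise the flags it prints are complete for any particular system.

```python
import sysconfig


def embedFlags():
    """Hypothetical helper: guess compiler and linker flags for embedding
    the Python that is running this script."""
    includeDir = sysconfig.get_paths()['include']
    libDir = sysconfig.get_config_var('LIBDIR')   # may be None on some platforms
    version = (sysconfig.get_config_var('LDVERSION')
               or sysconfig.get_python_version())
    flags = ['-I' + includeDir]
    if libDir:
        flags.append('-L' + libDir)
    flags.append('-lpython' + str(version))
    return flags


if __name__ == '__main__':
    print(' '.join(embedFlags()))
```

The printed line could then be pasted into, or generated for, the 'gcc' call instead of writing the paths by hand.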

Good luck to you!


Re: [Tutor] Connecting to interactive program

2004-12-07 Thread Danny Yoo

On Tue, 7 Dec 2004, Vincent Nijs wrote:

> Has anyone ever tried to send commands to a running interactive python
> session from, say, the command line or another app?

Yes.  This sort of thing can be done through an "expect" script.

http://expect.nist.gov/

There's a port of expect for Python:

http://pexpect.sourceforge.net/

Out of curiosity though, do you really need to run Python interactively?


Re: [Tutor] Problem with python2.4.

2004-12-09 Thread Danny Yoo

On Thu, 9 Dec 2004, Jacob S. wrote:

> Nothing I can do can fix my problem. It appears as though
> pythonw.exe is not working properly in the python 2.4 distribution.

[some text cut]

> The symptoms: I click on edit with
> idle--which runs the command "C:\python24\pythonw.exe"
> "C:\python24\lib\idlelib\idle.pyw" -n -e "%1" --then the computer thinks for
> a bit and is silent.

Hi Jacob,

From the symptom report, I would not automatically suspect pythonw: I'd
instead look at IDLE.

You might be running into the same kind of issue that Mike Hansen was
running into just a few days ago:

http://mail.python.org/pipermail/tutor/2004-December/033672.html

We were able to diagnose the problem by doing this:

http://mail.python.org/pipermail/tutor/2004-December/033726.html

Can you try the same procedure?  Even if it doesn't work, the information
that Python responds with should give us a better idea of what is going
on.  Try opening up a regular console version of Python, and see what
happens when you do:

###
>>> import idlelib.PyShell
>>> idlelib.PyShell.main()
###

Good luck to you!


Re: [Tutor] cgi with system calls

2004-12-14 Thread Danny Yoo

On Tue, 14 Dec 2004, Nik wrote:

> ok, problem solved (well, partly - at least as much as I need it to be).
>
> status = os.system('ps') doesn't set status equal to the output text, it
> sets it to the return of the call (in this case '0'). What I really want
> to do is
>
> status = os.popen('ps').read()
> print status
>
> which works fine. However, why the first version confused the web server
> I don't know, I guess the output from ps went somewhere it wasn't
> supposed to.

Hi Nik,

Yes, the output from 'ps' went to standard output.

In fact, it turns out that Python does some internal buffering on its
output.  The external commands, though, flush their own output when
they're done.  That's at the heart of the problem you were seeing.

With the program:

###
print "Content-type: text/plain\n\n"
status = os.system(cmd)
###

The output from the os.system() came out first, and then the content-type
header.  So your web server got to see something like:

###
PID TTY  TIME CMD
3649 pts0 00:00:00 su
3652 pts0 00:00:00 bash
5197 pts0 00:00:00 ps
Content-type: text/plain

###
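
A sketch of two ways around the ordering problem: flush Python's own buffer before the external command runs, or (as in Nik's popen() fix) capture the command's output and write it out yourself.  The example below does both.  'printCgiResponse' is a made-up name, subprocess.run() is newer than the Python used in this thread, and a child Python process stands in for 'ps' so the example runs anywhere.

```python
import subprocess
import sys


def printCgiResponse(argv):
    """Write the header, flush it, then write the command's output."""
    sys.stdout.write("Content-type: text/plain\n\n")
    sys.stdout.flush()          # push the header out before the child runs
    completed = subprocess.run(argv, capture_output=True, text=True)
    sys.stdout.write(completed.stdout)


if __name__ == '__main__':
    # Stand-in for 'ps' so the sketch runs on any platform.
    printCgiResponse([sys.executable, '-c', "print('PID TTY TIME CMD')"])
```

Either change alone should be enough to put the header first; the flush is the one that matters if the child writes directly to standard output, as it does under os.system().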

Hope this helps!


RE: [Tutor] Regexp Not Matching on Numbers?

2004-12-14 Thread Danny Yoo



On Tue, 14 Dec 2004, Gooch, John wrote:

> I am used to ( in Perl ) the entire string being searched for a match
> when using RegExp's. I assumed this was the way Python would do it too,
> as Java/Javascript/VbScript all behaved in this manner. However, I found
> that I had to add ".*" in front of my regular expression object before
> it would search the entire string for a match. This seems a bit unusual
> from my past experience, but it solved the issue I was experiencing.

Hi John,


The question actually comes up a lot.  *grin*  If you're interested,
here are two more references that talk about "match() vs search()":


Python Standard Library docs on 're' module:
http://www.python.org/doc/lib/matching-searching.html


Regular Expression HOWTO:
http://www.amk.ca/python/howto/regex/regex.html#SECTION00072
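
In short: match() only tries at the start of the string, while search() scans forward for the first place the pattern fits.  A tiny sketch, using a made-up sample string:

```python
import re

pattern = re.compile(r'\d+')
text = 'backup_20041214.zip'   # made-up sample string

# match() anchors at position 0, where there are no digits.
print(pattern.match(text))

# search() looks through the whole string for the first match.
print(pattern.search(text).group())

# The ".*" workaround makes match() succeed by swallowing the prefix.
print(re.match(r'.*\d+', text) is not None)
```

So search() is usually the closer equivalent of what Perl does by default.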


Good luck to you!


Re: [Tutor] sorting and editing large data files

2004-12-16 Thread Danny Yoo

On Thu, 16 Dec 2004, Scott Melnyk wrote:

> I recently suffered a loss of programming files (and I had been
> putting off my backups...)

Hi Scott,

[Side note that's not really related to Python: if you don't use a version
control system to manage your software yet, please learn to use one.
There's a very good one called Subversion:

http://subversion.tigris.org/

A good version control system is invaluable to programmers, since many
programs are not really written once, but are rather maintained and
revised for a long time.]

Rich Krauter already caught the bug that was occurring: the intersection()
method of sets produces a brand new set, rather than do a mutation on the
old set.
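
For others following along, here is the difference in miniature, using the builtin set type and made-up numbers:

```python
a = set([1, 2, 3, 4])
b = set([3, 4, 5])

common = a.intersection(b)    # brand new set; 'a' is untouched
print(sorted(common))
print(sorted(a))

a.intersection_update(b)      # this one mutates 'a' in place
print(sorted(a))
```

If the result of intersection() isn't assigned to anything, it is simply thrown away, which is the kind of bug Rich spotted.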

Here are some more comments on the programs you've shown us:

> A sample of the file format is:
>
> >ENSE1384652.1|ENSG0166157.5|ENST0359693.1
> assembly=NCBI35|chr=21|strand=reverse|bases 10012801 to 10012624|exons
> plus upstream and downstream regions for exon
> GCGGCCGTTCAAGGCAGCCGTCTCCGAGCGGCCCAAGGGAGGGCACAACAGCTGCTACCTGAACAGTTTCTGACCCAACAGTTACCCAGCGCCGGACTCGCTGCGGGCGGCTCTAGGGACGGCGCCTACACTTAGCTCCGCGCCCGAGGTGAGCCCAG

Ah, ok, so this is a FASTA file.

(For others on the list, see:

http://ngfnblast.gbf.de/docs/fasta.html

for a description of the FASTA format.)

> ENSExxx is an exon id tag  followed by a ENSGxxx gene id tag then
> a ENSTxxx transcript id tag followed by information about the
> location of exon.

[text cut]

Ok, so it sounds like your program will mostly pay attention to each
description line in the FASTA file.

> In order to visually understand the data better I made a script to
> organize it into sections with Gene ID, then Trancript ID followed by
> the different Exon IDs like so:

[lots of text cut]

There's one big assumption in the code to OrganizeData.py that may need to
be explicitly stated: the code appears to assume that the sequences are
already sorted and grouped in order, since the code maintains a variable
named 'sOldGene' and maintains some state on the last FASTA sequence that
has been seen.  Is your data already sorted?

As a long term suggestion: you may find it easier to write the extracted
data out as something that will be easy to work with for the next stages
of your pipeline.  Human readability is important too, of course, so
there's a balance necessary between machine and human convenience.

If possible, you may want to make every record a single line, rather than
have a record spread across several lines.  Your program does do this in
some part, but it also adds other output, like announcements to signal the
start of new genes.  That can complicate later stages of your analysis.

Eric Raymond has summarized the rules-of-thumb that the Unix utilities try
to follow:

http://www.faqs.org/docs/artu/ch05s02.html#id2907428

As a concrete counterexample, the 'BLAST' utility that bioinformaticians
use has a default output that's very human readable, but so ad-hoc that
programs that try to use BLAST output often have to resort to fragile,
hand-crafted BLAST parsers.  The situation's a lot better now, since newer
versions of BLAST finally support a structured format.

So I'd strongly recommend dropping the "NEW GENE", "end of gene group",
and "NEW TRANSCRIPT" lines out of your output.  And if you really want to
keep them, you can write a separate program that adds those notes back
into the output of OrganizeData.py for a human reader.

If you drop those decorations out, then the other parts of your pipeline
can be simplified, since you can assume that each line of the input file
is data.  The other stages of your analysis pipeline appear to try to
ignore those lines anyway, so why put them in?  *grin*
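
As a sketch of what "one record per line" could look like: the code below turns a description line into a single tab-separated record of gene, transcript, and exon ids.  The function names are made up, and the field order is taken from the sample line quoted earlier in this message; I haven't seen the rest of your data, so treat the format as an assumption.

```python
def parseDescription(line):
    """Split a description line such as '>EXON|GENE|TRANSCRIPT' into
    a (gene, transcript, exon) tuple.  Hypothetical helper."""
    exon, gene, transcript = line.lstrip('>').strip().split('|')
    return gene, transcript, exon


def toRecord(line):
    """Render one description line as a single tab-separated record."""
    return '\t'.join(parseDescription(line))


if __name__ == '__main__':
    print(toRecord('>ENSE1384652.1|ENSG0166157.5|ENST0359693.1'))
```

A later stage can then read the file back with a plain split on tabs, with no decoration lines to skip.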


Re: [Tutor] least squares

2004-12-16 Thread Danny Yoo

On Thu, 16 Dec 2004, mdcooper wrote:

> I am trying to run a least squares program (written by Konrad Hinsen)

Hi Matthew,

You're assuming that we know who Konrad Hinsen is.  *grin* Ok, when you're
referring to the least-squares code, are you referring to a module in
Scientific Python?

Please point us to the code that you're running, and we can give better
help.  Most of us are just learning Python, so many of us may not really
be familiar with the tools that you are using.  Make it easier for us to
help you!  *grin*

> but I would like to only have positive values returned. Can anyone help
> is altering the code to do this?

This shouldn't be too bad: it sounds like a filtering of the return value
is what you're looking for.  Show us a concrete example of what you're
getting so far, and what you want to get.

Good luck to you.


RE: [Tutor] least squares

2004-12-17 Thread Danny Yoo

On Thu, 16 Dec 2004, mdcooper wrote:

> I am trying to get a correlation between a large number of variables and for
> many similar equations :
>(Ca1 * xa^2) + (Ca2 * ya^2) + (Ca3 * za^2) + ... = ta
>(Cb1 * xb^2) + (Cb2 * yb^2) + (Cb3 * zb^2) + ... = tb
>
> which is solved to get:
>(C1 * x^2) + (C2 * y^2) + (C3 * z^2) + ... = t
>
> where the submitted values of x,y, and z should give the correct t
>
> and I am using the code to get C1, C2, C3, ...  These constants are not
> allowed to be negative and the submitted equations are such that there
> is no reason for the values to be negative, and although a good fit may
> allow them to be negative, another good fit will be found if they are
> all positive.

Hi Matt,

Ok.  I have to admit that my mathematical maturity is actually a little
low, but I'll try to follow along.  This really doesn't sound like a
problem specific to Python, though; it's more of a math problem.

So you may actually want to talk with folks with a real math background; I
would not be mathematically mature enough to know if your code is correct.
I'd strongly recommend talking to folks on the 'sci.math' or
'sci.math.num-analysis' newsgroups.

I haven't seen your code, so I'm still in the dark about how it works or
what its capabilities are.  I'll make a few assumptions that might not be
right, but I have to start somewhere.  *grin*

Does your code provide a way of getting at all possible solutions, or only
one particular solution?  If it can produce all of them, then you can just
filter for the solutions that satisfy the property you want.

For example, we can produce a function that generates all squares in the
world:

###
>>> import itertools
>>> def allSquares():
...     for i in itertools.count():
...         yield i*i
...
>>> squares = allSquares()
>>> squares.next()
0
>>> squares.next()
1
>>> squares.next()
4
>>> squares.next()
9
>>> squares.next()
16
>>> squares.next()
25
###

I can keep calling 'next()' on my 'squares' iteration, and keep getting
squares.  Even though this can produce an infinite sequence of answers, we
can still apply a filter on this sequence, and pick out the ones that are
"palindromic" in terms of their digits:

###
>>> def palindromic(n):
...     "Returns true if n is 'palindromic'."
...     return str(n) == str(n)[::-1]
...
>>> filteredSquares = itertools.ifilter(palindromic, allSquares())
>>> filteredSquares.next()
0
>>> filteredSquares.next()
1
>>> filteredSquares.next()
4
>>> filteredSquares.next()
9
>>> filteredSquares.next()
121
>>> filteredSquares.next()
484
>>> filteredSquares.next()
676
###

This sequence-filtering approach takes advantage of Python's ability to
work on an iteration of answers.  If your program can be written to produce
an infinite stream of answers, and if a solution set with all positive
coefficients is inevitable in that stream, then you can take this
filtering approach, and just capture the first solution that matches your
constraints.
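
Here is a minimal sketch of that filtering idea applied to coefficients.  It
is written for Python 3, where filter() is lazy and next() is a builtin.  The
candidate_solutions() generator is a made-up stand-in for a real solver, not
anything from Matt's code, and all_nonnegative() is just one way to spell the
constraint:

```python
import itertools

def candidate_solutions():
    """Hypothetical stand-in for a solver that yields coefficient
    tuples (C1, C2, C3) forever."""
    for n in itertools.count():
        # Made-up candidates: early ones contain negative values.
        yield (n - 2, n - 1, n)

def all_nonnegative(coefficients):
    """Returns True if no coefficient is negative."""
    return all(c >= 0 for c in coefficients)

# filter() is lazy in Python 3, so it never tries to exhaust the stream.
first_ok = next(filter(all_nonnegative, candidate_solutions()))
print(first_ok)
```

This only terminates if an acceptable solution really does show up in the
stream, which is the "inevitable" condition mentioned above.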

Similarly, if your program only produces a single solution, does it do so
through an "iterative" algorithm?  By iterative, I mean: does it start off
with an initial guess and apply a process to improve that guess until the
solution is satisfactory?

For others on the Tutor list, here is an "iterative" way to produce the
square root of a number:

###
def mysqrt(x):
    guess = 1.0  ## initial guess
    while not goodEnough(guess, x):
        guess = improve(guess, x)
    return guess

def improve(guess, x):
    """Improves the guess of the square root of x."""
    return average(guess, x / guess)

def average(x, y):
    return (x + y) / 2.0

def goodEnough(guess, x):
    """Returns true if guess is close enough to the square root of x."""
    return abs(guess**2 - x) < 0.1
###

(adapted/ripped off from material in SICP:
http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-10.html#%_sec_1.1.7)

If your program tries to solve the problem through an iterative process,
then, again, does it inevitably produce a solution where all the constant
coefficients 'Cn' are positive?  If so, maybe you can just continue to
produce better and better solutions until one is satisfactory, that is,
until all the coefficients are positive.
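
As a rough sketch of "keep improving until the answer is acceptable", here is
a generic loop in Python 3.  The names refine_until, improve, and acceptable
are all invented for illustration; for the real problem, acceptable() would
check that every coefficient is positive.  The step limit is my own addition,
so that a process which never converges can't loop forever:

```python
def refine_until(guess, improve, acceptable, max_steps=1000):
    """Applies improve() repeatedly until acceptable(guess) holds.

    Raises RuntimeError if no acceptable guess turns up within
    max_steps improvement steps.
    """
    for _ in range(max_steps):
        if acceptable(guess):
            return guess
        guess = improve(guess)
    raise RuntimeError("no acceptable guess within %d steps" % max_steps)

# Usage with the square-root idea: approximate the square root of 2.
root = refine_until(
    1.0,
    improve=lambda g: (g + 2.0 / g) / 2.0,
    acceptable=lambda g: abs(g * g - 2.0) < 1e-9,
)
print(root)
```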

Otherwise, without looking at your code, I'm stuck.  *grin* And even if I
do see your code, I might be stuck still.  If your code is short, feel
free to post it up, and we'll see how far we can get.  But you really may
want to talk with someone who has a stronger math background.

Good luck to you!

___
Tutor maillist  -  [EMAIL PROTECTED]
http://mail.python.org/mailman/listinfo/tutor

Re: [Tutor] dbcp module

2004-12-17 Thread Danny Yoo

On Fri, 17 Dec 2004, Rene Bourgoin wrote:

> Yes i believe im looking for the python version of the Jakarta databse
> connection pool!!

Hi Rene,

I haven't looked at this too closely yet, but there are projects out there
for connection pools.  For example:

http://sqlrelay.sourceforge.net/

Some prominent Python projects, though, appear to use their own homebrewed
connection pools.  Zope appears to do this:

http://zdp.zope.org/projects/zfaq/faq/DatabaseIntegration/954522163

SQLObject maintains its own database pool:

http://wiki.sqlobject.org/connections

but also refers to 'DBPool.py':

http://jonpy.sourceforge.net/dbpool.html

I'm not sure if one database pooling solution has emerged as a dominant
one yet, though.

Good luck to you!


Re: [Tutor] dbcp module

2004-12-17 Thread Danny Yoo

On Fri, 17 Dec 2004, Rene Bourgoin wrote:

> Ive been learning to interact with databases using python and i was
> looking for ways to return a SELECT query result in a plain format. what
> i mean by plain format is :
>
> name numberaddress
> Fred Smith   2125553243 1 main st
>
> All the pratices ive done return the results in tuples or tuples within 
> tuples.
> (('fred smith','2125553243','1 main st'))
>

> I saw some examples on activestate that use the dbcp module

Hi Rene,

Ok, let's pause for a moment.

I think I understand where all the confusion is coming from: it's a
namespace issue, as well as a case of really really bad naming.

You mentioned earlier that:

> Yes i believe im looking for the python version of the Jakarta
> database connection pool

However, that is probably not what you're looking for.  'dbcp' in the
context of the recipe that you've shown us:

> http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/81189

has nothing to do with connection pools!

The author of that recipe has unfortunately named their module so as to
make it easy to confuse it with the Apache Commons DBCP project:

http://jakarta.apache.org/commons/dbcp/

But this is NOT the same 'dbcp' thing that the Python Cookbook recipe is
talking about.

The 'dbcp' Python Cookbook module refers to this snippet of code at the
beginning of the recipe:

##
"""This is dbcp.py, a module for printing out a cursor's output."""
def pp(cursor, data=None, rowlens=0):
    d = cursor.description
    if not d:
        return " NO RESULTS ###"
    names = []
    lengths = []
    rules = []
    if not data:
        t = cursor.fetchall()
    for dd in d:    # iterate over description
        l = dd[1]
        if not l:
            l = 12             # or default arg ...
        l = max(l, len(dd[0])) # handle long names
        names.append(dd[0])
        lengths.append(l)
    for col in range(len(lengths)):
        if rowlens:
            rls = [len(str(row[col])) for row in data if row[col]]
            lengths[col] = max([lengths[col]]+rls)
        rules.append("-"*lengths[col])
    format = " ".join(["%%-%ss" % l for l in lengths])
    result = [format % tuple(names)]
    result.append(format % tuple(rules))
    for row in data:
        result.append(format % row)
    return "\n".join(result)
##

So I think the confusion here is just more anecdotal support for how much
a short, badly chosen name can damage a program.  What bothers me is
that the code in the recipe itself shows a disregard for good variable
names.  What the heck does 't', 'pp', 'dd', or 'l' stand for, anyway?
*grin*
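
For comparison, here is a sketch of the same idea with longer names.  To be
clear, this is not the Cookbook recipe: it is a simplified rewrite in
Python 3, the names format_rows and default_width are my own, and it sizes
columns from the data rather than from the cursor description.  I'm using the
standard library's sqlite3 module with an in-memory table only so that the
example can run on its own:

```python
import sqlite3

def format_rows(cursor, rows=None, default_width=12):
    """Returns the cursor's result set as a plain-text table."""
    description = cursor.description
    if not description:
        return "(no results)"
    if rows is None:
        rows = cursor.fetchall()
    column_names = [column[0] for column in description]
    # Each column is as wide as its longest value, its name, or the default.
    widths = []
    for index, name in enumerate(column_names):
        cells = [len(str(row[index])) for row in rows]
        widths.append(max([default_width, len(name)] + cells))
    row_format = " ".join("%%-%ds" % width for width in widths)
    lines = [row_format % tuple(column_names)]
    lines.append(row_format % tuple("-" * width for width in widths))
    for row in rows:
        lines.append(row_format % tuple(row))
    return "\n".join(lines)

connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE people (name TEXT, number TEXT, address TEXT)")
connection.execute(
    "INSERT INTO people VALUES ('Fred Smith', '2125553243', '1 main st')")
cursor = connection.execute("SELECT name, number, address FROM people")
table = format_rows(cursor)
print(table)
```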


Re: [Tutor] dbcp module

2004-12-17 Thread Danny Yoo

On Fri, 17 Dec 2004, Kent Johnson wrote:

> The recipe you cite has the pp() function and an example of its use. It
> sounds like that is what you want.

Part of the pandemonium was my fault.  I completely missed your earlier
post here:

http://mail.python.org/pipermail/tutor/2004-December/034107.html

where, if I had been reading your initial response more closely, I would
have been able to catch the real reason why Rene and I were getting so
confused.  Sorry about that.


Re: [Tutor] maximum value in a Numeric array

2004-12-10 Thread Danny Yoo



On Fri, 10 Dec 2004, Ertl, John wrote:

> I am trying to get the maximum value in a 2-D array.  I can use max but
> it returns the 1-D array that the max value is in and I then I need to
> do max again on that array to get the single max value.
>
> There has to be a more straightforward way...I have just not found it.
>
> >>> b = array([[1,2],[3,4]])
> >>> max(b)
> array([3, 4])
> >>> c = max(b)
> >>> max(c)
> 4

Hi John,

According to:

http://stsdas.stsci.edu/numarray/numarray-1.1.html/node35.html#l2h-108

you can use the 'max()' method of an array:

###
>>> import numarray
>>> b = numarray.array([[1,2],[3,4]])
>>> b.max()
4
###
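
If the data happens to live in plain nested lists rather than an array, a
small sketch like the following (Python 3, standard library only) gets the
single largest value.  The variable names are just for illustration:

```python
rows = [[1, 2], [3, 4]]

# max(rows) would compare whole rows and give back a list,
# so walk every value of every row instead.
largest = max(value for row in rows for value in row)
print(largest)
```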


Hope this helps!


[Tutor] Fwd: python tutor needed

2010-04-21 Thread Danny Yoo

Forwarding for Lumka:

-- Forwarded message --
From: Lumka Msibi 
Date: Tue, Apr 20, 2010 at 2:51 PM
Subject: Fwd: python tutor needed
To: Danny Yoo , "dany.yoo" 

-- Forwarded message --
From: Lumka Msibi 
To: Danny Yoo , "dany.yoo" 
Date: Tue, 20 Apr 2010 20:51:41 +0200
Subject: Fwd: python tutor needed

-- Forwarded message --
From: Lumka Msibi 
To: Danny Yoo 
Date: Mon, 19 Apr 2010 16:25:19 +0200
Subject: python tutor needed
Hi

Are there any python tutors in Johannesburg, South Africa? i really
need some tution before my exam in 2 weeks. please help.

thank you

Re: [Tutor] Python 2.4 IDLE Windows 2000

2004-12-20 Thread Danny Yoo

On Sat, 18 Dec 2004, Jacob S. wrote:

> I think I'm going to burst out in tears. Mike Hansen gave the solution
> to the very problem I'm having, yet, when I used the console version and
> deleted the custom color theme, it stopped giving me error messages, but
> it still won't start up without the console.
>
> I'm am still stunted of my python 2.4 experience.

Hi Jacob,

Mike's bug submission into Sourceforge is getting attention; it's bug
1080387:

http://sourceforge.net/tracker/index.php?func=detail&aid=1080387&group_id=5470&atid=105470

I'm still concerned, though, about your case, since you're saying that
there are no error messages, but it's still not starting up from the icon
in your Start menu.

There should be a '.idlerc' directory in your home directory; can you try
moving or renaming it somewhere else temporarily?  Afterwards, try
starting IDLE.

But don't wipe that '.idlerc' out yet, because if it turns out that this
actually works, then we should take a closer look at the .idlerc
directory, and send it off to the bug hunters.

Good luck to you!


Re: [Tutor] Matrix

2004-12-20 Thread Danny Yoo

On Sun, 19 Dec 2004, Bugra Cakir wrote:

> I want to create a matrix in Python. For example 3x4 how can i
> create this? thanks

Hi Bugra,

Just for reference, here's the relevant FAQ about how to do matrices in
Python:

http://python.org/doc/faq/programming.html#how-do-i-create-a-multidimensional-list

If you're planning to do a lot of matrix-y stuff, you may want to look
into the 'numarray' third-party module:

http://www.stsci.edu/resources/software_hardware/numarray

The package adds powerful matrix operations to Python.
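
For the plain list-of-lists approach that the FAQ describes, here is a small
sketch of a 3x4 matrix in Python 3.  The names ROWS, COLS, matrix, and
aliased are mine.  The second half shows the pitfall the FAQ entry warns
about:

```python
ROWS, COLS = 3, 4

# Build each row separately so that the rows are independent lists.
matrix = [[0] * COLS for _ in range(ROWS)]
matrix[0][1] = 7

# Pitfall: this repeats one and the same row object three times,
# so a change to one "row" shows up in all of them.
aliased = [[0] * COLS] * ROWS
aliased[0][1] = 7

print(matrix)
print(aliased)
```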

Good luck to you!


Re: [Tutor] Python 2.4 IDLE Windows 2000

2004-12-20 Thread Danny Yoo

On Mon, 20 Dec 2004, Jacob S. wrote:

> 1) Delete .idlerc directory
> 2) Uninstall and reinstall python2.4

Hi Jacob,

Ok, so there was probably something really screwy with the stuff in
.idlerc.

Can you send me a tarball or zip of it in private email?  I just want to
make sure that the issue that you were hitting is the same as Mike's.  If
this is something new, we'd better make sure the IDLE developers know that
it's something other than a previous color configuration that is causing
IDLE to break.

Talk to you later!


Re: [Tutor] Comments appreciated

2004-12-27 Thread Danny Yoo

On Sun, 26 Dec 2004, Jacob S. wrote:

> > The only thing that's missing is that this script can't handle paths
> > like ~/dir/junkthis
>
> I believe you're looking for os.path.expanduser("~/dir/junkthis")
>
> BTW, trashcan IS a module level variable because it's defined at the module
> level. Why it says it's local is beyond me.

Hi Jacob,

Ah, you must be running into the global/local gotcha.  This one is
notorious.  Let's work with a few simplified examples to clear up the
situation.

For the sake of simplification, my variable names will suck.  *grin*

Anyway, here's the first example:

###
y = 0
def inc(x):
    y = x + 1
    return y
###

We know that when we assign to variables in functions, those variables are
locally scoped.  This is a fairly unique feature in Python: to assign to a
variable is the same thing as declaration in the Python language.

So in the example above, 'y' in the line:

y = x + 1

is a local variable, since it's assigned within the function.  And if we
play with this in the interactive interpreter, we'll see that calling
inc() doesn't touch the value of the module-level 'y' variable.

###
>>> y = 0
>>> def inc(x):
...     y = x + 1
...     return y
...
>>> inc(3)
4
>>> y
0
>>> y = 2
>>> inc(7)
8
>>> y
2
###

Hope that made sense so far: assignment in a function causes the assigned
variable to be treated as local to that function.

Now let's look at a slightly different example which highlights the
problem:

###
y = 42

def inc_y():   ## buggy
    y = y + 1
###

The issue is the one brought up by the first example: assignment causes a
variable to be local.  But if 'y' is local, then what does:

y = y + 1

mean, if 'y' hasn't been assigned to yet?

###
>>> y = 42
>>> def inc_y():
...     y = y + 1
...
>>> inc_y()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 2, in inc_y
UnboundLocalError: local variable 'y' referenced before assignment
###

And this is actually something that could --- and probably should! --- be
handled at "definition" time rather than at "run" time.  We really
shouldn't have to call inc_y() to know that something's already wrong with
it.

[Advanced: it'd be impossible to catch every such case at definition time,
since Python's behavior is so dynamic.  As a hideous example:

###
y = 42
def degenerate_case():
    if some_strange_runtime_condition_was_true():
        y = 42
    y = y + 1
###

Depending on the definition of some_strange_runtime_condition_was_true(),
we'd either die and get an UnboundLocalError, or the function would run to
completion.  But in this case, I would promptly yell at anyone who would
write code like this.  *grin*]

Anyway, to fix the problem, that is, to be able to write a function that
does an assignment to a global, we then have to declare the variable as
'global':

###
y = 42

def inc_y_fixed():
    global y ## fix for scoping issue
    y = y + 1
###

so that Python knows up front that 'y' in inc_y_fixed is meant to refer to
the module-level 'y'.
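
To see both behaviors side by side in one runnable script, here is a small
sketch for Python 3.  The names counter, inc_buggy, inc_fixed, and outcome
are made up for the demonstration:

```python
counter = 42

def inc_buggy():
    # Assigning to 'counter' makes it local to this function, so the
    # read on the right-hand side fails before anything is assigned.
    counter = counter + 1

def inc_fixed():
    global counter   # refer to the module-level name instead
    counter = counter + 1

try:
    inc_buggy()
    outcome = "no error"
except UnboundLocalError:
    outcome = "UnboundLocalError"

inc_fixed()
print(outcome, counter)
```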

All of this is a consequence of Python's approach to variable declaration.
For the common case, we get to save a lot of lines, since we don't have to
write things like

var x, y, z
int x;

at the head of every one of our functions.  But, consequently, we do hit
this slightly odd and ugly situation when we want to assign to
module-level variables.

For more information, you may want to look at Tim Peters's nice FAQTS
entry on this subject, linked here:

http://www.faqts.com/knowledge_base/view.phtml/aid/4722/fid/241

This is also in AMK's "Python Warts" page:

http://www.amk.ca/python/writing/warts.html

under the header "Local Variable Optimization".

If you have more questions, please feel free to ask.  Hope this helps!


Re: [Tutor] Comments appreciated

2004-12-27 Thread Danny Yoo



[Jacob]
> > BTW, trashcan IS a module level variable because it's defined at the
> > module level. Why it says it's local is beyond me.

[Danny]
> Ah, you must be running into the global/local gotcha.

[long rambling text cut]

Wait, wait.  Forget everything I said.  *grin* I should have read the
question more closely before going crazy with the explanation.  I think I
screwed up again.  Luis, please forgive me for not reading that more
closely.


Let me read Luis's message again... ok.  I think I understand where I got
confused.  Let's start over.  Let me write another rambling post.


Let's repeat the conversation, starting from Jeff's response:

[Jeff]
> Also, even though this is intended to be a quick shell script, it's
> not a bad idea to make everything except function defs into a little
> main() function, and call it in a script-only section.


[Luis]

> The only thing I'm not clear about is how 'trashcan' can be a
> local variable inside main() when it's required by both trash() and
> can()


What Jeff is trying to say is that it's possible to pass 'trashcan' around
as yet another parameter to both trash() and can().  That is, we can avoid
global variables altogether, and just work with parameter passing.


As a concrete example, the following program:

###
password = "elbereth"

def guess():
    msg = raw_input("try to guess: ")
    if msg == password:
        print "You got it!"
    else:
        print "nope"

def main():
    guess()

if __name__ == '__main__':
    main()
###


can be transformed so that it uses no global variables.  We can do this by
making guess() take in 'password' as a parameter too:

###
def guess(password):
    msg = raw_input("try to guess: ")
    if msg == password:
        print "You got it!"
    else:
        print "nope"

def main():
    pword = 'elbereth'
    guess(pword)

if __name__ == '__main__':
    main()
###

No more globals.  The password is now explicitly passed from main() to
guess().


An advantage here is that this guess() function is less tied to outside
global resources, and is, in theory, more easily reused.  If we wanted to
rewrite the program so that we take three different passwords, the version
without global variables is easy to write:

guess("elbereth")
guess("open sesame")
guess("42")

But the one with the global variable use is something much uglier on our
hands:

password = "elbereth"
guess()
password = "open sesame"
guess()
password = "42"
guess()


Anyway, sorry about the confusion.  I have to read threads more carefully!


Best of wishes!


Re: [Tutor] Parsing a block of XML text

2004-12-31 Thread Danny Yoo

On Fri, 31 Dec 2004, kumar s wrote:

> I am trying to parse BLAST output (Basic Local Alignment Search Tool,
> size around more than 250 KB ).

[xml text cut]

Hi Kumar,

Just as a side note: have you looked at Biopython yet?

http://biopython.org/

I mention this because Biopython comes with parsers for BLAST; it's
possible that you may not even need to touch XML parsing if the BLAST
parsers in Biopython are sufficiently good.  Other people have already
solved the parsing problem for BLAST: you may be able to take advantage of
that work.

> I wanted to parse out :
>
>
>   

Ok, I see that you are trying to get the content of the High Scoring Pair
(HSP) query and hit coordinates.

> I wrote a ver small 4 line code to obtain it.
>
> for bls in doc.getElementsByTagName('Hsp_num'):
>   bls.normalize()
>   if bls.firstChild.data >1:
>   print bls.firstChild.data

This might not work.  'bls.firstChild.data' is a string, not a number, so
the expression:

bls.firstChild.data > 1

is most likely buggy.  Here, try using this function to get the text out
of an element:

###
def get_text(node):
    """Returns the child text contents of the node."""
    buffer = []
    for c in node.childNodes:
        if c.nodeType == c.TEXT_NODE:
            buffer.append(c.data)
    return ''.join(buffer)
###

(code adapted from: http://www.python.org/doc/lib/dom-example.html)

For example:

###
>>> doc = xml.dom.minidom.parseString("<a><b>hello</b><b>world</b></a>")
>>> for bnode in doc.getElementsByTagName('b'):
...     print "I see:", get_text(bnode)
...
I see: hello
I see: world
###
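
Coming back to the string-versus-number comparison: the fix is to convert the
text to an integer before comparing.  In Python 2, comparing a string with 1
quietly gives a meaningless answer; in Python 3 it raises a TypeError.  Here
is a sketch in Python 3.  The XML is made-up sample data, not Kumar's BLAST
output, and big_numbers is just a name I picked:

```python
import xml.dom.minidom

def get_text(node):
    """Returns the child text contents of the node."""
    return "".join(child.data for child in node.childNodes
                   if child.nodeType == child.TEXT_NODE)

# Made-up sample data with two Hsp_num elements.
doc = xml.dom.minidom.parseString(
    "<Hit_hsps><Hsp><Hsp_num>1</Hsp_num></Hsp>"
    "<Hsp><Hsp_num>2</Hsp_num></Hsp></Hit_hsps>")

big_numbers = []
for node in doc.getElementsByTagName("Hsp_num"):
    number = int(get_text(node))   # convert before comparing
    if number > 1:
        big_numbers.append(number)
print(big_numbers)
```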

> Could any one help me directing how to get the elements in that tag.

One way to approach structured parsing problems systematically is to write
a function for each particular element type that you're trying to parse.

From the sample XML that you've shown us, it appears that your document
consists of a single 'Hit' root node.  Each 'Hit' appears to have a
'Hit_hsps' element.  A 'Hit_hsps' element can have several 'Hsp's
associated to it.  And a 'Hsp' element contains those coordinates that you
are interested in.

More formally, we can structure our parsing code to match the structure
of the data:

### pseudocode ###
def parse_Hit(node):
    ## get at the Hit_hsps element, and call parse_Hit_hsps() on it.

def parse_Hit_hsps(node):
    ## get all of the Hsp elements, and call parse_Hsp() on each one of
    ## them.

def parse_Hsp(node):
    ## extract the query and hit coordinates out of the node.
###

To see another example of this kind of program structure, see:

http://www.python.org/doc/lib/dom-example.html
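
Here is one way the function-per-element structure might look as running
code, in Python 3.  Treat it as a sketch under assumptions: SAMPLE is
invented data shaped like the Hit / Hit_hsps / Hsp nesting described above,
the helpers child_elements() and text_of() are my own, and it only pulls out
the query coordinates:

```python
import xml.dom.minidom

SAMPLE = """<Hit>
  <Hit_hsps>
    <Hsp><Hsp_query-from>1</Hsp_query-from><Hsp_query-to>587</Hsp_query-to></Hsp>
    <Hsp><Hsp_query-from>10</Hsp_query-from><Hsp_query-to>90</Hsp_query-to></Hsp>
  </Hit_hsps>
</Hit>"""

def child_elements(node, tag_name):
    """Returns the immediate child elements of node named tag_name."""
    return [child for child in node.childNodes
            if child.nodeType == child.ELEMENT_NODE
            and child.tagName == tag_name]

def text_of(node):
    """Returns the text directly inside node."""
    return "".join(child.data for child in node.childNodes
                   if child.nodeType == child.TEXT_NODE)

def parse_Hit(hit_node):
    """Returns the (query_from, query_to) pairs of every Hsp in a Hit."""
    pairs = []
    for hsps_node in child_elements(hit_node, "Hit_hsps"):
        pairs.extend(parse_Hit_hsps(hsps_node))
    return pairs

def parse_Hit_hsps(hsps_node):
    """Parses each Hsp element under a Hit_hsps element."""
    return [parse_Hsp(hsp) for hsp in child_elements(hsps_node, "Hsp")]

def parse_Hsp(hsp_node):
    """Extracts the query coordinates of one Hsp as integers."""
    query_from = text_of(child_elements(hsp_node, "Hsp_query-from")[0])
    query_to = text_of(child_elements(hsp_node, "Hsp_query-to")[0])
    return int(query_from), int(query_to)

doc = xml.dom.minidom.parseString(SAMPLE)
coordinates = parse_Hit(doc.documentElement)
print(coordinates)
```

Each function only knows about its own element type and hands the next level
down to the function responsible for it.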

Please feel free to ask more questions.  Good luck to you.


Re: [Tutor] Parsing a block of XML text

2005-01-01 Thread Danny Yoo

On Fri, 31 Dec 2004, kumar s wrote:

> http://www.python.org/doc/lib/dom-example.html
>
> Frankly it looked more complex. could I request you to explain your
> pseudocode. It is confusing when you say call a function within another
> function.

Hi Kumar,

A question, though: can you try to explain what part feels weird about
having a function call another function?

Is it something specific to XML processing, or a more general problem?
That is, do you already feel comfortable with writing and using "helper"
functions?

If you're feeling uncomfortable with the idea of functions calling
functions, then that's something we should probably concentrate on,
because it's really crucial to use this technique, especially on
structured data like XML.

As a concrete toy example of a function that calls another function, we
can use the overused hypotenuse function.  Given right triangle leg
lengths 'a' and 'b', this function returns the length of the hypotenuse:

###
def hypotenuse(a, b):
    return (a**2 + b**2)**0.5
###

This definition works, but we can use helper functions to make the
hypotenuse function a little bit more like English:

###
def sqrt(x):
    return x ** 0.5

def square(x):
    return x * x

def hypotenuse(a, b):
    return sqrt(square(a) + square(b))
###

In this variation, the rewritten hypotenuse() function uses the other two
functions as "helpers".  The key idea is that the functions that we write
can then be used by anything that needs them.

Another thing that happens is that hypotenuse() doesn't have to know how
sqrt() and square()  are defined: it just depends on the fact that sqrt()
and square() are out there, and it can just use these as tools.  Computer
scientists call this "abstraction".
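
A quick sketch of what that buys us, in Python 3: we can swap in a different
sqrt() and hypotenuse() keeps working without being touched.  Using
math.sqrt as the replacement is my own choice for the demonstration:

```python
import math

def sqrt(x):
    return x ** 0.5

def square(x):
    return x * x

def hypotenuse(a, b):
    return sqrt(square(a) + square(b))

before = hypotenuse(3, 4)

# Swap in a different sqrt(); hypotenuse() itself is unchanged.
sqrt = math.sqrt
after = hypotenuse(3, 4)
print(before, after)
```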

Here is another example of another "helper" function that comes in handy
when we do XML parsing:

###
def get_children(node, tagName):
    """Returns the children elements of the node that have this particular
    tagName.  This is different from getElementsByTagName() because we
    only look shallowly at the immediate children of the given node."""
    children = []
    for n in node.childNodes:
        if n.nodeType == n.ELEMENT_NODE and n.tagName == tagName:
            children.append(n)
    return children
###

For example:

###
>>> import xml.dom.minidom
>>> dom = xml.dom.minidom.parseString("<p><a>hello</a><a>world</a></p>")
>>>
>>> dom.firstChild
<DOM Element: p at 0x...>
>>>
>>> get_children(dom.firstChild, "a")
[<DOM Element: a at 0x...>, <DOM Element: a at 0x...>]
###

> It is confusing when you say call a function within another function.

Here's a particular example that uses this get_children() function and
that get_text() function that we used in the earlier part of this thread.

###
def parse_Hsp(hsp_node):
    """Prints out the query-from and query-to of an Hsp node."""
    query_from = get_text(get_children(hsp_node, "Hsp_query-from")[0])
    query_to = get_text(get_children(hsp_node, "Hsp_query-to")[0])
    print query_from
    print query_to
###

This function only knows how to deal with Hsp_node elements.  As soon as
we can dive through our DOM tree into an Hsp element, we should be able to
extract the data we need.  Does this definition of parse_Hsp() make sense?

You're not going to be able to use it immediately for your real problem
yet, but you can try it on a sample subset of your XML data:

###
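sampleData = """
<Hsp>
  <Hsp_num>1</Hsp_num>
  <Hsp_bit-score>1164.13</Hsp_bit-score>
  <Hsp_score>587</Hsp_score>
  <Hsp_evalue>0</Hsp_evalue>
  <Hsp_query-from>1</Hsp_query-from>
  <Hsp_query-to>587</Hsp_query-to>
</Hsp>
"""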
doc = xml.dom.minidom.parseString(sampleData)
parse_Hsp(doc.firstChild)
###

to see how it works so far.

This example tries to show the power of being able to call helper
functions.  If we were to try to write this all using DOM primitives, the
end result would look too ugly for words.  But let's see it anyway.
*grin*

###
def parse_Hsp(hsp_node):  ## without using helper functions:
    """Prints out the query-from and query-to of an Hsp node."""
    query_from, query_to = "", ""
    for child in hsp_node.childNodes:
        if (child.nodeType == child.ELEMENT_NODE and
            child.tagName == "Hsp_query-from"):
            for n in child.childNodes:
                if n.nodeType == n.TEXT_NODE:
                    query_from += n.data
        if (child.nodeType == child.ELEMENT_NODE and
            child.tagName == "Hsp_query-to"):
            for n in child.childNodes:
                if n.nodeType == n.TEXT_NODE:
                    query_to += n.data
    print query_from
    print query_to
###

This is exactly the kind of code we want to avoid.  It works, but it's so
fragile and hard to read that I just don't trust it.  It just burns my
eyes.  *grin*

By using "helper" functions, we're extending Python's vocabulary of
commands.  We can then use those functions to help solve our problem with
less silliness.  This is a reason why knowing how to write and use
functions is key to learning how to program: this principle applies
regardless of what particular programming language we're using.

If you have questions on any of this, please feel free to ask.  Good luck!


Re: [Tutor] O.T.

2005-01-01 Thread Danny Yoo

On Thu, 30 Dec 2004, Anna Ravenscroft wrote:

> Anna Martelli Ravenscroft
> 42, 2 children (13 and 11) live with their dad
> Married this July to the martelli-bot (we read The Zen of Python at our
> wedding!). We currently live in Bologna, Italy.

Hi Anna,

Congratulations!  Say hi to Alex for me; it was a pleasure to see him a
few days ago in Palo Alto.

I'm 25 years old.  I don't have any children yet.  I'm currently working
at the Carnegie Institution of Washington, and appear somewhere on the
staff page here:

http://carnegiedpb.stanford.edu/dpb/dpb.php

*grin*

I'm currently planning to run away from work, and attend graduate school
soon.  But I'll try not to disappear from Python-Tutor; it's been too much
fun for me to give it up.


Re: [Tutor] How to put my functions in an array

2005-01-01 Thread Danny Yoo

On Sat, 1 Jan 2005, Jacob S. wrote:

> funct = {'Add Virt':addvirt,'Remove Virt':remvirt,'More
>   Stuff':more,"Extras":extra}
> def addvirt():
> pass
> def remvirt():
> pass
> def more():
> pass

Hi Jacob,

Quick gotcha note: the definition of the 'funct' dictionary has to go
after the definition of the handler functions: otherwise, Python has no clue
what 'addvirt' and 'remvirt' are.  Don't worry: it happens to me too.
*grin*
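
A small sketch of the working order, in Python 3.  The return values are
placeholders I made up so that there is something to look at; the real
handlers would do actual work:

```python
def addvirt():
    return "added"      # placeholder body

def remvirt():
    return "removed"    # placeholder body

# The dictionary comes after the defs, so both names already exist.
funct = {"Add Virt": addvirt, "Remove Virt": remvirt}

choice = "Remove Virt"
result = funct[choice]()   # look up the function, then call it
print(result)
```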

Hope this helps!


Re: [Tutor] Am I storeing up problems ?

2005-01-02 Thread Danny Yoo

On Sun, 2 Jan 2005, Dave S wrote:

> My matrix 'self.data' consists of a list of 110 items, each item is a
> dictionary of 250 keys, each key holds two lists, one of four items, one
> of 12 items.

Hi Dave,

Hmmm... what kind of data is being copied here?  Python's data structures
are designed for flexibility, but sometimes that flexibility comes at the
cost of some space.  If we knew more about the data that's being stored,
maybe we could think of a more compact representation.

> I needed to copy this matrix to 'self.old_data', so I have been using
> .deepcopy(), which works OK but is SLOW  (12+ secs)

Perhaps we can avoid making a separate copy of the data?  Copying data
structures can often be avoided.  Why does your program try to copy the
data structure?

(Aside: one nonobvious example where copying can be avoided is in Conway's
Game of Life:  when we calculate what cells live and die in the next
generation, we can actually use the 'Command' design pattern to avoid
making a temporary copy of the world.  We can talk about this in more
detail if anyone is interested.)

> Once the matrix is copied, 'self.data' is re-constructed as the
> programme gathers data.
>
> To speed things up I changed my code to
>
> # This is the speeded up deepcopy()
> self.old_data = self.data
> self.data = []
> for i in range(110):
> self.data.append({})
>
>  Query: Is all this OK, it works and quick as well, but I am concerned I
> may be leaving garbage in the Python PVM since I am shrugging off old
> 'self.old_data's which may be mounting up ?

Whenever we reassign to self.old_data:

self.old_data = self.data

then whatever self.old_data was pointing at, before the assignment, should
get garbage collected, if there are no other references to the old data.

The code there isn't really doing any copying at all, but is rather just
constructing a whole new empty data structure into 'self.data'.  If so,
then it does seem that you can avoid copying.
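
Here is a tiny sketch, in Python 3, of what that rebinding does.  I'm using
plain variables instead of attributes on self, and a three-item list instead
of 110 items, just to keep it short:

```python
data = [{"key": [1, 2, 3]} for _ in range(3)]
original = data

old_data = data                 # just another name for the same list
data = [{} for _ in range(3)]   # fresh structure; the old one is untouched

same_object = old_data is original
print(same_object, data is old_data, old_data[0])
```

Nothing was copied: old_data is the very same list object that data used to
name, which is why this is so much faster than deepcopy().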

I'd love to know a little bit more about the data that the program is
collecting; if you have time, please feel free to tell us more details.
Good luck to you!


[Tutor] The Game of Life

2005-01-03 Thread Danny Yoo

On Mon, 3 Jan 2005, Brian van den Broek wrote:

> >> (Aside: one nonobvious example where copying can be avoided is in
> >> Conway's Game of Life:  when we calculate what cells live and die in
> >> the next generation, we can actually use the 'Command' design pattern
> >> to avoid making a temporary copy of the world.  We can talk about
> >> this in more detail if anyone is interested.)
> >
> Seconded. (Thanks for offering, Danny.)

[Long post up ahead.]

Just as a warning, none of what I'm going to code here is original at all:
I'm rehashing a main idea off of a paper called "Using the Game of Life to
Introduce Freshman Students to the Power and Elegance of Design Patterns":

http://portal.acm.org/citation.cfm?id=1035292.1028706

Just for reference, the Game of Life is described in Wikipedia here:

http://en.wikipedia.org/wiki/Conway%27s_Game_of_Life

Ok, let's start hacking.  The Game of Life involves a 2-D matrix of cells,
where any cell can be 'LIVE' or 'DEAD'.

###
LIVE, DEAD = '*', '.'
###

And just to make this quick-and-dirty, let's simulate this matrix with a
dictionary whose keys are 2-tuples (x, y), and whose values are either
LIVE or DEAD.  To reduce the complexity of the example, I'll even avoid
using classes or data abstraction.  *grin*

Here's our setup:

###
import random

def make_random_world(M, N):
    """Constructs a new random game world of size MxN."""
    world = {}
    for j in range(N):
        for i in range(M):
            world[i, j] = random.choice([LIVE, DEAD])
    world['dimensions'] = (M, N)
    return world

def print_world(world):
    """Prints out a string representation of a world."""
    M, N = world['dimensions']
    for j in range(N):
        for i in range(M):
            print world[i, j],
        print
###

(If I had more space, I'd rather do this as a real data structure instead
of hacking up a dictionary.)

For example:

###
>>> print_world(make_random_world(10, 10))
. * * * . . * * . *
. . * * . . * . * .
. * * . . * . * * *
* * . * * * * * . *
. . * . * * * * . .
* * . . . * * . * *
* . * . . * * * . .
* . * * * . . * . .
* * . . . * * . . .
. * . . * . . * * *
>>>
>>>
>>> small_world = make_random_world(4, 4)
>>> print_world(small_world)
* . . *
* * . .
. * * *
. . . .
###

For the example ahead, I'll use the small world.  Ok, this looks good so
far.

So now we have something that shows a single generation of a game world.
How does the world run?  Well, between each generation, each cell can
either live or die according to the rule:

A dead cell with exactly 3 live neighbors becomes alive (or is
"born").

A live cell with 2 or 3 live neighbors stays alive; otherwise it dies
(from "loneliness" or overcrowding).

For example, when we look back at the small world:

###
* . . *
* * . .
. * * *
. . . .
###

The cell at the upper left corner (0, 0) has 2 neighbors, so it'll
survive.  The cell at the upper right corner (3, 0) has no neighbors, so
it'll die.  The cell at the bottom left (0, 3) has one neighbor, so it
stays dead, but the cell near the bottom right (2, 3) has three neighbors,
so it'll spring to life.

Let's code these up.

###
def get_state(world, i, j):
    """Returns the state of the cell at position (i, j).  If we're out
    of bounds, just returns DEAD to make calculations easier."""
    return world.get((i, j), DEAD)

def count_live_neighbors(world, i, j):
    """Returns the number of live neighbors to this one."""
    live_count = 0
    for i_delta in [-1, 0, 1]:
        for j_delta in [-1, 0, 1]:
            if (i_delta, j_delta) == (0, 0):
                continue
            if get_state(world, i+i_delta, j+j_delta) == LIVE:
                live_count += 1
    return live_count

def is_born(world, i, j):
    """Returns True if the cell at (i, j) is currently dead, but will
    be born in the next generation."""
    return (get_state(world, i, j) == DEAD and
            count_live_neighbors(world, i, j) == 3)

def is_deceased(world, i, j):
    """Returns True if the cell at (i, j) is currently alive, but will
    die in the next generation."""
    return (get_state(world, i, j) == LIVE and
            not (2 <= count_live_neighbors(world, i, j) <= 3))
###

And again, let's make sure these rules work by just spot checking them:

###
>>> is_deceased(small_world, 0, 0)
False
>>> is_deceased(small_world, 3, 0)
True
>>> is_born(small_world, 0, 3)
False
>>> is_born(small_world, 2, 3)
True
###

Ok, at least it's not blatantly buggy.  *grin*

Now the problem is: given the current state of the world, how do we
calculate the next state of the world?  One way is to make a temporary
copy of the world, and apply changes to it, consulting the original
world for the rules:

###
def apply_next_generation(world):
    """Destructively mutate the world so it reflects the next
    generation."""
    M, N = world['dimensions']
    new_world = copy.copy(world)
    for j in range(N):
        for i in range(M):
            if is_born(world, i, j):

Re: [Tutor] regex problem

2005-01-04 Thread Danny Yoo

On Tue, 4 Jan 2005, Michael Powe wrote:

> def parseFile(inFile) :
>     import re
>     bSpace = re.compile("^ ")
>     multiSpace = re.compile(r"\s\s+")
>     nbsp = re.compile(r"&nbsp;")
>     HTMLRegEx = re.compile(
>         r"(<|&lt;)/?((!--.*--)|(STYLE.*STYLE)|(P|BR|b|STRONG))/?(>|&gt;)",
>         re.I)
>
>     f = open(inFile,"r")
>     lines = f.readlines()
>     newLines = []
>     for line in lines :
>         line = HTMLRegEx.sub(' ',line)
>         line = bSpace.sub('',line)
>         line = nbsp.sub(' ',line)
>         line = multiSpace.sub(' ',line)
>         newLines.append(line)
>     f.close()
>     return newLines
>
> Now, the main issue I'm looking at is with the multiSpace regex.  When
> applied, this removes some blank lines but not others.  I don't want it
> to remove any blank lines, just contiguous multiple spaces in a line.

Hi Michael,

Do you have an example of a file where this bug takes place?  As far as I
can tell, since the processing is being done line-by-line, the program
shouldn't be losing any blank lines at all.

Do you mean that the 'multiSpace' pattern is eating the line-terminating
newlines?  If you don't want it to do this, you can modify the pattern
slightly.  '\s' is defined to be this group of characters:

'[ \t\n\r\f\v]'

(from http://www.python.org/doc/lib/re-syntax.html)

So we can adjust our pattern from:

r"\s\s+"

to

r"[ \t\f\v][ \t\f\v]+"

so that we don't capture newlines or carriage returns.  Regular
expressions also have a brace operator for dealing with repetition: if
we're looking for two or more of some thing 'x', we can say:

x{2,}

Another approach is to always rstrip() the newlines off, do the regex
processing, and then put them back in at the end.
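
Here is a minimal sketch of both ideas, written for a current Python 3
interpreter.  The function names squeeze_spaces and squeeze_spaces_rstrip
are made up for illustration; they are not part of Michael's program.

```python
import re

# Horizontal whitespace only: '\n' and '\r' are deliberately left out,
# so line endings survive the substitution.
multi_space = re.compile(r"[ \t\f\v]{2,}")

def squeeze_spaces(line):
    """Collapse runs of two or more horizontal whitespace characters."""
    return multi_space.sub(" ", line)

def squeeze_spaces_rstrip(line):
    """Alternative: strip the line ending, squeeze, then put it back."""
    body = line.rstrip("\r\n")
    ending = line[len(body):]
    return re.sub(r"\s{2,}", " ", body) + ending

print(repr(squeeze_spaces("a   b\t\tc  \n")))
print(repr(squeeze_spaces("\n")))
print(repr(squeeze_spaces_rstrip("x    y\r\n")))
```

With the original r"\s\s+" pattern, a line that ends in a space followed
by a newline loses its newline, which is one way lines can appear to be
merged or dropped.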

There are some assumptions that the program makes about the HTML that you
might need to be careful of.  What does the program do if we pass it the
following string?

###
from StringIO import StringIO
sampleFile = """
hello world!
"""
###

Issues like these are already considered in the HTML parser modules in the
Standard Library, so if you can use HTMLParser, I'd strongly recommend it.
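
As a rough sketch of what that could look like: the class below collects
only the text between tags.  TextExtractor and strip_tags are hypothetical
names, and the sample markup is my own; note that this simple version
would also keep the contents of STYLE or SCRIPT elements as text.

```python
try:
    from html.parser import HTMLParser   # Python 3
except ImportError:
    from HTMLParser import HTMLParser    # Python 2

class TextExtractor(HTMLParser):
    """Collects the text between tags and ignores the tags themselves."""
    def __init__(self):
        HTMLParser.__init__(self)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

def strip_tags(html):
    """Returns the text content of an HTML fragment."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)

# A tag split across two lines, which a line-by-line regex would miss.
print(repr(strip_tags('<p\nalign="center">hello <b>world</b>!</p>\n')))
```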

Good luck to you!

___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor

Re: [Tutor] The Game of Life

2005-01-05 Thread Danny Yoo



> > Just as a warning, none of what I'm going to code here is original at
> > all: I'm rehashing a main idea off of a paper called "Using the Game
> > of Life to Introduce Freshman Students to the Power and Elegance of
> > Design Patterns":
> >
> > http://portal.acm.org/citation.cfm?id=1035292.1028706
>
> Uh, with all respect Danny, this is a nice example of the Command
> pattern, but to me this example would better be called "Using the Power
> and Elegance of Design Patterns to Complicate the Game of Life".


Hi Kent,

Yeah, in retrospect, I see that the design patterns there might be
overkill.


There seems to be a fashionable push to introduce patterns early on in
computer science education, perhaps because they are easy to put in as
test questions.

But despite this, I do think that there are some patterns that are worth
seeing, even if they are in unrealistic toy situations.  I got a kick out
of seeing how 'Command' was applied in the Life example, because it shows
that we can store active actions as data.


> Why not just build a new world with the values of the next generation in
> it, and return that from apply_next_generation?

That also works, but it doesn't fit the function's description.  The
example that I adapted originally wanted a function that mutated the
previous generation, so that's what I stuck with.


> The copying is not needed at all and the command pattern is introduced
> to solve a non-problem.

[code cut]

> The caller would have to change slightly to accept the new world
> returned from the function, but there is no unnecessary copying.


Sure.  I didn't think of this before, but the two approaches can reflect
different ways that people can represent data that changes through time:

o.  Making changes on a "scratch-space" copy of the data.

o.  Maintaining a "diff" change log.

The approach with the Command pattern seems close to a diff-oriented view
of applying changes to a model.
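
To make the two views concrete, here is a small sketch of each.  It
assumes the same dictionary-based world as before; next_state,
next_world, collect_changes and apply_changes are illustrative names,
not the code from the paper or from Kent's post.

```python
LIVE, DEAD = '*', '.'

def count_live_neighbors(world, i, j):
    """Counts live cells among the eight neighbors of (i, j)."""
    count = 0
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if (di, dj) == (0, 0):
                continue
            if world.get((i + di, j + dj), DEAD) == LIVE:
                count += 1
    return count

def next_state(world, i, j):
    """Applies the birth/survival rule to a single cell."""
    n = count_live_neighbors(world, i, j)
    if world[i, j] == LIVE:
        return LIVE if n in (2, 3) else DEAD
    return LIVE if n == 3 else DEAD

def next_world(world):
    """Scratch-space style: build and return a whole new world."""
    M, N = world['dimensions']
    new_world = {'dimensions': (M, N)}
    for j in range(N):
        for i in range(M):
            new_world[i, j] = next_state(world, i, j)
    return new_world

def collect_changes(world):
    """Diff style: list only the cells whose state will change."""
    M, N = world['dimensions']
    changes = []
    for j in range(N):
        for i in range(M):
            state = next_state(world, i, j)
            if state != world[i, j]:
                changes.append(((i, j), state))
    return changes

def apply_changes(world, changes):
    """Mutates the world in place by replaying the change log."""
    for position, state in changes:
        world[position] = state

# The 4x4 small world from the earlier message.
rows = ["*..*", "**..", ".***", "...."]
small_world = {'dimensions': (4, 4)}
for j, row in enumerate(rows):
    for i, cell in enumerate(row):
        small_world[i, j] = cell

replayed = dict(small_world)
apply_changes(replayed, collect_changes(small_world))
print(replayed == next_world(small_world))
```

The change log has to be collected in full before any of it is applied;
otherwise later cells would be judged against a half-updated world.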


If the Life example sucked, don't blame me too badly: I'm just the
translator.  *grin*


Talk to you later!


Re: [Tutor] The Game of Life question

2005-01-05 Thread Danny Yoo

On Wed, 5 Jan 2005, Kooser, Ara S wrote:

>They picked a project to model the flow of smallpox in a city and
> surroundings areas. So I saw the game of life and thought maybe they
> could modify it for use as a smallpox model.

Hi Ara,

Oh!  My apologies for not posting the file as a complete example!  Here,
let me just link it so that you don't have to type it or cut-and-paste it
all out manually:

http://hkn.eecs.berkeley.edu/~dyoo/python/life.py

Please note that I wrote that code in a hacking session, so it might not
make complete sense!  *grin* I would hate to inflict it on students
without having the code cleaned up a bit first.

I'm not sure how easily my Life example can be adapted to simulate disease
transmission.  Let me do some research... Ok, there's a paper on
epidemiology and cellular automata here:

http://arxiv.org/abs/nlin.CG/0403035

Someone appears to have written a disease model in StarLogo here:

http://biology.wsc.ma.edu/biology/courses/concepts/labs/epidemiology/

where they refer to a program called HBepidemic; it might be interesting
to see if we can get the source code to HBepidemic and translate it into
Python.

Whoa!  If you have Mathematica, take a look at:

http://www.brynmawr.edu/Acads/Chem/Chem321mf/index521.html

Workshop 6 on that page appears to have a cellular automaton example that
models an epidemic, and sounds really interesting!  Their worksheet also
has the full Mathematica code needed to get the model working.  I'm a
Mathematica newbie, but if you want, I think I can figure out how their
code works.

Best of wishes to you!


Re: [Tutor] automatically finding site-packages and python2.3 in a linux machine

2005-01-06 Thread Danny Yoo

On Thu, 6 Jan 2005, Fred Lionetti wrote:

> I'm working on creating an installer for my program using install
> shield, and I'd like to know how one can automatically determine if
> Python 2.3 is installed on a linux machine, and where site-packages is
> located (so that I can install my own files there).  For my Windows
> version I was able to search for the python2.3 entry in the windows
> registry, but I don't know how do the equivalent from linux.  Any ideas?

Hi Fred,

Yes, there are some undocumented functions in the Distutils package that
you can use to find where 'site-packages' lives.

Let me check... ah, ok, the function that you're probably looking for is
distutils.sysconfig.get_python_lib().  For example:

###
>>> import distutils.sysconfig
>>> distutils.sysconfig.get_python_lib()
'/usr/lib/python2.3/site-packages'
###

Here is its signature and docstring:

def get_python_lib(plat_specific=0, standard_lib=0, prefix=None):
    """Return the directory containing the Python library (standard or
    site additions).

    If 'plat_specific' is true, return the directory containing
    platform-specific modules, i.e. any module from a non-pure-Python
    module distribution; otherwise, return the platform-shared library
    directory.  If 'standard_lib' is true, return the directory
    containing standard Python library modules; otherwise, return the
    directory for site-specific modules.

    If 'prefix' is supplied, use it instead of sys.prefix or
    sys.exec_prefix -- i.e., ignore 'plat_specific'.
    """

So you can use Python itself to introspect where the libraries should
live.  I hope this helps!
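
For readers on a much newer interpreter: Distutils was removed from the
standard library in Python 3.12, and the separate 'sysconfig' module
answers the same question.  A minimal sketch, assuming Python 3 (this
module did not exist when the thread was written):

```python
import sysconfig

# 'purelib' is the directory for pure-Python site packages;
# 'platlib' is the one for packages with compiled extensions.
paths = sysconfig.get_paths()
print(paths["purelib"])
print(paths["platlib"])
```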


Re: [Tutor] replacement for constants from other languages in Python?

2005-01-06 Thread Danny Yoo

On Thu, 6 Jan 2005, Alan Gauld wrote:

> > I'm _very_ used to using C style constants (preprocessor #define
> > directives) or C++ const keyword style, for a variety of reasons.
> >
> > I've yet to see anything covering 'how to work around the lack of
> > constants in Python'...can anyone point me in the right direction
> > here?

Hi Scott,

There are a few recipes in the Python Cookbook that mention how to get a
"const" mechanism in Python:

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/65207
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/197965

But I have to admit that I don't use this approach myself; I've been using
the uppercase convention, and it seems to work ok.
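
To show the difference between the two styles, here is a small sketch.
The ConstNamespace class is a hypothetical helper written in the spirit
of such recipes; it is not copied from either of them.

```python
# Convention only: uppercase names signal "please don't rebind me".
MAX_RETRIES = 3

class ConstNamespace(object):
    """Hypothetical helper: attributes can be bound once, never rebound."""

    class ConstError(TypeError):
        pass

    def __setattr__(self, name, value):
        if name in self.__dict__:
            raise self.ConstError("can't rebind constant %r" % name)
        self.__dict__[name] = value

    def __delattr__(self, name):
        raise self.ConstError("can't delete constant %r" % name)

const = ConstNamespace()
const.PI = 3.14159

try:
    const.PI = 3          # second binding is refused
except ConstNamespace.ConstError as err:
    print("refused: %s" % err)
```

The uppercase convention costs nothing and relies on readers; the class
enforces the rule at run time, at the price of an extra 'const.' prefix.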

Good luck to you!


Re: [Tutor] automatically finding site-packages and python2.3 in a linux machine

2005-01-06 Thread Danny Yoo



> > I'm working on creating an installer for my program using install
> > shield, and I'd like to know how one can automatically determine if
> > Python 2.3 is installed on a linux machine

Hi Fred,

Sorry about ignoring parts of your question!  Unix has default places for
putting binaries like Python.  Check the directories '/usr/bin/' and
'/usr/local/bin'.  The 'which' command will also tell us where
Python is, if it's in the user's PATH:

###
[EMAIL PROTECTED] dyoo]$ which python
/usr/bin/python
###
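
The same PATH lookup can be done from Python itself.  This sketch assumes
Python 3.3 or later, where shutil.which() exists; find_python and its
candidate names are made up for illustration.

```python
import shutil

def find_python(candidates=("python3", "python")):
    """Return the first interpreter found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    return None

print(find_python())
```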



> > and where site-packages is located (so that I can install my own files
> > there).  For my Windows version I was able to search for the python2.3
> > entry in the windows registry, but I don't know how do the equivalent
> > from linux.  Any ideas?
>
> Yes, there are some undocumented functions in the Distutils package that
> you can use to find where 'site-packages' lives.

I'm totally wrong about this.  It IS documented.  *grin* Here's a link to
the official documentation:

http://www.python.org/doc/dist/module-distutils.sysconfig.html

Sorry about that; I had expected to find it in the Library Reference, but
the Distutils stuff has its own separate documentation.

Best of wishes to you!


Re: [Tutor] trouble getting a list to update

2005-01-06 Thread Danny Yoo

On Thu, 6 Jan 2005, Vincent Wan wrote:

> I wrote a program to repeatedly:
>   a: print a list of integers eg 0 0 0 0 0 0 0 0 0 0
>   b: change each integer in the list to 1 with a .05% chance
>
> I run the program and over itterations more 1's appear as they should.
>
> However, the changes printed in step a don't seem to be the same as
> those made in step b (as shown by the debug code.

Hi Vincent,

Can you show us a snippet of the file output?  I'm not immediately seeing
anything particular with your debugging output statements:

myfile.write('base ' + str(current_base) + ' mutated to ')
myfile.write(str(a_genome[current_base]) + '\n')

Like the computer, I don't yet understand what the problem is.  *grin*

If you can point us at the output that looks weird to you, and tell us
what you expected to see instead and why, that will help us a lot to
better understand the situation.

Best of wishes!


Re: [Tutor] trouble getting a list to update

2005-01-06 Thread Danny Yoo

On Thu, 6 Jan 2005, Vincent Wan wrote:

> On Jan 6, 2005, at 12:59 PM, Danny Yoo wrote:
> > Can you show us a snippet of the file output?  I'm not immediately
> > seeing anything particular with your debugging output statements:
>
> > Like the computer, I don't yet understand what the problem is.  *grin*
>
> > If you can point us at the output that looks weird to you, and tell us
> > what you expected to see instead and why, that will help us a lot to
> > better understand the situation.

Hi Vincent,

Ok, now I understand and see the bug.  write_list() contains the problem:

###
def write_list(list_to_write, file_name):
    "Writes elements of a list, separated by spaces, to a file"
    for each in list_to_write:
        file_name.write(str(list_to_write[each]) + ' ')
    file_name.write('\n')
###

The bug is here:

 file_name.write(str(list_to_write[each]) + ' ')

Do you see why this is buggy?  'each' is not an index into list_to_write,
but is itself an element of list_to_write.

You probably meant to write:

 file_name.write(str(each) + ' ')

Your list was mutating just fine: the output function simply was obscuring
what was happening.
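
A small demonstration of why the output looked wrong, written for
Python 3 and using an in-memory file.  The sample genome is made up;
because every element is 0 or 1, the buggy version only ever prints
genome[0] or genome[1].

```python
import io

def write_list_buggy(list_to_write, out):
    for each in list_to_write:
        # 'each' is an element, but here it is misused as an index
        out.write(str(list_to_write[each]) + ' ')
    out.write('\n')

def write_list(list_to_write, out):
    for each in list_to_write:
        out.write(str(each) + ' ')
    out.write('\n')

genome = [1, 0, 0, 0, 0]

buggy, fixed = io.StringIO(), io.StringIO()
write_list_buggy(genome, buggy)
write_list(genome, fixed)
print("buggy: " + buggy.getvalue())
print("fixed: " + fixed.getvalue())
```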

Best of wishes to you!


Re: [Tutor] Problem with a variable in a CGI program

2005-01-07 Thread Danny Yoo

On Fri, 7 Jan 2005, Mark Kels wrote:

> I started to read the following code (I will start working on it when
> this problem is fixed) and it looks OK while I read it. But for some
> reason it doesn't work...

Hi Mark,

Ok, let's take a look at the error message:

> Traceback (most recent call last):
>   File "C:\maillist.py", line 80, in ?
>     useMailList.writelines(memMailList) # Copy internal mailing list to file
>
> NameError: name 'memMailList' is not defined

Ok, good, so we know what we should look for: we should see where
'memMailList' is defined, and see if there's any way that the program flow
can route around the definition.

Ok, I see it, embedded within the context of a try/except block.

##
try:

    ## some code before definition of memMailList...

    memMailList=useMailList.readlines()

    ## some more code cut...

finally:
    useMailList=open("maillist.txt", "w")
    useMailList.writelines(memMailList)
    useMailList.close
    print useFoot.read()
    useHead.close;
    useFoot.close;
##

There's one possibility that strikes me: it's very possible that the
code before the definition of memMailList raises an exception, or
exits prematurely through a sys.exit().  If that happens, then Python
will finish things up by jumping into the 'finally' block.  And if that
happens, then we haven't reached memMailList's definition, and the
NameError is inevitable.

Here's a simplified example of what might be happening:

###
>>> try:
...     a = 42
...     5 / 0
...     b = 17
... finally:
...     print "a is", a
...     print "b is", b
...
a is 42
b is
Traceback (most recent call last):
  File "<stdin>", line 7, in ?
NameError: name 'b' is not defined
###

A direct way to debug this is to drop the 'finally' block altogether.  If
we do so, like this:

##
## some code before definition of memMailList...

memMailList=useMailList.readlines()

## some more code cut...

useMailList=open("maillist.txt", "w")
useMailList.writelines(memMailList)
useMailList.close
print useFoot.read()
useHead.close;
useFoot.close;
##

then the real cause of the problem should pop up quickly.

A few more comments:

1.  The calls to close() need parentheses.  Unlike some other programming
languages, function calls fire off only with parens.  Instead of:

useFoot.close

you probably want:

useFoot.close()

2.  I also see that the code is hardcoded to use Python 1.5.  Is there any
way you can use a more modern version of Python?  The reason I ask is that
recent versions of Python have really good modules for debugging CGI
programs, including the excellent 'cgitb' module:

http://www.python.org/doc/current/lib/module-cgitb.html

Writing CGI programs without 'cgitb' is almost criminal.  *grin*  If you
can, use it.

3.  The program is long.  Can you break it down into some functions?  The
program looks like one big monolith, and I'm actually quite terrified of
it.  *grin*

Small functions are much easier to debug, and more approachable for
maintainers of the code.  This is a style critique, but it does matter
because although computers don't care what code looks like, human
programmers do.

4.  Related to Comment Three is: there are way too many comments on the
side margin.  Comments like:

###
if email != email2:                      # Checks to see if two entered
                                         # email addresses match
    print "Email addresses do not match" # If no match print error
    sys.exit(0)                          # Exit script with error 0
elif password != password2:              # Checks to see if two
                                         # entered passwords match
    print "Passwords do not match"       # If no match print error
    sys.exit(0)                          # Exit script with error 0
###

are redundant, since they basically parrot what the code is doing.

Concise comments that explain the reason why the code is doing what it's
doing can be more useful.  Comments are meant to show us authorial
intention: if we were interested in what the code was actually doing, we'd
just read the code.  *grin*

Something like:

##

## Check for validated emails and passwords, and exit if either is
## invalid.
if email != email2:
    print "Email addresses do not match"
    sys.exit(0)
elif password != password2:
    print "Passwords do not match"
    sys.exit(0)
##

is probably sufficient.

Even better would be to break this out as a function:

###
def validateEmailsAndPasswords(email, email2, password, password2):
    """Check for validated emails and passwords, and exit if either is
       invalid."""
    if email != email2:
        print "Email addresses do not match"
        sys.exit(0)
    elif password != password2:
        print "Passwords do not match"
        sys.exit(0)
###

but I guess we should take this one step at a time.
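
Going back to comment 1 for a moment, here is a tiny sketch of what
happens when the parentheses are left off a close().  It uses a throwaway
temporary file; the variable names are only for illustration.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)

f = open(path, "w")
f.close                  # only looks up the method; nothing is called
closed_without_parens = f.closed
f.close()                # this actually calls it
closed_with_parens = f.closed

print(closed_without_parens)
print(closed_with_parens)
os.remove(path)
```

The bare attribute lookup is legal Python, so no error is raised; the
file simply stays open.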

If you have any questions, please feel free to ask.  I hope this helps!


Re: [Tutor] Problem with a variable in a CGI program

2005-01-07 Thread Danny Yoo



> But now I have much weirder problem...
> I got this error:
>
> C:\>maillist.py
>   File "C:\maillist.py", line 84
>
>^
> SyntaxError: invalid syntax
>
> And the weird thing is that the program has only 83 lines... For some
> reason python thinks that I have a ^ on the 84 code line. Whats wrong ??



Hi Mark,

[meta: in your email reply, please just cut-and-paste the part that you're
responding to.]


Python error messages can put a caret symbol directly under the portion of
the code that looks weird, as a visual aid to the programmer.  For
example, if we have this program:

###
[EMAIL PROTECTED] dyoo]$ cat foo.py
  print foobar
###


then if we try to execute it, it'll break because the code is incorrectly
indented:

###
[EMAIL PROTECTED] dyoo]$ python foo.py
  File "foo.py", line 1
    print foobar
    ^
SyntaxError: invalid syntax
###

And here, the caret is meant for us to see that things start going wrong
at the 'print' statement.




The error message that you're seeing,

> C:\>maillist.py
>   File "C:\maillist.py", line 84
>
>^
> SyntaxError: invalid syntax

is also trying to say that line 84 of the program has strange syntax
that's inconsistent with the rest of the program.  Most likely, it's an
indentation error.  Check line 84 and see if it really is just an empty
line.



> BTW, I didn't wrote this program. I downloaded it (you can find the URL
> in my previous message) and started to read it so I could make some
> changes in it...

Hmmm.  The code does need a revision.

Maybe we can fix it up and send it back to the original authors, so that
no one else has to encounter the bugs and problems that you're running
into.  *grin*


Re: [Tutor] XP and python 2.4, some progress

2005-01-09 Thread Danny Yoo

On Sat, 8 Jan 2005, Jeffrey Thomas Peery wrote:

> I wasn't able to get the IDLE started in windows XP. I had it working then I
> upgraded to 2.4, then it didn't work so I switched back to 2.3, still didn't
> work so I'm back to 2.4.  I did some looking around and I was able to get
> the IDLE started by setting the shortcut on my desktop to:
>
> C:\Python24\python.exe C:\Python24\Lib\idlelib\idle.pyw -n

Hi Jeff,

Oh no, not again!  *grin*

> Not sure what this does. But is seems to get it going. What is going on
> here? Also I get a console that appears with this message - I have no
> idea what it means:
>
> Warning: configHandler.py - IdleConf.GetThemeDict - problem retrieving
> theme element 'builtin-background'

Ok, we've seen something like this before; what you are running into is
probably the same thing.  Mike and Jacob have found that IDLE broke on
them when upgrading from Python 2.3 to Python 2.4.  See the thread
starting from:

http://mail.python.org/pipermail/tutor/2004-December/033672.html

It turns out that the bug has to do with the way IDLE now handles its
configuration files.  If you've made some special configuration (like
color customization), the bug causes IDLE not to start up cleanly because
the customized config files aren't compatible.

To work around this, rename your '.idlerc/' directory to something else
temporarily.  The '.idlerc' directory should be somewhere in your home
directory, within the 'Documents and Settings' directory.  It contains
all the user-defined settings that you've made to IDLE, so if we
hide it, IDLE should try to regenerate a clean set.

After renaming it to something else, try restarting IDLE with the icon,
and see if it comes up now.

I hope this helps!


Re: [Tutor] Input to python executable code and design question

2005-01-10 Thread Danny Yoo



> >To help you out. You need some sort of error checking to be sure that
> >within your given range you won't get something like a math domain
> >error.
> >
> >
> Yes, I thought that:
> try:
> #function
> exception:
> pass


Hi Ismael,


Python's keyword for exception handling is 'except', so this can be
something like:

###
try:
    some_function()
except:
    pass
###


... Except that the 'except' block should seldom be as vacant as that.
*grin* There are two things we should try to do when exception handling:
handle only what we need, and report exception information when we do.


We really want to get the system to handle only domain errors, and
otherwise let the exception propagate.  But because the block above
catches every Python error, even silly things like misspellings, it will
sweep programmer errors under the carpet.


For example, the code snippet:

###
>>> def sqrt(x):
...     return y**0.5  ## bug: typo, meant to type 'x'
...
>>> try:
...     sqrt("hello")
... except:
...     pass
...
###

tries to make it look like we're catching domain errors, but the
overzealous exception catching disguises a really silly bug.


We can make the exception more stringent, so that we only catch domain
errors.  According to:

http://www.python.org/doc/lib/module-exceptions.html

there are a few standard exceptions that deal with domains, like
ArithmeticError, ValueError, or TypeError.


Let's adjust the code block above so we capture only those three:

###
>>> try:
...     sqrt("hello")
... except (ArithmeticError, ValueError, TypeError):
...     pass
...
Traceback (most recent call last):
  File "<stdin>", line 2, in ?
  File "<stdin>", line 2, in sqrt
NameError: global name 'y' is not defined
###



Also, it might be a good idea to print out that a certain exception
happened, just so that you can tell the user exactly what's causing the
domain error.  The 'traceback' module can help with this:

http://www.python.org/doc/lib/module-traceback.html


For example:

###
try:
    1 / 0
except:
    print "There's an error!"
###

tells us that some wacky thing happened, but:


###
import traceback
try:
    1 / 0
except:
    print "Error:", traceback.format_exc()
###

tells us that the error was caused by a ZeroDivisionError.
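
Putting both pieces of advice together, a sketch might look like this.
safe_sqrt is a hypothetical helper name, and returning None on a domain
error is just one possible policy.

```python
import math
import traceback

def safe_sqrt(x):
    """Return sqrt(x), or None if x is outside the function's domain."""
    try:
        return math.sqrt(x)
    except (ArithmeticError, ValueError, TypeError):
        # Report what went wrong instead of swallowing it silently.
        print("Could not evaluate sqrt(%r):" % (x,))
        print(traceback.format_exc())
        return None

print(safe_sqrt(16))
print(safe_sqrt(-1))        # ValueError: math domain error
print(safe_sqrt("hello"))   # TypeError
```

A misspelled variable inside the try block would still raise NameError
and surface as usual, since it isn't in the list of caught exceptions.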



Best of wishes to you!


Re: [Tutor] Slightly OT - Python/Java

2005-01-10 Thread Danny Yoo

On Tue, 11 Jan 2005, Liam Clarke wrote:

> > > (Why can't a non-static method comparison be called from a static
> > > reference? What does that mean anyway?
> >
> > Er... What was your code like? (before and after correcting
> > the error)

Hi Liam,

It's actually easier to see the reason if we do a comparative "buggy code"
display between Java and Python.  *grin*

Here's an instance of what I think the error might be:

/*** Java pseudocode: won't compile. ***/
public class TestStatic {
    public static void doSomething() {
        System.out.println("square(42) is " + square(42));
    }
    public int square(int x) {
        return x * x;
    }
}
/**/

And here's a loose translation of the buggy Java code into buggy Python
code.  [meta: I'm not using Python's staticmethod() stuff yet; maybe
someone else can show that approach too?]

### Python pseudocode: won't run right ###
def doSomething():
    print "square(42) is", TestStatic.square(42)

class TestStatic:
    def square(self, x):
        return x * x
##

The bug should be more obvious now: it's a simple parameter passing
problem.

In Python, the 'self' parameter is explicitly defined and stated as part
of the parameter list of any method.  square() is actually a method that
must take in an instance.  Easy to see, since square takes in two
parameters, but we are only passing one.

In Java, the bug is harder to see, because the passing of 'this' (Java's
equivalent of 'self') is all being done implicitly.  In Java, static
functions aren't bound to any particular instance, and since there's no
'this', we get that error about how "static member functions can't call
nonstatic method functions."

That being said, it's perfectly possible to do this:

/**/
public class TestStatic {
    public static void doSomething() {
        TestStatic app = new TestStatic();
        System.out.println("square(42) is " + app.square(42));
    }

    public int square(int x) {
        return x * x;
    }
}
/**/

with its Python equivalent:

##
def doSomething():
    app = TestStatic()
    print "square(42) is", app.square(42)

class TestStatic:
    def square(self, x):
        return x * x
##

The explicitness of 'self' in Python really shines in examples like this.
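
For completeness, here is a sketch of the staticmethod() approach
mentioned earlier, using the decorator syntax from Python 2.4 and later.
The cube() method is my own addition, included only to show an instance
method calling the static one.

```python
class TestStatic(object):
    @staticmethod
    def square(x):
        # No 'self': the function doesn't depend on any instance.
        return x * x

    def cube(self, x):
        # A regular method can still reach the static one.
        return x * self.square(x)

def doSomething():
    print("square(42) is %d" % TestStatic.square(42))

doSomething()
```

This is the closer analogue of making square() 'static' on the Java
side, rather than creating an instance just to call it.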

> > Oh Ghost. You didn't actually write a Java program using a
> > regular text editor, did you?
> > And of course, it's free (and written in Java). 
> > http://www.eclipse.org
> >
>
> *sigh* I have no net at home at moment, which is very frustrating when I
> want to d/l documentation & editors. For the mo, it's all Notepad. Ick.

Don't inflict that kind of pain on yourself.  Ask someone to burn a CD of
Eclipse for you or something: just don't use bad tools like Notepad!
*grin*

Best of wishes to you!


Re: [Tutor] Is there a way to force update the screen in tkinter?

2005-01-10 Thread Danny Yoo

On Mon, 10 Jan 2005, R. Alan Monroe wrote:

> I don't have the code here at home, but today I tried my first
> experiments in Tkinter. I set up a button that fired off a function to
> resize a rectangle in a canvas, with a for loop. Only problem is that
> the screen isn't repainted in all the steps of the for loop - only at
> the very end, when the rectangle is at its final, largest size. Can I
> make it repaint DURING the loop?

Hi Alan,

Yes, Tkinter queues up a bunch of events to refresh the window, but
doesn't execute them until it has a chance to take control, that
is, when control passes back into the mainloop.

You can manually force those pending window-refreshing events to run from
inside your loop.  Call the update_idletasks() method on the toplevel Tk,
and it should refresh things properly.  See:

http://www.pythonware.com/library/tkinter/introduction/x9374-event-processing.htm

which talks about it a little more.
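
Since I haven't seen the original code, here is only a guess at its
shape.  grow_rectangle and demo are hypothetical names, the sizes are
arbitrary, and the module is spelled 'Tkinter' on Python 2 but 'tkinter'
on Python 3.

```python
import time

def grow_rectangle(canvas, item, steps=20, delay=0.05):
    """Enlarge a canvas rectangle step by step, repainting as we go."""
    for step in range(1, steps + 1):
        size = 10 * step
        canvas.coords(item, 10, 10, 10 + size, 10 + size)
        canvas.update_idletasks()   # flush pending redraws right now
        time.sleep(delay)

def demo():
    """Opens a window with a button; needs a display to run."""
    import tkinter
    root = tkinter.Tk()
    canvas = tkinter.Canvas(root, width=300, height=300)
    canvas.pack()
    item = canvas.create_rectangle(10, 10, 20, 20, fill="blue")
    button = tkinter.Button(root, text="Grow",
                            command=lambda: grow_rectangle(canvas, item))
    button.pack()
    root.mainloop()
```

Calling demo() and pressing the button should show the rectangle growing
in visible steps; without the update_idletasks() call, only the final
size would be painted.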

Hope this helps!


Re: [Tutor] Python with MySQL ?

2005-01-11 Thread Danny Yoo



On Tue, 11 Jan 2005, Mark Kels wrote:

> How can I send SQL querys to a MySQL database with a python-CGI program ?

Hi Mark,


You'll want to grab a "Python DB API" module for MySQL.  The best one I've
seen for MySQL is 'MySQLdb':

http://sourceforge.net/projects/mysql-python

and you should probably grab that for your system.  It should come with
some examples to help you get started.


Python.org has a section on Python's database support:

http://www.python.org/topics/database/

with some documentation.  There used to be a tutorial linked from Linux
Journal there, but it's now restricted to subscribers!  *g*



Here's an example program just to see how the pieces fit together:

###
import MySQLdb
connection = MySQLdb.connect(db="magic", user="dyoo",
                             password="abracadabra")
cursor = connection.cursor()
cursor.execute("""select name from cards where
                  tournament_type = 'restricted'""")
for (name,) in cursor.fetchall():
    print name
cursor.close()
connection.close()
###


I hope this helps you get started!


Re: [Tutor] Time script help sought!

2005-01-11 Thread Danny Yoo

On Tue, 11 Jan 2005, kevin parks wrote:

> but as always you may notice a wrinkle some items have many times
> (here 6) indicated:
>
> Item_3   TAPE_1   3   9:41    10:41
> Item_3   TAPE_1   4   10:47   11:19
> Item_3   TAPE_1   5   11:21   11:55
> Item_3   TAPE_1   6   11:58   12:10
> Item_3   TAPE_1   7   12:15   12:45   Defect in analog tape sound.
> Item_3   TAPE_1   8   12:58   24:20   Defect in analog tape sound.

Hi Kevin,

It may help make things more manageable if you work on a smaller
subproblem.

Let's look at the time-joining problem and see if that's really as hard as
it looks.  Let's just select the time coordinates and concentrate on those
for now.

##
9:41    10:41
10:47   11:19
11:21   11:55
11:58   12:10
12:15   12:45
12:58   24:20
##

I notice here that this representation is in minutes and seconds.  It
might be easier if we make the data all in seconds: we can do our numeric
calculations much more easily if we're dealing with a single unit.

###
def convertMinSecToSeconds(minSecString):
    """Converts a string of the form:

       min:sec

       into the integer number of seconds.
    """
    min, sec = minSecString.split(":")
    return (int(min) * 60) + int(sec)
###

If we need to go back from seconds back to the minute-second
representation, we can write a function to go the other direction.
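
A sketch of that reverse function could look like this.  The name
convertSecondsToMinSec is my own, chosen to mirror the one above; the
forward function is repeated so the example runs on its own.

```python
def convertMinSecToSeconds(minSecString):
    """Converts a 'min:sec' string into an integer number of seconds."""
    minutes, sec = minSecString.split(":")
    return (int(minutes) * 60) + int(sec)

def convertSecondsToMinSec(seconds):
    """Converts an integer number of seconds into a 'min:sec' string."""
    minutes, sec = divmod(seconds, 60)
    return "%d:%02d" % (minutes, sec)

print(convertSecondsToMinSec(581))
print(convertSecondsToMinSec(1460))
```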

Anyway, with that, we can now look at the problem purely in seconds:

###
>>> times = """
... 9:41    10:41
... 10:47   11:19
... 11:21   11:55
... 11:58   12:10
... 12:15   12:45
... 12:58   24:20
... """.split()
>>>
>>> times
['9:41', '10:41', '10:47', '11:19', '11:21', '11:55', '11:58', '12:10',
'12:15', '12:45', '12:58', '24:20']
>>>
>>> seconds = map(convertMinSecToSeconds, times)
>>> seconds
[581, 641, 647, 679, 681, 715, 718, 730, 735, 765, 778, 1460]
###

That is, we now have some input, like:

##
initialInput = [(581, 641),
                (647, 679),
                (681, 715),
                (718, 730),
                (735, 765),
                (778, 1460)]
##

And now we want to turn it into something like:

###
expectedResult = [(0, 60),
                  (60, 98),
                  (98, 134),
                  (134, 149),
                  (149, 184),
                  (184, 879)]
###

Can you write a function that takes this 'initialInput' and produces that
'expectedResult'?  If so, your problem's pretty much solved.
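
If you'd like something to check your own attempt against, here is one
possible reading of the problem.  It's only a sketch: rebaseIntervals is
a made-up name, and I'm assuming that every end time is measured from the
very first start time, with each interval beginning where the previous
one ended.

```python
seconds = [581, 641, 647, 679, 681, 715, 718, 730, 735, 765, 778, 1460]

# Pair up the flat list: even positions are starts, odd positions are ends.
initialInput = list(zip(seconds[0::2], seconds[1::2]))


def rebaseIntervals(intervals):
    """Hypothetical helper: measure every end time from the first start,
    and let each interval begin where the previous one ended."""
    origin = intervals[0][0]
    result = []
    previousEnd = 0
    for (start, end) in intervals:
        newEnd = end - origin
        result.append((previousEnd, newEnd))
        previousEnd = newEnd
    return result


result = rebaseIntervals(initialInput)
```

Under that assumption, 'result' comes out the same as the
'expectedResult' list above.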

If you have more questions, please feel free to ask.


Re: [Tutor] FW: hi

2005-01-11 Thread Danny Yoo

On Wed, 12 Jan 2005, Gopinath V, ASDC Chennai wrote:

> >   I'm a Novice in python.can u tell me if there is a frontend available
> > .if yes where can i download it from

Hello Gopinath,

Can you explain what you mean by "frontend"?  The word "frontend" suffers
from being too generic to be able to tell what you mean.

(As a concrete example, a search for "python frontend" in Google gives me
a GCC compiler frontend as its first hit, and I'm almost positively sure
that's not what you're looking for.  *grin*)

Are you looking for an Integrated Development Environment, like Boa
Constructor?

http://boa-constructor.sourceforge.net/

If you are looking for IDEs, then there is a list of ones that support
Python here:

http://www.python.org/moin/IntegratedDevelopmentEnvironments

Please tell us more what you are looking for, and we'll try to do our best
to hunt that information down.  Good luck to you!


RE: [Tutor] FW: hi (fwd)

2005-01-12 Thread Danny Yoo


>> Can you explain what you mean by "frontend"?  The word "frontend"
>> suffers from being too generic to be able to tell what you mean.

>   I meant a GUI like Microsofts Visual Basic


[Keeping Tutor@python.org in CC.  Please use Reply-to-all in replies, so
that we can keep the conversation on list.]


Ok, in that case, you may want to take a look at the
IntegratedDevelopmentEnvironments page:

http://www.python.org/moin/IntegratedDevelopmentEnvironments

I can't really give personal experience about these, since I'm still using
tools like Emacs.  *grin* But I'm sure that others on the Tutor list can
give their recommendations to you.


Re: [Tutor] class instance with identity crisis

2005-01-12 Thread Danny Yoo

On Wed, 12 Jan 2005, Barnaby Scott wrote:

> I was wondering how you can get an instance of a class to change itself
> into something else (given certain circumstances), but doing so from
> within a method. So:
>
> class Damson:
>     def __str__(self):
>         return 'damson'
>
>     def dry(self):
>         self = Prune()
>
> class Prune:
>     def __str__(self):
>         return 'prune'
>
> weapon = Damson()
> weapon.dry()
> print weapon

Hi Scott,

The issue with that, as you know, is that it conflicts with the way local
variables work in functions.  For example:

###
>>> def strip_and_print(x):
...     x = x.strip()
...     print x
...
>>> message = "   hello   "
>>> strip_and_print(message)
hello
>>> message
'   hello   '
###

Our reassignment to 'x' in strip_and_print has no effect on "message"'s
binding to "   hello   ".  That's how we're able to use local variables as
black boxes.

For the same reasons, the reassignment to 'self' in:

> class Damson:
>     def __str__(self):
>         return 'damson'
>
>     def dry(self):
>         self = Prune()

is limited in scope to the dry() method.

But can you do what you're trying to do with object composition?  That is,
would something like this work for you?

###
class Damson:
    def __init__(self):
        self.state = DamsonState()
    def __str__(self):
        return str(self.state)
    def dry(self):
        self.state = PruneState()

class DamsonState:
    def __str__(self):
        return "damson"

class PruneState:
    def __str__(self):
        return 'prune'
###

This structuring allows us to switch the way that str() applies to Damson.
The OOP Design Pattern folks call this the "State" pattern:

http://c2.com/cgi/wiki?StatePattern
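
To see the state switch in action, here is a small usage sketch.  It
repeats the three classes so that it can run on its own; the 'before' and
'after' names are just mine, for illustration.

```python
class DamsonState(object):
    def __str__(self):
        return 'damson'


class PruneState(object):
    def __str__(self):
        return 'prune'


class Damson(object):
    def __init__(self):
        self.state = DamsonState()

    def __str__(self):
        # Delegate to whatever state object we currently hold.
        return str(self.state)

    def dry(self):
        # The instance stays the same object; only its state is swapped.
        self.state = PruneState()


weapon = Damson()
before = str(weapon)    # 'damson'
weapon.dry()
after = str(weapon)     # 'prune'
```

Note that 'weapon' is still the very same Damson instance afterwards: the
caller's binding never has to change.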

If you have any questions, please feel free to ask!


Re: [Tutor] TypeError I don't understand

2005-01-12 Thread Danny Yoo

On Wed, 12 Jan 2005, Vincent Wan wrote:

> running my program gives me "TypeError: len() of unsized object"

Hi Vincent:

Ah!  Whenever you see something like this, check to see where
reassignments to the variable occur: it's possible that one of the
assignments isn't doing what you think it's doing.

The one that's causing the problem is this:

###
living_species = living_species.remove(counter)
###

remove() does not return back the modified list: it makes changes directly
on the list.

Just do:

###
living_species.remove(counter)
###

and that should correct the problem.

A similar bug occurs on the line right above that one: append() also does
not return a modified list, but does mutations on the given list instead.
You'll need to fix that too, even though you're not getting an immediate
error message out of it.
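
Here is a tiny sketch of both behaviors, using a made-up list of numbers
rather than your real data:

```python
living_species = [1, 2, 3]

# Both calls change the list in place and hand back None.
append_result = living_species.append(4)
remove_result = living_species.remove(2)

# The list itself carries the changes: it is now [1, 3, 4].

# Rebinding the name to the return value throws the list away:
lost = [1, 2, 3]
lost = lost.remove(2)    # 'lost' is now None, not a list
```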

Let's go back and look at the error message, so that when you see it, it
isn't mysterious.  *grin* Now that you know that remove() doesn't return a
list, the error message should make more sense, because remove() returns
None.

###
>>> len(None)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: len() of unsized object
###

Best of wishes to you!


Re: [Tutor] what's a concise way to print the first elements in a nested list

2005-01-12 Thread Danny Yoo

On Wed, 12 Jan 2005, Orri Ganel wrote:

>  >>> stuff = [[0,'sdfsd','wrtew'], [1, 'rht','erterg']]
>  >>> stuff
> [[0, 'sdfsd', 'wrtew'], [1, 'rht', 'erterg']]
>  >>> print [stuff[i][0] for i in range(len(stuff))]
> [0, 1]

Hi Orri,

An alternative way to write this is:

###
print [row[0] for row in stuff]
###

which extracts the first element out of every "row" sublist in 'stuff'.

Best of wishes!


Re: [Tutor] sockets, files, threads

2005-01-12 Thread Danny Yoo

On Wed, 12 Jan 2005, Marilyn Davis wrote:

> When stuff was read from the exim socket, it was stored in a tempfile,
> so that I could release the exim process, then I lseek to the front of
> the tempfile and have it handy.  I see from all my debugging and logging
> that the file descriptor for this tempfile is 9.

Hi Marilyn,

Question: do you really need to use a tempfile?  If your daemon is
persistent in memory, can it just write to a StringIO object?

StringIO.StringIO elements act like files too, and may be easier to
maintain than a tempfile.TemporaryFile.  If you can show us how you're
constructing and using the TemporaryFile, we can try to trace any
problematic usage.
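
As a rough sketch of what I mean: the message text here is invented, and
I'm assuming a Python where the class is available as io.StringIO (on
older releases it lives in the StringIO module instead).

```python
import io

# An in-memory "file": no file descriptor is involved at all.
buffer = io.StringIO()
buffer.write("Subject: test\n")
buffer.write("\n")
buffer.write("message body\n")

# Rewind to the front, just as we would lseek() on a real tempfile.
buffer.seek(0)
firstLine = buffer.readline()
rest = buffer.read()
buffer.close()
```

Since nothing here touches the operating system's descriptor table, there
is no descriptor number that could be handed to something else later.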

> The program then opens a pipe to exim to send mail.  I see that the
> popen2.popen3 call returns 9 for the stdin file descriptor for the pipe.
>
> The tempfile (also file descriptor 9) is read for piping into exim and
> it errors with "Bad file descriptor".

Oh!  This situation sounds like the 'tempfile' is being closed at some
point, releasing the file descriptor back to the system.  If that is what
is happening, then that's why the pipe has the same descriptor id: the
call to pipe() just reuses a free file descriptor.

I'd look for places where the tempfile might be close()d.  I'd also look
for places where your reference to tempfile is reassigned, since that
would also signal a resource collection.
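
Here's a small experiment that shows the reuse.  It's a sketch that
assumes a POSIX-like system, where the operating system hands out the
lowest free descriptor number; os.devnull simply stands in for the
tempfile.

```python
import errno
import os

# Open something, remember its descriptor number, then close it.
fd = os.open(os.devnull, os.O_RDONLY)
os.close(fd)

# The number is free again, so the next pipe() is allowed to reuse it.
read_end, write_end = os.pipe()
reused = fd in (read_end, write_end)

os.close(read_end)
os.close(write_end)

# Using the stale number after everything is closed fails with EBADF,
# which is the "Bad file descriptor" error.
try:
    os.read(fd, 1)
    error = None
except OSError as e:
    error = e.errno
```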

> Worse yet, the first 5 messages of my test go through the entire process
> without a problem, and then # 6 hits this -- but only if # 1 message is
> really big.

Just double checking something: are you dealing with threads?

Best of wishes to you!


Re: [Tutor] sockets, files, threads

2005-01-12 Thread Danny Yoo


> Just double checking something: are you dealing with threads?

Hi Marilyn,

Argh, that was a dumb question.  Pretend I didn't ask it that way.
*grin*


I meant to ask:

How do you deal with threads?  Is the temporary file a global resource
that the threads all touch?  If so, have you done any synchronization to
make sure that at most one thread can touch the temporary file at a time?
What are the shared resources for the threads?


The situation you mentioned,

> > Worse yet, the first 5 messages of my test go through the entire
> > process without a problem, and then # 6 hits this -- but only if # 1
> > message is really big.

is exactly the sort of thing I'd expect if two threads were contending for
the same resource, so let's see if the bug has to do with this.


Best of wishes to you!


Re: [Tutor] sockets, files, threads

2005-01-12 Thread Danny Yoo

On Wed, 12 Jan 2005, Marilyn Davis wrote:

> I was looking at my use of file objects and file descriptors and I wrote
> this sample program and was very surprised by the result -- which makes
> me think there's something here that I don't understand. Where did my
> 'ooo' go?
>
> #! /usr/bin/env python
> import os
>
> fobj = open('/tmp/xxx','w')
> fobj.write('ooo\n')
> fp = fobj.fileno()
> os.write(fp,'x\n')
> os.close(fp)

Hi Marilyn,

Oh!  Can you explain why you're mixing the low-level 'os.write()' and
'os.close()' stuff with the high-level file methods?

The 'os' functions work at a different level of abstraction than the file
object methods, so there's no guarantee that:

os.close(fp)

will do the proper flushing of the file object's internal character
buffers.

Try this instead:

###
fobj = open('/tmp/xxx','w')
fobj.write('ooo\n')
fobj.write('x\n')
fobj.close()
###

The documentation on os.write() says:

"""Note: This function is intended for low-level I/O and must be applied
to a file descriptor as returned by open() or pipe(). To write a ``file
object'' returned by the built-in function open() or by popen() or
fdopen(), or sys.stdout or sys.stderr, use its write() method."""

(http://www.python.org/doc/lib/os-fd-ops.html#l2h-1555)

I think the documentation is trying to say: "don't mix high-level and
low-level IO".

For most purposes, we can usually avoid using the low-level IO functions
os.open() and os.write().  If we're using the low-level file functions
because of pipes, then we can actually turn pipes into file-like objects
by using os.fdopen().  os.fdopen() is a bridge that transforms file
descriptors into file-like objects.  See:

http://www.python.org/doc/lib/os-newstreams.html

for more information on os.fdopen().
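
As a minimal sketch of that bridge, here both ends of a plain os.pipe()
are wrapped, so that we only ever call file methods on them.  The
'reader' and 'writer' names are just for illustration.

```python
import os

read_fd, write_fd = os.pipe()

# Wrap the raw descriptors; from here on we only use the file objects.
writer = os.fdopen(write_fd, 'w')
reader = os.fdopen(read_fd, 'r')

writer.write('ooo\n')
writer.write('x\n')
writer.close()            # flushes the buffer and closes the descriptor

contents = reader.read()  # reads until EOF, which close() provided
reader.close()
```

Because both writes go through the same file object, they share one
buffer, and close() flushes it in order.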

I hope this helps!


Re: [Tutor] flattening a list

2005-01-12 Thread Danny Yoo


> > def flatten(a):
> >if not isinstance(a,(tuple,list)): return [a]
> >if len(a)==0: return []
> >return flatten(a[0])+flatten(a[1:])

> The only problem with this if it is to big or to deeply nested then it
> will overflow the stack?


Yes, it can overflow in its current incarnation.  There is a way to fix
that.


[WARNING WARNING: The code presented below is very evil, but I just can't
resist.  *grin*

Please do not try to understand the code below, as it is not meant to be
read by humans.  If you are just learning Python, please skip this
message.]


There's a way to systematically transform it so that it doesn't overflow,
by using "trampolining-style" programming.  This technique turns any
recursive function into one that only consumes a constant amount of stack
space.

It's used by programming language theory folks, despite being utterly
weird.  *grin*  I think I wrote a brief introduction to the topic on
Python-Tutor list, and have also applied it in a small project (PyScheme).

For your (or my?) amusement, here's the transformed flatten() function in
trampolined style:

###
def flatten(a):
    """Flatten a list."""
    return bounce(flatten_k(a, lambda x: x))


def bounce(thing):
    """Bounce the 'thing' until it stops being a callable."""
    while callable(thing):
        thing = thing()
    return thing


def flatten_k(a, k):
    """CPS/trampolined version of the flatten function.  The original
    function, before the CPS transform, looked like this:

    def flatten(a):
        if not isinstance(a,(tuple,list)): return [a]
        if len(a)==0: return []
        return flatten(a[0])+flatten(a[1:])

    The following code is not meant for human consumption.
    """
    if not isinstance(a,(tuple,list)):
        return lambda: k([a])
    if len(a)==0:
        return lambda: k([])
    def k1(v1):
        def k2(v2):
            return lambda: k(v1 + v2)
        return lambda: flatten_k(a[1:], k2)
    return lambda: flatten_k(a[0], k1)
###


This actually does something useful.

###
>>> flatten([1, [2, [3, 4]]])
[1, 2, 3, 4]
###


Best of all, it does not stack-overflow.  We can test this on an evil
constructed example:

###
>>> import sys
>>> def makeEvilList(n):
...     result = []
...     for i in xrange(n):
...         result = [[result], i]
...     return result
...
>>> evilNestedList = makeEvilList(sys.getrecursionlimit())
>>> assert range(sys.getrecursionlimit()) == flatten(evilNestedList)
>>>
>>>
###

And it does work.

That being said, "trampolined-style" is just evil in Python: I would not
want to inflict that code on anyone, save in jest.  *grin*
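
If flatten_k() is too much to swallow in one go, the same idea may be
easier to see on a smaller function.  This is only a sketch: total() and
total_k() are made-up names, and the plain recursive definition being
transformed is in the docstring.

```python
import sys


def bounce(thing):
    """Keep calling 'thing' until it stops being a callable."""
    while callable(thing):
        thing = thing()
    return thing


def total_k(n, k):
    """Trampolined version of:

        def total(n):
            if n == 0: return 0
            return n + total(n - 1)

    Every step returns a zero-argument function instead of recursing.
    """
    if n == 0:
        return lambda: k(0)
    def k1(rest):
        return lambda: k(n + rest)
    return lambda: total_k(n - 1, k1)


def total(n):
    return bounce(total_k(n, lambda x: x))


# Far deeper than the plain recursive version could go.
n = 10 * sys.getrecursionlimit()
big = total(n)
```

Each call returns right away with a little function for bounce() to call
next, so the Python stack never grows beyond a couple of frames.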


Re: [Tutor] Regular expression re.search() object . Please help

2005-01-13 Thread Danny Yoo

On Thu, 13 Jan 2005, kumar s wrote:

> My list looks like this: List name = probe_pairs
> Name=AFFX-BioB-5_at
> Cell1=96  369 N   control AFFX-BioB-5_at
> Cell2=96  370 N   control AFFX-BioB-5_at
> Cell3=441 3   N   control AFFX-BioB-5_at
> Cell4=441 4   N   control AFFX-BioB-5_at
> Name=223473_at
> Cell1=307 87  N   control 223473_at
> Cell2=307 88  N   control 223473_at
> Cell3=367 84  N   control 223473_at
>
> My Script:
> >>> name1 = '[N][a][m][e][=]'

Hi Kumar,

The regular expression above can be simplified to:

'Name='

The character-class operator that you're using, with the brackets '[]', is
useful when we want to allow different kinds of characters.  Since the code
appears to be looking at a particular string, the regex can be greatly
simplified by not using character classes.
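
As a quick check that the two patterns behave the same way, here is a
sketch on a few sample lines.  The lines are made up to look like your
data, with tabs between the columns.

```python
import re

lines = ['Name=AFFX-BioB-5_at',
         'Cell1=96\t369\tN\tcontrol\tAFFX-BioB-5_at',
         'Name=223473_at']

verbose = re.compile('[N][a][m][e][=]')
simple = re.compile('Name=')

# A one-character class like [N] matches exactly that one character,
# so both patterns accept and reject the same lines.
verboseHits = [bool(verbose.match(line)) for line in lines]
simpleHits = [bool(simple.match(line)) for line in lines]
```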

> >>> for i in range(len(probe_pairs)):
>   key = re.match(name1,probe_pairs[i])
>   key
>
>
> <_sre.SRE_Match object at 0x00E37A68>
> <_sre.SRE_Match object at 0x00E37AD8>
> <_sre.SRE_Match object at 0x00E37A68>
> <_sre.SRE_Match object at 0x00E37AD8>
> <_sre.SRE_Match object at 0x00E37A68>
> . (cont. 10K
> lines)
>
> Here it prints a bunch of reg.match objects. However when I say group()
> it prints only one object why?

Is it possible that the edited code may have done something like this?

###
for i in range(len(probe_pairs)):
    key = re.match(name1, probe_pairs[i])
    print key
###

Without seeing what the literal code looks like, we're doomed to use our
imaginations and make up a reasonable story.  *grin*

> >>> for i in range(len(probe_pairs)):
>   key = re.match(name1,probe_pairs[i])
>   key.group()

Ok, I think I see what you're trying to do.  You're using the interactive
interpreter, which tries to be nice when we use it as a calculator.  The
interactive interpreter has a special feature that prints out the result
of expressions, even though we have not explicitly put in a "print"
statement.

For example, when we use a loop like this:

###
>>> for i in range(10):
...     i, i*2, i*3
...
(0, 0, 0)
(1, 2, 3)
(2, 4, 6)
(3, 6, 9)
(4, 8, 12)
(5, 10, 15)
(6, 12, 18)
(7, 14, 21)
(8, 16, 24)
(9, 18, 27)
###

If the body of the loop contains a single expression, then Python's
interactive interpreter will try to be nice and print that expression
through each iteration.

The automatic expression-printing feature of the interactive interpreter
is only for our convenience.  If we're not running in interactive mode,
Python will not automatically print out the values of expressions!

So in a real program, it is much better to explicitly write out the
'print' statement to show the expression on screen, if that's what you want:

###
>>> for i in range(10):
...     print (i, i*2, i*3)
...
(0, 0, 0)
(1, 2, 3)
(2, 4, 6)
(3, 6, 9)
(4, 8, 12)
(5, 10, 15)
(6, 12, 18)
(7, 14, 21)
(8, 16, 24)
(9, 18, 27)
###

> After I get the reg.match object, I tried to remove
> that match object like this:
> >>> for i in range(len(probe_pairs)):
>   key = re.match(name1,probe_pairs[i])
>   del key
>   print probe_pairs[i]

The match object has a separate existence from the string
'probe_pairs[i]'.  Your code does drop the 'match' object, but this has no
effect on the string in probe_pairs[i].

The code above, removing those two lines that play with the 'key', reduces
down back to:

###
for i in range(len(probe_pairs)):
    print probe_pairs[i]
###

which is why you're not seeing any particular change in the output.

I'm not exactly sure you really need to do regular expression stuff here.
Would the following work for you?

###
for probe_pair in probe_pairs:
    if not probe_pair.startswith('Name='):
        print probe_pair
###

> Name=AFFX-BioB-5_at
> Cell1=96  369 N   control AFFX-BioB-5_at
> Cell2=96  370 N   control AFFX-BioB-5_at
> Cell3=441 3   N   control AFFX-BioB-5_at
>
> Result shows that that Name** line has not been deleted.

What do you want to see?  Do you want to see:

###
AFFX-BioB-5_at
Cell1=96  369 N   control AFFX-BioB-5_at
Cell2=96  370 N   control AFFX-BioB-5_at
Cell3=441 3   N   control AFFX-BioB-5_at
###

or do you want to see this instead?

###
Cell1=96  369 N   control AFFX-BioB-5_at
Cell2=96  370 N   control AFFX-BioB-5_at
Cell3=441 3   N   control AFFX-BioB-5_at
###
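
Either one is a short loop without any regular expressions.  Here's a
sketch of both, on a small made-up sample; 'stripped' and 'withoutNames'
are just my names for the two results.

```python
probe_pairs = ['Name=AFFX-BioB-5_at',
               'Cell1=96\t369\tN\tcontrol\tAFFX-BioB-5_at',
               'Cell2=96\t370\tN\tcontrol\tAFFX-BioB-5_at']

# First possibility: keep every line, but cut the 'Name=' prefix off.
stripped = []
for line in probe_pairs:
    if line.startswith('Name='):
        line = line[len('Name='):]
    stripped.append(line)

# Second possibility: drop the 'Name=' lines altogether.
withoutNames = [line for line in probe_pairs
                if not line.startswith('Name=')]
```

In both cases we build a new list instead of trying to delete things out
of the one we are looping over.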

Good luck to you!


Re: [Tutor] Tix and Table printing

2005-01-13 Thread Danny Yoo



On Thu, 13 Jan 2005, Guillermo Fernandez Castellanos wrote:

> Is there any "table" frame that I am unaware of?


Hi Guillermo,

There are some recipes for making a Tkinter table widget; here's one:

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/52266


There's a Tkinter wiki page that has links to a few more Table
implementations:

http://tkinter.unpythonic.net/wiki/FrontPage


Hope this helps!



Re: [Tutor] communication between java and python?

2005-01-13 Thread Danny Yoo

On Thu, 13 Jan 2005, joeri honnef wrote:

> I'm trying to communicate between Python and Java and using os.popen().
> But thing dont work...The Java program reads strings from stdin and the
> python program just writes to stdout.

Hi Joeri,

You may want to look at:

http://www.python.org/doc/lib/popen2-flow-control.html

which explains some of the tricky flow-control issues that can happen
during inter-process communication.  As far as I can tell, the Java code
that you have to read a character from standard input:

/**/
// return EOF if end of file or IO error
private static void readC() {
    try { c = System.in.read(); }
    catch(IOException e) { c = EOF; }
}
/**/

has the possibility of blocking if the input stream has not been closed.

Other folks have already commented that, in the interaction that you're
doing with the program:

> handlers = os.popen('java StdIn)
> handlers[0].write('string')
> return_value_from_java = handlers[1].read()

your program needs to use something like popen2 to get separate input and
output files.  But the program also needs to take care to close the
streams, or else it risks deadlocking.

If it helps to see why, imagine that we modify:

> handlers[0].write('string')
> return_value_from_java = handlers[1].read()

to something like:

###
handlers[0].write('s')
handlers[0].write('t')
handlers[0].write('rin')
handlers[0].write('g')
###

We can call write() several times, and as far as the Java process knows,
it might need to expect more input.  That's why closing the input stream
is necessary: it signals to the Java program, through EOF, that there are
no more characters to read.
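
Here is a runnable sketch of the write, close, then read sequence.  To
keep it self-contained I'm making two substitutions: a tiny Python child
process stands in for the Java program, and I'm using the subprocess
module (new in Python 2.4) instead of popen2.  The child reads standard
input until EOF, just as the Java reader does.

```python
import subprocess
import sys

# Stand-in for 'java StdIn': read everything up to EOF, echo it uppercased.
child_code = "import sys; sys.stdout.write(sys.stdin.read().upper())"

child = subprocess.Popen([sys.executable, "-c", child_code],
                         stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE)

# Several writes; the child cannot know yet whether more is coming.
child.stdin.write(b's')
child.stdin.write(b't')
child.stdin.write(b'rin')
child.stdin.write(b'g')

# Closing our end delivers EOF, which lets the child finish reading.
child.stdin.close()

reply = child.stdout.read()
child.stdout.close()
child.wait()
```

If the close() call is left out, the child keeps waiting for more input
while we wait for its output, and neither side makes progress.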

If you have more questions, please feel free to ask!


Re: [Tutor] file-like object

2005-01-14 Thread Danny Yoo

On Fri, 14 Jan 2005, Chad Crabtree wrote:

> I have created a file-like object out of a triple quoted string.  I was
> wondering if there is a better way to implement readline than what I
> have below? It just doesn't seem like a very good way to do this.
>
> class _macroString(object):
>     def __init__(self,s):
>         self.macro=s
>         self.list=self.macro.split("\n")
>         for n,v in enumerate(self.list):
>             self.list[n]=v+'\n'
>     def readline(self,n=[-1]):
>         n[0]+=1
>         return self.list[n[0]]
>     def __str__(self):
>         return str(self.list)
>     def __len__(self):
>         return len(self.list)

Using the default parameter 'n' in the readline() method isn't safe: all
class instances will end up using the same 'n'.  You may want to put the
current line number as part of an instance's state, since two instances of
a macroString should be able to keep track of their line positions
independently.
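
A minimal sketch of what that might look like.  The class and attribute
names are my own, and I'm assuming that readline() should return an empty
string once the lines run out, the way real files do.

```python
class MacroString(object):
    def __init__(self, s):
        self.lines = [line + '\n' for line in s.split('\n')]
        self.position = 0          # per-instance line counter

    def readline(self):
        if self.position >= len(self.lines):
            return ''              # end of "file"
        line = self.lines[self.position]
        self.position += 1
        return line


# Two instances keep track of their positions independently.
a = MacroString("x\ny")
b = MacroString("p\nq")
firstOfA = a.readline()
firstOfB = b.readline()
secondOfA = a.readline()
```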

But that being said, there's already a module in the Standard Library that
turns strings into file-like objects.  Can you use StringIO.StringIO?
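
For instance, here's a rough sketch with a made-up macro string.  I'm
assuming a Python where the class is importable as io.StringIO; on older
releases it is StringIO.StringIO, and it is used the same way.

```python
import io

macro = """first line
second line
"""

macroFile = io.StringIO(macro)

# readline() keeps its own position, one per StringIO instance.
first = macroFile.readline()
second = macroFile.readline()
atEnd = macroFile.readline()   # empty string once the text runs out
```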

Best of wishes to you!


Re: [Tutor] file-like object

2005-01-14 Thread Danny Yoo

On Fri, 14 Jan 2005, Terry Carroll wrote:

> > class _macroString(object):
> >     def __init__(self,s):
> >         self.macro=s
> >         self.list=self.macro.split("\n")
> >         for n,v in enumerate(self.list):
> >             self.list[n]=v+'\n'
>
> Is this for loop a safe technique, where the list you're enumerating
> over in the for statement is the same as the one being updated in the
> loop body?  I always avoid things like that.

Hi Terry,

The 'for' loop itself should be ok, as it isn't a change that causes
elements to shift around.  The Reference Manual tries to explain the issue
of in-place list mutation here:

http://docs.python.org/ref/for.html

So as long as the structure of the list isn't changing, we should be ok to
do mutations on the list elements.
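
Here's a small sketch of the difference, with made-up lists:

```python
# Safe: replace elements in place; the list keeps its length.
lines = ['a', 'b', 'c']
for n, v in enumerate(lines):
    lines[n] = v + '\n'

# Risky: removing elements while iterating shifts the rest down,
# so the loop steps over the element right after each removal.
numbers = [1, 2, 2, 3]
for x in numbers:
    if x == 2:
        numbers.remove(x)
# One of the 2s survives: numbers is now [1, 2, 3].

# Safer alternative for filtering: build a new list.
filtered = [x for x in [1, 2, 2, 3] if x != 2]
```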

Best of wishes!


Re: [Tutor] Faster procedure to filter two lists . Please help

2005-01-14 Thread Danny Yoo

On Fri, 14 Jan 2005, kumar s wrote:

> >>>for i in range(len(what)):
>       ele = split(what[i],'\t')
>       cor1 = ele[0]
>       for k in range(len(my_report)):
>           cols = split(my_report[k],'\t')
>           cor = cols[0]
>           if cor1 == cor:
>               print cor+'\t'+ele[1]+'\t'+cols[1]+'\t'+cols[2]

Hi Kumar,

Ok, this calls for the use of an "associative map" or "dictionary".

The main time sink is the loop here:

>   for k in range(len(my_report)):
>       cols = split(my_report[k],'\t')
>       cor = cols[0]
>       if cor1 == cor:
>           print cor+'\t'+ele[1]+'\t'+cols[1]+'\t'+cols[2]

Conceptually, my_report can be considered a list of key/value pairs.  For
each element in 'my_report', the "key" is the first column (cols[0]), and
the "value" is the rest of the columns (cols[1:]).

The loop above can, in a pessimistic world, require a search across the
whole of 'my_report'.  This can take time that is proportional to the
length of 'my_report'.  You mentioned earlier that each list might be of
length 249502, so in the worst case we're looking at about
249502 * 249502, or roughly 62 billion, comparisons: the overall cost is
gigantic.

[Notes on calculating runtime cost: when the structure of the code looks
like:

    for element1 in list1:
        for element2 in list2:
            some_operation_that_costs_K_time()

then the overall cost of running this loop will be

K * len(list1) * len(list2)
]

We can do much better than this if we use a "dictionary" data structure. A
"dictionary" can reduce the time it takes to do a lookup search down from
a linear-time operation to a constant-time one (on average).  Do you know
about dictionaries yet?  You can take a look at:

http://www.ibiblio.org/obp/thinkCSpy/chap10.htm

which will give an overview of a dictionary.  It doesn't explain why
dictionary lookup is fast, but we can talk about that later if you want.

Please feel free to ask any questions about dictionaries and their use.
Learning how to use a dictionary data structure is a skill that pays back
extraordinarily well.
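
To give a flavor of where this is heading, here is a rough sketch.  The
function name and the sample rows are made up, and I'm assuming that both
lists hold tab-separated lines with the key in the first column, as in
your loop.

```python
def join_on_first_column(what, my_report):
    """Hypothetical helper: match lines by first column via a dictionary."""
    # One pass over my_report: first column -> list of split rows.
    report_by_key = {}
    for line in my_report:
        cols = line.split('\t')
        report_by_key.setdefault(cols[0], []).append(cols)

    # One pass over what: each lookup is a dictionary access, not a scan.
    joined = []
    for line in what:
        ele = line.split('\t')
        for cols in report_by_key.get(ele[0], []):
            joined.append('\t'.join([ele[0], ele[1], cols[1], cols[2]]))
    return joined


what = ['a\t1', 'b\t2', 'c\t3']
my_report = ['b\tx\ty', 'a\tp\tq', 'a\tr\ts']
joined = join_on_first_column(what, my_report)
```

The work is now roughly len(what) + len(my_report) steps, plus one step
per matching pair, instead of their product.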

Good luck!


Re: [Tutor] sockets, files, threads

2005-01-15 Thread Danny Yoo



> I have only wrapped my lock around file-descriptor creations.  Should I
> wrap it around closings too?  Or the whole open -> close transaction?
> It sounds like error-prone work to do the latter.  What am I missing?

Hi Marilyn,

Can you send a link to the source code to the Tutor list?  I'm getting the
feeling that there might be a design problem.  Just adding locks
whenever something doesn't work is not a sustainable way to write a
multithreaded application.

We have to see why your file descriptors are being shared between
threads.  Is there a reason why you need to share them as global
resources?


Re: [Tutor] RE:

2005-01-17 Thread Danny Yoo



> >  >>> stuff = [[0,'sdfsd','wrtew'], [1, 'rht','erterg']]
> >  >>> stuff
> > [[0, 'sdfsd', 'wrtew'], [1, 'rht', 'erterg']]
> >  >>> print [stuff[i][0] for i in range(len(stuff))]
> > [0, 1]
> >
> > An alternative way to write this is:
> >
> > ###
> > print [row[0] for row in stuff]
> > ###
> >
> > which extracts the first element out of every "row" sublist in
> > 'stuff'.


> This is fine.  I just want to know if row is a reserve word? or is it a
> built in function in IDLE environment.  The word row is not highlighted.
> What data type is (row)?


Hello!

When we have 'stuff' like this:

###
>>> stuff = [[0,'sdfsd','wrtew'], [1, 'rht','erterg']]
###

then we can ask for an element out of 'stuff'.  One thing we can do is
variable assignment:

###
>>> row = stuff[0]
###

'row' here is just an arbitrarily chosen variable name.  We can see that
"row"'s value is one of the sublists in 'stuff' by trying:

###
>>> print row
[0, 'sdfsd', 'wrtew']
###


We can also use a 'for' loop to march --- to "iterate" --- across a list:

###
>>> for row in stuff:
...     print row, "is another element in 'stuff'"
...
[0, 'sdfsd', 'wrtew'] is another element in 'stuff'
[1, 'rht', 'erterg'] is another element in 'stuff'
###

'row' here is also used as a temporary variable name.  In a 'for' loop, it
is assigned to each element, as we repeat the loop's body.


If you have more questions, please feel free to ask.


Re: [Tutor] sockets, files, threads

2005-01-18 Thread Danny Yoo




Hi Marilyn,


[Long program comments ahead.  Please forgive me if some of the comments
are overbearing; I'm trying to get over a cold, and I'm very grumpy.
*grin*]


Some comments on the code follow.  I'll be focusing on:

> http://www.maildance.com/python/doorman/py_daemon.py

One of the import statements:

###
from signal import *
###

may not be safe.  According to:

http://docs.python.org/tut/node8.html#SECTION00841

"""Note that in general the practice of importing * from a module or
package is frowned upon, since it often causes poorly readable code.
However, it is okay to use it to save typing in interactive sessions, and
certain modules are designed to export only names that follow certain
patterns."""




There's a block of code in Server.run() that looks problematic:

###
while 1:
    if log.level & log.calls:
        log.it("fd%d:py_daemon.py: Waiting ...", self.descriptor)
    try:
        client_socket, client_addr = self.server_socket.accept()
    except socket.error, msg:
        time.sleep(.5)
        continue
    except (EOFError, KeyboardInterrupt):
        self.close_up()
    Spawn(client_socket).start()
###

The potentially buggy line is the last one:

Spawn(client_socket).start()


The problem is that, as part of program flow, it appears to run after the
try block.  But in one particular case of program flow, 'client_socket'
will not be set to a valid value.  It is better to put that statement a
few lines up, right where 'client_socket' is initialized.  Like this:

###
try:
    client_socket, client_addr = self.server_socket.accept()
    Spawn(client_socket).start()
except socket.error, msg:
    time.sleep(.5)
except (EOFError, KeyboardInterrupt):
    self.close_up()
###

Not only does this make it more clear where 'client_socket' is being used,
but it ends up making the code shorter.  I've dropped the 'continue'
statement, as it becomes superfluous when the Spawn() moves into the try's
body.
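
To make that program-flow problem concrete, here is a stripped-down
sketch.  Everything in it is a stand-in: accept_or_fail() plays the role
of accept(), and 'pass' plays the role of self.close_up().

```python
def accept_or_fail(should_fail):
    """Stand-in for server_socket.accept()."""
    if should_fail:
        raise EOFError("simulated failure")
    return "socket-object"


def handle_badly(should_fail):
    try:
        client = accept_or_fail(should_fail)
    except EOFError:
        pass                # stand-in for self.close_up()
    # This line runs even when the 'try' body failed.
    return client


try:
    handle_badly(True)
    outcome = "no error"
except NameError as e:      # UnboundLocalError is a kind of NameError
    outcome = type(e).__name__
```

Inside a 'while' loop the situation is a little different: after the
first successful pass, 'client' would still hold the socket from the
previous iteration, and that stale value is what would get used.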

In fact, the placement of that statement may account partially for the
error message that you were running into earlier.  The earlier error
message:

###
close failed: [Errno 9] Bad file descriptor
###

can easily occur if there's an EOFError or KeyboardInterrupt under the
original code.  And although KeyboardInterrupt might be unusual, I'm not
so sure if EOFError is.



Furthermore, the 'except' blocks can emit more debugging information.
You mentioned earlier that you're not getting good error tracebacks when
exceptions occur.  Let's fix that.

Use the 'traceback' module here for all except blocks in your program.
This will ensure that you get a good stack trace that we can inspect.

Here's what the code block looks like, with those adjustments:

###
try:
    client_socket, client_addr = self.server_socket.accept()
    Spawn(client_socket).start()
except socket.error, msg:
    time.sleep(.5)
    log.it(traceback.format_exc())
except (EOFError, KeyboardInterrupt):
    self.close_up()
    log.it(traceback.format_exc())
###

I'm using Python 2.4's format_exc() function to get the stack trace as a
string.  If this version is not available to you, you can use this
workaround format_exc():

###
import StringIO
import traceback

def format_exc():
    """Returns the exception as a string."""
    f = StringIO.StringIO()
    traceback.print_exc(file = f)
    return f.getvalue()
###

Once you've made the correction and augmented all the except blocks with
traceback logging, try running your program again.  We should then be
better able to trace down the root cause of those errors.
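
As a small illustration of what ends up in the log, here's a sketch with
a deliberately broken function.  The names risky() and 'report' are made
up for the example.

```python
import traceback


def risky():
    return 1 // 0


try:
    risky()
    report = None
except ZeroDivisionError:
    # The full stack trace, as one string, ready to hand to a logger.
    report = traceback.format_exc()
```

The string in 'report' names both the exception type and the function
where it was raised, which is exactly what we need for tracing.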


Let me just look through a little bit more, just to see what else we can
pick out.

From what I read of the code so far, I believe that a lot of the locking
that the code is doing is also superfluous.  You do not need to lock local
variables, nor do you have to usually lock things during object
initialization.  For example, FileReader.__init__()  has the following
code:

###
class FileReader(TokenReader):
    def __init__(self, file_socket):
        self.local_name = file_socket.local_name
        self.freader_name = '/tmp/fsf%s' % self.local_name
        file_lock.acquire()
        self.freader = open(self.freader_name, "w+")
        file_lock.release()
###

The locks around the open() calls are unnecessary: you do not need to
synchronize file opening here, as there's no way for another thread to get
into the same initializer of the same instance.

In fact, as far as I can tell, none of the Spawn() threads are
communicating with each other.  As long as your threads are working
independently of each other --- and as long as they are not writing to
global variables --- you do not need locks.

In summary, a lot of that locking code isn't doing anything, and may
actually be a source of problems if we're not careful.

Re: [Tutor] sockets, files, threads

2005-01-18 Thread Danny Yoo

On Tue, 18 Jan 2005, Danny Yoo wrote:

> In fact, as far as I can tell, none of the Spawn() threads are
> communicating with each other.  As long as your threads are working
> independently of each other --- and as long as they are not writing to
> global variables --- you do not need locks.
>
> In summary, a lot of that locking code isn't doing anything, and may
> actually be a source of problems if we're not careful.

Hi Marilyn,

I thought about this a little bit more: there is one place where you do
need locks.  Locking needs to be done around Spawn's setting of its 'no'
class attribute:

###
class Spawn(threading.Thread):
    no = 1
    def __init__(self, client_socket):
        Spawn.no = Spawn.no + 2 % 3
        ## some code later...
        self.local_name = str(Spawn.no)
###

The problem is that it's possible that two Spawn threads will be created
with the same local_name, because they are both using a shared resource:
the class attribute 'no'.  Two thread executions can interleave.  Let's
draw it out.

Imagine we bring two threads t1 and t2 up, and that Spawn.no is set to 1.
Now, as both are initializing, imagine that t1 runs its initializer

### t1 ###
def __init__(self, client_socket):
    Spawn.no = Spawn.no + 2 % 3
##

and then passes up control.

At this point, Spawn.no = 3.  Imagine that t2 now starts up:

### t2 ###
def __init__(self, client_socket):
    Spawn.no = Spawn.no + 2 % 3
    ### some code later
    self.local_name = str(Spawn.no)
##

Now Spawn.no = 5, and that's what t2 uses to initialize itself.  When t1
starts off where it left off,

### t1 ###
    ### some code later
    self.local_name = str(Spawn.no)
##

it too uses Spawn.no, which is still set to 5.

That's the scenario we need to prevent.  Spawn.no is a shared resource: it
can have only one value, and unfortunately, both threads can use the same
value for their own local_names.  This is a major bug, because much of
your code assumes that the local_name attribute is unique.

This is a situation where synchronization locks are appropriate and
necessary.  We'll want to use locks around the whole access to Spawn.no.
Something like:

###
class Spawn(threading.Thread):
    no = 1
    no_lock = threading.Lock()
    def __init__(self, client_socket):
        Spawn.no_lock.acquire()
        try:
            Spawn.no = Spawn.no + 2 % 3
            ## ... rest of initializer body is here
        finally:
            Spawn.no_lock.release()
###

should do the trick.  I'm being a little paranoid by using the try/finally
block, but a little paranoia might be justified here.  *grin*
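
For what it's worth, here is a minimal, self-contained sketch of the same
idea in more recent Python, where the 'with' statement takes the place of
the explicit try/finally.  The NameAllocator class and the collect_names()
helper are hypothetical names made up for this illustration; they are not
part of the program under discussion.

```python
import threading


class NameAllocator:
    """Hands out unique names (hypothetical stand-in for Spawn.no)."""
    _no = 1
    _no_lock = threading.Lock()

    @classmethod
    def next_name(cls):
        # The increment and the read happen under one lock acquisition,
        # so no other thread can slip in between them.
        with cls._no_lock:
            cls._no += 2
            return str(cls._no)


def collect_names(thread_count=8, per_thread=1000):
    """Ask for names from several threads at once and gather them all."""
    names = []
    names_lock = threading.Lock()

    def worker():
        local = [NameAllocator.next_name() for _ in range(per_thread)]
        with names_lock:
            names.extend(local)

    threads = [threading.Thread(target=worker) for _ in range(thread_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return names
```

If the locking is right, every name that comes back from collect_names()
should be distinct, no matter how the threads interleave.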

We need to be careful of how locks work.  The following code is broken:

###
class Spawn(threading.Thread):
    no = 1
    no_lock = threading.Lock()
    def __init__(self, client_socket):    ## buggy
        Spawn.no_lock.acquire()
        Spawn.no = Spawn.no + 2 % 3
        Spawn.no_lock.release()

        self.descriptor = client_socket.fileno()
        self.client_socket = client_socket

        Spawn.no_lock.acquire()
        self.local_name = str(Spawn.no)
        Spawn.no_lock.release()
        ### ...
###

This code suffers from the same bug as the original code: two threads can
come in, interleave, and see the same Spawn.no.  It's not enough to put
acquire() and release() calls everywhere: we do have to use them
carefully or else lose the benefit that they might provide.

Anyway, I hope this helps!

___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor

Re: [Tutor] sockets, files, threads

2005-01-18 Thread Danny Yoo

On Tue, 18 Jan 2005, Marilyn Davis wrote:

> while 1:
>     if log.level & log.calls:
>         log.it("fd%d:py_daemon.py: Waiting ...", self.descriptor)
>     try:
>         client_socket, client_addr = self.server_socket.accept()
>     except (EOFError, KeyboardInterrupt):
>         self.close_up()
>     Spawn(client_socket).start()
>
> > The problem is that, as part of program flow, it appears to run after
> > the try block.  But in one particular case of program flow,
> > 'client_socket' will not be set to a valid value.  It is better to put
> > that statement a
>
> I don't understand.  Which particular case of program flow will
> 'client_socket' be invalid and yet make it through the socket.accept
> call?

Hi Marilyn,

If there is an EOFError or a KeyboardInterrupt, client_socket will
maintain the same value as the previous iteration through the whole loop.
What this potentially means is that, under strange situations, Spawn()
could be called on the same client_socket twice.

The issue is that the whole thing's in a 'while' loop, so we have to be
careful that the state from the previous loop iteration doesn't leak into
the current iteration.
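
One way to keep stale state out of the loop is to move the Spawn() call
into the 'else' branch of the try, so it only runs when accept() succeeded
on this iteration.  Here is a rough sketch in current Python syntax.  The
accept_loop() function, the fake_accept() stand-in, and the 'break' after
closing up are all assumptions made for the sake of a runnable example,
not a description of the real daemon:

```python
def accept_loop(accept, spawn, close_up):
    """Accept clients until accept() raises EOFError or KeyboardInterrupt."""
    while True:
        try:
            client_socket, client_addr = accept()
        except (EOFError, KeyboardInterrupt):
            close_up()
            break  # assumption: stop serving once we have closed up
        else:
            # Only reached when accept() succeeded on *this* iteration,
            # so a stale client_socket can never be spawned twice.
            spawn(client_socket)


# Tiny stand-ins so the sketch can be run without real sockets.
pending = [("sock-1", "addr-1"), ("sock-2", "addr-2")]
spawned = []
closed = []


def fake_accept():
    if not pending:
        raise EOFError
    return pending.pop(0)


accept_loop(fake_accept, spawned.append, lambda: closed.append(True))
```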

[About using the Standard Library]

> And since then, please don't shoot me, but I don't immediately trust the
> modules.  I read them and see how many times they loop through the data,
> and how many copies of the data they put into memory -- and usually
> decide to write the simple things I need myself, looping zero times and
> keeping only one block in memory.

Hmm...  Do you remember which Standard Library modules you were looking at
earlier?  Perhaps there was some funky stuff happening, in which case we
should try to fix it, so that no one else runs into the same problems.

> > ###
> > class FileReader(TokenReader):
> >     def __init__(self, file_socket):
> >         self.local_name = file_socket.local_name
> >         self.freader_name = '/tmp/fsf%s' % self.local_name
> >         file_lock.acquire()
> >         self.freader = open(self.freader_name, "w+")
> >         file_lock.release()
> > ###
> >
> > The locks around the open() calls are unnecessary: you do not need to
> > synchronize file opening here, as there's no way for another thread to
> > get into the same initializer of the same instance.
>
> But I think that I'm only wrapping the calls that create file
> descriptors, because those get trampled on.

I'm almost positive that the builtin open() function is thread-safe. Let
me check that really fast...

/*** Within Objects/fileobject.c: open_the_file() ***/
    if (NULL == f->f_fp && NULL != name) {
        Py_BEGIN_ALLOW_THREADS
        f->f_fp = fopen(name, mode);
        Py_END_ALLOW_THREADS
    }
/**/

Hmmm!  It really depends on whether the fopen() in the underlying Standard
C library is thread-safe.  This is true in modern versions of libc, so I
don't think there's anything to worry about here, unless you're using a
very ancient version of Unix.

> But, I did take out threading and the big error went away.  I'm done
> with threading, unless I see a big need one day.  I don't know what I'll
> tell students from here on.

I'd point them to John Ousterhout's article on "Why Threads are a Bad Idea
(For Most Purposes)":

http://home.pacbell.net/ouster/threads.pdf

> > I thought about this a little bit more: there is one place where you
> > do need locks.

[text cut]

> But, notice that the call to:
>
> threading.Thread.__init__(self, name = self.local_name)
>
> came after, so the Spawn.no manipulation always happens in the main
> thread.

You're right! I must have been completely delirious at that point.
*grin*

> One problem remains after removing the threading stuff.  I still get
> those pesky:
>
> close failed: [Errno 9] Bad file descriptor
>
> even though my logged openings and closings match up one-to-one.

Ok, then that's a good thing to know: since threading is off, we now know
that that error message has nothing to do with threads.

> Now then, I haven't added that line of code to each except clause yet
> because 1) it doesn't seem to cause any problem 2) it's a bunch of
> busy-work and 3) I'm hoping that, when your illness clears, you'll think
> of something less brute-force for me to do.

I'd recommend using traceback.format_exc().  It's infuriating to get an
error message like that, and not to know where in the world it's coming
from.

Python's default behavior for exceptions is to print out a good stack
trace: one of the best things about Python's default exception handler is
that it gives us a local picture of the error.  For the most part, we
know around what line number we should be looking at.

When we write our own except handlers, the responsibility falls on us to
record good error messages.  If anything, our own debugging systems should
be even better than Python's.
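
As a small illustration of what that looks like in practice, here is a
sketch (in current Python syntax) of an except block that records the full
traceback instead of swallowing it.  The close_quietly() and bad_close()
functions and the plain list used as a log are hypothetical; in a real
program the text would go to whatever logging facility is already in use.

```python
import traceback


def close_quietly(close, log):
    """Call close(); on failure, record the full traceback in log."""
    try:
        close()
    except OSError:
        # format_exc() returns the same text Python would have printed,
        # including the file names and line numbers of each frame.
        log.append(traceback.format_exc())


def bad_close():
    # Simulates the kind of failure being chased in this thread.
    raise OSError(9, "Bad file descriptor")


log = []
close_quietly(bad_close, log)
```

The logged string names bad_close() as the place where the error was
raised, which is exactly the information a bare error message lacks.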

So ma

Re: [Tutor] A somewhat easier way to parse XML

2005-01-19 Thread Danny Yoo

On Wed, 19 Jan 2005, Max Noel wrote:

>   I've just spent the last few hours learning how to use the DOM XML
> API (to be more precise, the one that's in PyXML), instead of revising
> for my exams :p. My conclusion so far: it sucks (and so does SAX because
> I can't see a way to use it for OOP or "recursive" XML trees).

Hi Max,

You are not alone in this restless feeling.

In fact, Uche Ogbuji, one of the lead developers of 4Suite and Amara
(which Kent mentioned earlier), just wrote a blog entry about his
malcontent with the DOM.  Here, these may interest you:

http://www.oreillynet.com/pub/wlg/6224
http://www.oreillynet.com/pub/wlg/6225

> In fact, I find it appalling that none of the "standard" XML parsers
> (DOM, SAX) provides an easy way to do that (yeah, I know that's what
> more or less what the shelve module does, but I want a
> language-independent way).

For simple applications, the 'xmlrpclib' module has two functions (dumps()
and loads()) that we can use:

http://www.python.org/doc/lib/node541.html

For example:

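###
>>> import xmlrpclib
>>> s = xmlrpclib.dumps(({'hello': 'world'},))
>>> print s
<params>
<param>
<value><struct>
<member>
<name>hello</name>
<value><string>world</string></value>
</member>
</struct></value>
</param>
</params>

>>>
>>>
>>> xmlrpclib.loads(s)
(({'hello': 'world'},), None)
###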

A little bit silly, but it does work.  The nice thing about this is that
xmlrpc is pretty much a platform-independent standard, so if we're dealing
with simple values like strings, integers, lists, and dictionaries, we're
all set.  It is a bit verbose, though.
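
In Python 3 the same module lives under the name 'xmlrpc.client'.  Here is
a minimal round-trip sketch; the sample payload is made up for
illustration:

```python
import xmlrpc.client

# dumps() expects a tuple of parameters, so the dict is wrapped in one.
payload = ({'hello': 'world', 'numbers': [1, 2, 3]},)
xml_text = xmlrpc.client.dumps(payload)

# loads() gives back (params, method_name); method_name is None here
# because we serialized plain parameters, not a method call.
params, method_name = xmlrpc.client.loads(xml_text)
```

After the round trip, 'params' compares equal to the original payload.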

Amara looks really interesting, especially because they have tools for
doing data-binding in a Python-friendly way.

Best of wishes to you!


Re: [Tutor] Popen and sending output to a file

2005-01-19 Thread Danny Yoo



On Wed, 19 Jan 2005, Ertl, John wrote:

> I am using the subprocess.Popen from 2.4.  I can launch a job from
> python and the output from the job goes to the screen but now I would
> like to have the output go to a file.  I could do the crude
>
> subprocess.Popen("dtg | cat > job.out", shell=True)
>
> But I would think there is a better way built into Popen but I could not
> figure out the documentation.

Hi John,


According to:

http://www.python.org/doc/lib/node227.html

we can redirect standard input, output, and error by calling Popen with
the 'stdin', 'stdout', or 'stderr' keyword arguments.

We should be able to do something like:

###
job_output = open("job.out", "w")
subprocess.Popen("dtg", shell=True, stdout=job_output)
###
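
A slightly more defensive sketch of the same idea, in current Python
syntax: it closes the output file when the job is done and avoids the
shell by passing the command as a list.  The run_to_file() helper is a
made-up name, and a child Python process stands in for 'dtg', which I
don't have here:

```python
import os
import subprocess
import sys
import tempfile


def run_to_file(command, path):
    """Run command (a list of arguments), sending its stdout to path."""
    with open(path, "w") as job_output:
        # call() waits for the child to finish and returns its exit code.
        return subprocess.call(command, stdout=job_output)


# Stand-in for the real job: a child Python process that prints one line.
out_path = os.path.join(tempfile.mkdtemp(), "job.out")
exit_code = run_to_file(
    [sys.executable, "-c", "print('job finished')"], out_path)
with open(out_path) as f:
    job_text = f.read()
```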


Best of wishes to you!


Re: [Tutor] sockets, files, threads

2005-01-19 Thread Danny Yoo

On Wed, 19 Jan 2005, Marilyn Davis wrote:

> class Exim:
>     def __init__(self):
>         self.fdin = None
>         self.err_flush = []
>         self.stdout, self.stdin, self.stderr = popen2.popen3('%s -t' % MAILER)
>         self.fdin = self.stdin.fileno()
>         self.fdout = self.stdout.fileno()
>         self.fderr = self.stderr.fileno()

Hi Marilyn,

You probably don't need to explicitly use the file descriptors here. I see
that you're using them because of the use of select() later on:

###
sel = select.select(self.outlist, [], [], 1)
###

but, although select.select() can use file descriptor numbers, it can also
take file-like objects.  See:

http://www.python.org/doc/lib/module-select.html

So you can simplify this code to:

###
sel = select.select([self.stdout, self.stderr], [], [], 1)
###

The upside to this is that you don't need to use the low-level os.read()
or os.close() functions at all.
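
Here is a small sketch of select() working directly on file objects, in
current Python syntax with the 'subprocess' module standing in for popen2.
It assumes a Unix-like system (select() on pipes does not work on
Windows), and the child command is just a placeholder:

```python
import select
import subprocess
import sys

child = subprocess.Popen(
    [sys.executable, "-c", "print('ready')"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)

# Pass the file objects themselves; select() calls fileno() on them for us.
readable, _, _ = select.select([child.stdout, child.stderr], [], [], 10)

# Let the high-level API read the data and close the pipes.
stdout_data, stderr_data = child.communicate()
```

No raw descriptor numbers are stored anywhere, so there is nothing to
close twice by accident.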

I suspect that some silliness with file descriptors is causing your bug,
as the following:

###
>>> os.close(42)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OSError: [Errno 9] Bad file descriptor
###

shows that if we feed os.close() a nonsense file descriptor, it'll throw
out an error message that you may be familiar with.

The error can happen if we close the same file descriptor twice:

###
>>> myin, myout, myerr = popen2.popen3('cat')
>>> os.close(myin.fileno())
>>> os.close(myin.fileno())
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OSError: [Errno 9] Bad file descriptor
###

Here is another particular case that might really be relevant:

###
>>> myin, myout, myerr = popen2.popen3('cat')
>>> os.close(myin.fileno())
>>> myin.close()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
IOError: [Errno 9] Bad file descriptor
###

We're getting an IOError here for exactly the same reasons: the high-level
'myin' object has no knowledge that its underlying file descriptor was
closed.

This is exactly why we've been saying not to mix file-descriptor stuff
with high-level file-like object manipulation: it breaks the assumptions
that the API expects to see.
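
The two failure modes can be reproduced without any child process at all.
The following is a sketch in current Python syntax using os.pipe(); note
that in Python 3 both cases surface as OSError (IOError is now just an
alias for it).  The variable names are made up for the example:

```python
import errno
import os

# Case 1: closing the same descriptor number twice.
read_fd, write_fd = os.pipe()
os.close(read_fd)
try:
    os.close(read_fd)
    double_close_errno = None
except OSError as exc:
    double_close_errno = exc.errno

# Case 2: closing the descriptor behind a file object's back.
writer = os.fdopen(write_fd, "w")
os.close(write_fd)
try:
    writer.close()  # the file object still thinks it owns write_fd
    mixed_close_errno = None
except OSError as exc:
    mixed_close_errno = exc.errno
```

Both error numbers come out as errno.EBADF, which is the "[Errno 9] Bad
file descriptor" message from the logs.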

> However, all that said, I do dimly remember that poplib perhaps had some
> extra processing that maybe is not needed -- but I could be wrong.

Ok; I'll try to remember to look into poplib later and see if there's
anything silly in there.

> if __name__ == '__main__':
> msg = '''To: %s
>
>  xx''' % TO_WHOM
>
>  p = Exim()
>  p.write(msg)
>  del p

Instead of using __del__ implicitly, drop the __del__ method and try
calling Exim.close_up() explicitly:

###
p = Exim()
p.write(msg)
p.close_up()
###

> Are you well yet?  A lot of sick people around these days!

Feeling better.


Re: [Tutor] Ooer, OT Lisp

2005-01-21 Thread Danny Yoo

On Thu, 20 Jan 2005, Bill Mill wrote:

> There is no "standard" implementation of lisp, so sockets and os access
> all vary by implementation. Furthermore, the docs are sketchy and hard
> to read with all of the lisps I've tried.

Hi Liam,

Scheme is a recent dialect of Lisp that seems to be well-regarded.
DrScheme is one of the very active implementations of Scheme:

http://www.drscheme.org/

and has a comprehensive set of documentation:

http://download.plt-scheme.org/doc/

> > 5) Are you able to point me towards a simplified explanation of how
> > the 'syntaxless' language can write programmes?

Brian mentioned one of my favorite books: "Structure and Interpretation of
Computer Programs":

http://mitpress.mit.edu/sicp/

If you want to take a crash course in how Lisp-languages work, I can't
think of a faster way than to look at the first few pages of it.

http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-10.html

pretty much shows the core of Lisp programs.  There's even a set of video
lectures from the SICP authors that's freely available:

http://swiss.csail.mit.edu/classes/6.001/abelson-sussman-lectures/

The neat thing about Scheme is that, once you get beyond the syntax, it
starts to feel a bit like Python.  *grin* Perhaps that should be the other
way around.

Best of wishes to you!


Re: [Tutor] Importing multiple files as a single module?

2005-01-21 Thread Danny Yoo



On Fri, 21 Jan 2005, Kent Johnson wrote:

> I think this will work:
> in foo/__init__.py put
> from Bar import Bar
> from Baz import Baz
>
> or whatever variations of this you like.
>
> Any names defined in the package __init__.py are available to other code
> as package attribute.


Hi Max,

For more information on packages, we can take a look at:

http://www.python.org/doc/tut/node8.html#SECTION00840


Best of wishes!


Re: [Tutor] Combination

2005-01-21 Thread Danny Yoo

On Fri, 21 Jan 2005, Guillermo Fernandez Castellanos wrote:

> I'm trying to take a list and find all the unique combinations of that
> list.
>
> I mean:
> if I enter (1,2,3,4,5) and I watn combinations of 3, I want to find:
> (1,2,3) but then not (2,1,3), (3,1,2),...
> (1,2,4)
> (1,2,5)
> (2,3,5)
> (3,4,5)

Hi Guillermo,

There is a clean recursive way to define this.  I'll try to sketch out how
one can go about deriving the function you're thinking of.

Let's say that we have a list L,

 [1, 2, 3]

and we want to create all combinations of elements in that list.  Let's
call the function that does this "createComb()".  How do we calculate
createComb([1, 2, 3])?

We can just start writing it out, but let's try a slightly different
approach.  Imagine for the moment that we can construct createComb([2,
3]):

createComb([2, 3])   -> [[2, 3], [2], [3]]

We know, just from doing things by hand, that createComb([1, 2, 3]) of the
whole list will look like:

createComb([1, 2, 3]) -> [[1, 2, 3], [1, 2], [1, 3], [1],
                          [2, 3], [2], [3]]

If we compare createComb([1, 2, 3]) and createComb([2, 3]), we might be
able to see that they're actually very similar to each other.

Does this make sense so far?  Please feel free to ask questions about
this.

Good luck!


Re: [Tutor] Combination

2005-01-21 Thread Danny Yoo

On Fri, 21 Jan 2005, Danny Yoo wrote:

> > I mean:
> > if I enter (1,2,3,4,5) and I watn combinations of 3, I want to find:
> > (1,2,3) but then not (2,1,3), (3,1,2),...
> > (1,2,4)
> > (1,2,5)
> > (2,3,5)
> > (3,4,5)
>
>
> There is a clean recursive way to define this.

Hi Guillermo,

Gaaa; I screwed up slightly.  *grin*

I just wanted to clarify: the function that I'm sketching out is not
exactly the function that you're thinking of.  I'm doing more of a "give
me all the subsets of L" kind of thing.

But if you can understand how to get all the subsets, you should be able
to figure out how to get all the combinations of 'k' elements, because
they're really similar problems.
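
For anyone who wants to check their own attempt, here is one possible
sketch of both functions in current Python syntax.  It is only one way to
write them; the names create_comb() and combinations_of() are mine, and
the ordering of the results is simply whatever falls out of the recursion.

```python
def create_comb(items):
    """Return every non-empty subset of items, as a list of lists."""
    if not items:
        return []
    first, rest = items[0], items[1:]
    smaller = create_comb(rest)
    # Every subset of the rest, with 'first' added, plus [first] alone...
    with_first = [[first] + subset for subset in smaller] + [[first]]
    # ...followed by every subset that leaves 'first' out.
    return with_first + smaller


def combinations_of(items, k):
    """Return every way of choosing k elements from items, in order."""
    if k == 0:
        return [[]]
    if len(items) < k:
        return []
    first, rest = items[0], items[1:]
    # Either 'first' is chosen (pick k-1 more from the rest) or it is not.
    chosen = [[first] + combo for combo in combinations_of(rest, k - 1)]
    return chosen + combinations_of(rest, k)
```

The two functions share the same "with the first element / without the
first element" split; the only difference is that combinations_of() keeps
count of how many elements are still to be picked.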


Re: [Tutor] read line x from a file

2005-01-22 Thread Danny Yoo



On Sat, 22 Jan 2005, Kent Johnson wrote:

> Jay Loden wrote:
> > One simple solution is to do:
> >
> > fle = open(file)
> > contents = file.readlines()
> > file.close()
> > print contents[x]  #or store this in a variable, whatever
>
> That is the simplest solution. If your file gets bigger and you don't
> want to read it all at once, you can use enumerate to iterate the lines
> and pick out the one you want:
>
> f = open(...)
> for i, line in enumerate(f):
>if i==targetLine:
>  print line # or whatever
>  break
> f.close()


Hi everyone,

Here's a cute variation for folks who feel comfortable about iterators:

###
import itertools
f = open(...)
interestingLine = itertools.islice(f, targetLine, None).next()
f.close()
###


This uses the remarkable 'itertools' library to abstract away the explicit
loop logic:

http://www.python.org/doc/lib/itertools-functions.html


For example:

###
>>> import itertools
>>> f = open('/usr/share/dict/words')
>>> print repr(itertools.islice(f, 31415, None).next())
'cataplasm\n'
>>> f.close()
###

This shows us that the 31416th word in /usr/share/dict/words is 'cataplasm',
which we can double-check from the Unix shell here:

###
[EMAIL PROTECTED] dyoo]$ head -n 31416 /usr/share/dict/words | tail -n 1
cataplasm
###
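
In Python 3 the .next() method is gone and the built-in next() function
takes its place.  Here is a small sketch of the same trick; read_line_at()
is a made-up helper name, and an in-memory io.StringIO stands in for a
real file so that the example is self-contained:

```python
import io
import itertools


def read_line_at(lines, index):
    """Return the line at zero-based index, or None if there are too few."""
    # islice skips the first 'index' lines lazily; next() takes one line,
    # and the second argument is the default when the iterator runs out.
    return next(itertools.islice(lines, index, None), None)


sample = io.StringIO("alpha\nbeta\ngamma\n")
third = read_line_at(sample, 2)
```

Supplying a default to next() avoids a StopIteration when the file is
shorter than expected.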



Best of wishes!


Re: [Tutor] Python Scripting

2005-01-22 Thread Danny Yoo

On Sat, 22 Jan 2005, Ali Polatel wrote:

>I want to ask you something that I am really curious about.
>Can I design web-pages with python or use py files to write html?

Hi Ali,

Almost every programming language allows us to write strings into files,
so from an academic standpoint, 'yes'.  *grin*

From a practical one, it also looks like 'yes', because there are quite a
few good libraries to do this kind of HTML writing.  Python.org has a
catalog of these "HTML Preprocessors" here:

http://www.python.org/moin/WebProgramming

>if the answer is yes and if I upload some scripts to the web-site with* .py
>does someone have to ahve python interepreter in his computer to be
>able to view those pages?

Since the WWW provides communication between clients and servers,
programmers have developed two main strategies for playing with the WWW.

o.  Client-side programming: this means that the server will send over
Python programs; clients then are responsible for running those programs.
This means that the clients have to have Python installed.  This also
means that the server doesn't necessarily have to have Python installed.
Less work on the server, and more work on the clients.

o.  Server-side programming: this means that the server will run
Python programs on the server side, and send the program's printed output
to the client.  This means that the clients don't need to have Python
installed.  More work on the server, and less work on the clients.

A lot of systems these days opt for server-side programming because it's
more transparent and easier for clients to use.  CGI programming
is server-side.  In fact, as long as the server can run Python, the
clients don't even have to know that it's Python that's generating a web
page.  As a concrete example: people who visit a url like:

http://maps.yahoo.com/

may not even realize that (at least as of April 2001) the maps.yahoo.com
web site was being run on Python programs.
(http://mail.python.org/pipermail/tutor/2001-April/004761.html)

> And is it possible to write a programme with python which will update a
> webpage by the commands it receives from a server ( I mean I am writing
> a bot for a chess server and I want it to be able to update a web-page
> about the last results of games etc.)

Yes.  In fact, you probably don't need to worry about CGI at all for your
particular project.  You can just write a program that writes out a static
HTML page.  If you run that program every so often, the generated page
will appear to be updating periodically.

For example, something like:

###
import datetime
f = open("last_time.html", "w")
f.write("""
The last time the program was called was at %s
""" % datetime.datetime.today())
f.close()
###

will write out a web page 'last_time.html' whose content should change
every time we run the program.  So what you know already should be enough
to write a program to track server statistics: your chess bot can
periodically write updates as an HTML file.
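
To make that a bit more concrete for the chess bot, here is a rough sketch
in current Python syntax.  The function names, the page layout, and the
idea that results arrive as a list of strings are all assumptions made for
illustration; the real bot will have its own data.

```python
import datetime
import html


def render_results_page(results, generated_at):
    """Build a small static HTML page listing the latest game results."""
    # html.escape() keeps characters like '<' in a result from being
    # interpreted as markup by the browser.
    rows = "\n".join("<li>%s</li>" % html.escape(r) for r in results)
    return (
        "<html><body>\n"
        "<p>Last updated: %s</p>\n"
        "<ul>\n%s\n</ul>\n"
        "</body></html>\n" % (generated_at.isoformat(), rows)
    )


def write_results_page(path, results):
    """Overwrite the page at path; call this whenever results change."""
    with open(path, "w") as f:
        f.write(render_results_page(results, datetime.datetime.now()))
```

The bot would call write_results_page() after each game, and the web
server just serves the resulting file like any other static page.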

If you want a program to run every time a person visits a web page --- to
generate a page on demand --- then that dynamic approach requires a little
more work, and that's where web programming systems like CGI comes into
play. The topic guide here:

http://www.python.org/topics/web/

should have some tutorials that you can look at.

If you have more questions, please feel free to ask!


Re: [Tutor] Changing (Unix) environment for python shell/popen() commands

2005-01-23 Thread Danny Yoo

On Sun, 23 Jan 2005, Scott W wrote:

> I've got to shell out from my python code to execute a command, but
> _must_ set the environment at the same time (or prior to execution).
>
> I saw some comments about setting os.environ[], but
> didn't seem to be seeing this work in subsequent calls using popen2().
>
> Does anyone have a working piece of code setting some env variable
> successfully prior to(or along with) calling popen2() (or friends).

Hi Scott,

Hmm... That should have worked.  We should be able to assign to the
os.environ dictionary: as I understand it, a child process can inherit its
parent's environment.  Let me check this:

###
volado:~ dyoo$ cat print_name_var.py
import os
print "I see", os.environ["NAME"]
volado:~ dyoo$
volado:~ dyoo$
volado:~ dyoo$
volado:~ dyoo$ python print_name_var.py
I see
Traceback (most recent call last):
  File "print_name_var.py", line 2, in ?
    print "I see", os.environ["NAME"]
  File "/sw/lib/python2.3/UserDict.py", line 19, in __getitem__
    def __getitem__(self, key): return self.data[key]
KeyError: 'NAME'
###

Ok, that's expected so far.

Let me see what happens if we try to set to os.environ.

###
volado:~ dyoo$ python
Python 2.3.3 (#1, Aug 26 2004, 23:05:50)
[GCC 3.3 20030304 (Apple Computer, Inc. build 1495)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.environ['name'] = 'Scott'
>>> os.popen('python print_name_var.py').read()
Traceback (most recent call last):
  File "print_name_var.py", line 2, in ?
    print "I see", os.environ["NAME"]
  File "/sw/lib/python2.3/UserDict.py", line 19, in __getitem__
    def __getitem__(self, key): return self.data[key]
KeyError: 'NAME'
'I see\n'
###

One thing we have to be careful of is case-sensitivity: it does matter in
the case of environment variables.

###
>>> os.environ['NAME'] = 'Scott'
>>> os.popen('python print_name_var.py').read()
'I see Scott\n'
###

And now it works.
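
On newer Pythons that have the 'subprocess' module (so not 1.5.2, I'm
afraid), we can also hand the child its own environment without touching
os.environ in the parent.  A minimal sketch; the child here is just a
one-line Python program standing in for the real command:

```python
import os
import subprocess
import sys

child_code = "import os; print('I see', os.environ['NAME'])"

# Copy the parent's environment and add NAME for the child only.
child_env = dict(os.environ, NAME="Scott")

output = subprocess.check_output(
    [sys.executable, "-c", child_code],
    env=child_env,
    universal_newlines=True,  # give us str rather than bytes
)
```

The parent's own os.environ is left unchanged by this approach.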

> Also, is there any way to get the called programs return code via
> popen() (again, under python 1.5.2)?

Hi Scott,

Yes; although it's a little hard to find it, the Python 1.5.2 documentation
here:

http://www.python.org/doc/1.5.2p2/

does have some documentation on popen2:

http://www.python.org/doc/1.5.2p2/lib/module-popen2.html

You can use a popen2.Popen3 instance, which should give you access to
program return values.
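
For comparison only, since it won't run under 1.5.2: on Pythons that ship
the 'subprocess' module, getting at the exit status is a one-liner.  The
child below is a placeholder that simply exits with status 3:

```python
import subprocess
import sys

# call() runs the command, waits for it, and returns its exit status.
exit_status = subprocess.call(
    [sys.executable, "-c", "import sys; sys.exit(3)"])
```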

If you have more questions, please feel free to ask.


Re: [Tutor] sorting a 2 gb file

2005-01-25 Thread Danny Yoo

On Tue, 25 Jan 2005, Scott Melnyk wrote:

> I have an file in the form shown at the end (please forgive any
> wrapparounds due to the width of the screen here- the lines starting
> with ENS end with the e-12 or what have you on same line.)
>
> What I would like is to generate an output file of  any other
> ENSE000...e-4 (or whathaveyou) lines that appear in more than one
> place and for each of those the queries they appear related to.

Hi Scott,

One way to do this might be to do it in two passes across the file.

The first pass through the file can identify records that appear more than
once.  The second pass can take that knowledge, and then display those
records.

In pseudocode, this will look something like:

###
hints = identifyDuplicateRecords(filename)
displayDuplicateRecords(filename, hints)
###

> My data set the below is taken from is over 2.4 gb so speed and memory
> considerations come into play.
>
> Are sets more effective than lists for this?

Sets or dictionaries make the act of "lookup" of a key fairly cheap.  In
the two-pass approach, the first pass can use a dictionary to accumulate
the number of times a certain record's key has occurred.

Note that, because your file is so large, the dictionary probably
shouldn't accumulate the whole mass of information that we've seen so
far: instead, it's sufficient to record the information we need to
recognize a duplicate.
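
As a rough sketch of what the counting pass might look like (current
Python syntax): I don't know the exact record layout, so extract_key()
below simply assumes that the key is the first whitespace-separated field
on each line.  That assumption, and all of the function names, are mine.

```python
def extract_key(record):
    """Hypothetical: treat the first whitespace-separated field as the key."""
    fields = record.split()
    return fields[0] if fields else ""


def count_keys(lines):
    """First pass: count how often each key occurs, storing only the keys."""
    counts = {}
    for record in lines:
        key = extract_key(record)
        if key:  # skip blank lines
            counts[key] = counts.get(key, 0) + 1
    return counts


def duplicate_keys(counts):
    """Return the set of keys that were seen more than once."""
    return set(key for key, n in counts.items() if n > 1)
```

Because count_keys() iterates over the lines one at a time, it can be
handed an open file object and never needs the whole file in memory.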

If you have more questions, please feel free to ask!


Re: [Tutor] Read file line by line

2005-01-25 Thread Danny Yoo

On Tue, 25 Jan 2005, Gilbert Tsang wrote:

> Hey you Python coders out there:
>
> Being a Python newbie, I have this question while trying to write a
> script to process lines from a text file line-by-line:
>
> #!/usr/bin/python
> fd = open( "test.txt" )
> content = fd.readline()
> while (content != "" ):
>     content.replace( "\n", "" )
>     # process content
>     content = fd.readline()
>
> 1. Why does the assignment-and-test in one line not allowed in Python?
> For example, while ((content = fd.readline()) != ""):

Hi Gilbert, welcome aboard!

Python's design is to make statements like assignment stand out in the
source code.  This is different from Perl, C, and several other languages,
but I think it's the right thing in Python's case.  By making it a
statement, we can visually scan by eye for assignments with ease.

There's nothing that really technically prevents us from doing an
assignment as an expression, but Python's language designer decided that
it encouraged a style of programming that made code harder to maintain.
By making it a statement, it removes the possibility of making a mistake
like:

###
if ((ch = getch()) = 'q') { ... }
###

There are workarounds that try to reintroduce assignment as an expression:

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/202234

but we strongly recommend you don't use them.  *grin*

> 2. I know Perl is different, but there's just no equivalent of while
> ($line = ) { } ?

Python's 'for' loop has built-in knowledge about "iterable" objects, and
that includes files.  Try using:

    for line in file:
        ...

which should do the trick.
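
Here is a slightly fuller sketch in current Python syntax.  One thing to
watch for: strings are immutable, so a call like content.replace("\n", "")
returns a new string rather than changing 'content'; the result has to be
assigned to something.  The process_lines() name is made up, and an
in-memory io.StringIO stands in for "test.txt":

```python
import io


def process_lines(lines):
    """Return each line with its trailing newline removed."""
    processed = []
    for line in lines:
        # rstrip() returns a new string; the original 'line' is untouched.
        processed.append(line.rstrip("\n"))
    return processed


fake_file = io.StringIO("first\nsecond\nthird\n")
result = process_lines(fake_file)
```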

Hope this helps!


Re: [Tutor] Read file line by line

2005-01-25 Thread Danny Yoo



> There's nothing that really technically prevents us from doing an
> assignment as an expression, but Python's language designer decided that
> it encouraged a style of programming that made code harder to maintain.
> By making it a statement, it removes the possiblity of making a mistake
> like:
>
> ###
> if ((ch = getch()) = 'q') { ... }
> ###

hmmm.  This doesn't compile.  Never mind, I screwed up.  *grin*


But the Python FAQ does have an entry about this topic, if you're
interested:

http://python.org/doc/faq/general.html#why-can-t-i-use-an-assignment-in-an-expression


Re: [Tutor] sorting a 2 gb file

2005-01-25 Thread Danny Yoo

On Tue, 25 Jan 2005, Max Noel wrote:

> >> My data set the below is taken from is over 2.4 gb so speed and
> >> memory considerations come into play.
> >>
> >> Are sets more effective than lists for this?
> >
> > Sets or dictionaries make the act of "lookup" of a key fairly cheap.
> > In the two-pass approach, the first pass can use a dictionary to
> > accumulate the number of times a certain record's key has occurred.
> >
> > Note that, because your file is so large, the dictionary probably
> > shouldn't accumulation the whole mass of information that we've seen
> > so far: instead, it's sufficient to record the information we need to
> > recognize a duplicate.
>
>   However, the first pass will consume a lot of memory. Considering
> the worst-case scenario where each record only appears once, you'll find
> yourself with the whole 2GB file loaded into memory.
>   (or do you have a "smarter" way to do this?)

Hi Max,

My assumptions are that each record consists of some identifying string
"key" that's associated to some "value".  How are we deciding that two
records are talking about the same thing?

I'm hoping that the set of unique keys isn't itself very large.  Under
this assumption, we can do something like this:

###
from sets import Set
def firstPass(f):
    """Returns a set of the duplicate keys in f."""
    seenKeys = Set()
    duplicateKeys = Set()
    for record in f:
        key = extractKey(record)
        if key in seenKeys:
            duplicateKeys.add(key)
        else:
            seenKeys.add(key)
    return duplicateKeys
###

where we don't store the whole record into memory, but only the 'key'
portion of the record.

And if the number of unique keys is small enough, this should be fine
enough to recognize duplicate records.  So on the second passthrough, we
can display the duplicate records on-the-fly.  If this assumption is not
true, then we need to do something else.  *grin*
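
For completeness, here is a rough sketch of what the second pass might
look like, in current Python syntax.  It is written as a generator so that
records stream through one at a time.  The key extractor is passed in as
an argument because I don't know the real record format; first_field()
below is only a placeholder for it.

```python
def second_pass(lines, duplicate_keys, extract_key):
    """Yield only the records whose key was flagged as a duplicate."""
    for record in lines:
        if extract_key(record) in duplicate_keys:
            yield record


def first_field(record):
    """Placeholder key extractor: first whitespace-separated field."""
    fields = record.split()
    return fields[0] if fields else ""


records = ["ENSE1 q1\n", "ENSE2 q1\n", "ENSE1 q2\n"]
shown = list(second_pass(records, {"ENSE1"}, first_field))
```

With a real file, 'lines' would be the reopened file object, so the file
is read twice but never held in memory.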

One possibility might be to implement an external sorting mechanism:

http://www.nist.gov/dads/HTML/externalsort.html

But if we're willing to do an external sort, then we're already doing
enough work that we should really consider using a DBMS.  The more
complicated the data management becomes, the more attractive it becomes to
use a real database to handle these data management issues.  We're trying
to solve a problem that is already solved by a real database management
system.

Talk to you later!


Re: [Tutor] Advise...

2005-01-26 Thread Danny Yoo

On Wed, 26 Jan 2005, Bill Kranec wrote:

> There has been alot of talk on this list about using list comprehensions
> lately, and this could be one of those useful places.  While I don't
> have time to experiment with real code, I would suggest changing your
> function to look like:
>
> steps = [ min_x + i*delta_x for i in range(steps) ]
> totalarea = sum([ eval(func_x)*delta_x for x in steps ])
>
> Since list comprehensions are significantly faster than while loops,
> this could be a big speed boost.
>
> There may be a mistake or two in the above code, but hopefully the idea
> will be helpful.

Calling eval() there in the inner loop might be costly, because Python
needs to do extra work to tokenize and parse the string, every time
through the iteration.  We want to reduce the work done in tight inner
loops like that.

We can do some of that work up front by compiling the code.  Here's some
hacky code to do the compilation up front:

###
>>> def makeFunction(expressionString):
...     compiledExp = compile(expressionString, 'makeFunction', 'eval')
...     def f(x):
...         return eval(compiledExp, {}, {'x' : x})
...     return f
...
###

Some of the stuff there is a bit obscure, but the idea is that we get
Python to parse and compile the expression down once.  Later on, we can
evaluate the compiled code, and that should be faster than evaluating a
string.

Once we have this, we can use it like this:

###
>>> myFunction = makeFunction("3*x*x")
>>> myFunction(0)
0
>>> myFunction(1)
3
>>> myFunction(2)
12
>>> myFunction(3)
27
###

So although makeFunction()'s internals are weird, it shouldn't be too hard
to treat it as a black box.  *grin*

Let's see how this performs against that 3x^2 expression we saw before.
The original approach that calls eval() on the string takes time:

###
>>> import time
>>> def timeIt(f, n=1000):
...     start = time.time()
...     for i in xrange(n):
...         f(i)
...     end = time.time()
...     return end - start
...
>>> def myFunctionOriginal(x):
...     return eval("3*x*x")
...
>>> timeIt(myFunctionOriginal)
0.036462068557739258
###

The precompiled expression can work more quickly:

###
>>> timeIt(myFunction)
0.0050611495971679688
###

And we should still get the same results:

###
>>> for i in range(2000):
...     assert myFunction(i) == myFunctionOriginal(i)
...
>>>
###
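
To tie this back to the area calculation, here is a minimal sketch of how
the precompiled expression might slot into the summation.  The names
min_x, delta_x, and steps come from the quoted snippet; max_x and the
total_area() wrapper are my own assumptions about the surrounding
function, and the code is written so it also runs on a current Python.

```python
def make_function(expression_string):
    """Compile an expression in x once and return a callable for it."""
    compiled = compile(expression_string, "<expression>", "eval")

    def f(x):
        # Evaluate the already-compiled code with x as the only local name.
        return eval(compiled, {}, {"x": x})
    return f


def total_area(func_x, min_x, max_x, steps):
    """Approximate the area under func_x with a left Riemann sum."""
    f = make_function(func_x)          # parse and compile only once
    delta_x = (max_x - min_x) / float(steps)
    return sum(f(min_x + i * delta_x) * delta_x for i in range(steps))


area = total_area("3*x*x", 0, 1, 1000)   # the exact integral is 1
```

As with any use of eval(), this should only ever be fed expressions from a
trusted source.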

I hope this helps!


Re: [Tutor] Advise...

2005-01-26 Thread Danny Yoo



> > There has been alot of talk on this list about using list comprehensions
> > lately, and this could be one of those useful places.  While I don't
> > have time to experiment with real code, I would suggest changing your
> > function to look like:
> >
> > steps = [ min_x + i*delta_x for i in range(steps) ]
> > totalarea = sum([ eval(func_x)*delta_x for x in steps ])
>
> Calling eval() there in the inner loop might be costly, because Python
> needs to do extra work to tokenize and parse the string, every time
> through the iteration.  We want to reduce the work done in tight inner
> loops like that.
>
> We can do some of that work up front by compiling the code.  Here's some
> hacky code to do the compilation up front:
>
> ###
> >>> def makeFunction(expressionString):
> ...     compiledExp = compile(expressionString, 'makeFunction', 'eval')
> ...     def f(x):
> ...         return eval(compiledExp, {}, {'x' : x})
> ...     return f
> ...
> ###


Oh!  There's a slightly simpler way to write that makeFunction():

###
>>> def makeFunction(s):
...     return eval("lambda x: " + s)
...
###



It even has a little less overhead than the previous code.


###
>>> timeIt(myFunctionOriginal)
0.035856008529663086
>>>
>>> timeIt(makeFunction("3*x*x"))
0.00087714195251464844
###



Best of wishes!


Re: [Tutor] Unique Items in Lists

2005-01-26 Thread Danny Yoo

On Wed, 26 Jan 2005, Srinivas Iyyer wrote:

> I have a list with 4 columns and column1 elements are unique.  I wanted
> to extract unique elements in column3 and and place the other elements
> of the column along with unique elements in column 4 as a tab delim
> text.
>
> Table:
>
> col1    col2      col3   col4
> A       Apple     5      Chennai
> B       Baby      11     Delhi
> I       Baby*     1      Delhi
> M       Dasheri+  5      Mumbai
> K       Apple     12     Copenhagen

[Meta: we seem to be getting a run of similar questions this week. Scott
Melnyk also asked about grouping similar records together:
http://mail.python.org/pipermail/tutor/2005-January/035185.html.]

Hi Srinivas,

I see that you are trying to group records based on some criterion.  You
may find the problem easier to do if you first do a sort on that criterion
column: that will make related records "clump" together.

For your sample data above, if we sort against the second column, the
records will end up in the following order:

###
A   Apple     5    Chennai
K   Apple     12   Copenhagen
B   Baby      11   Delhi
I   Baby      1    Delhi
M   Dasheri   5    Mumbai
###

In this sorting approach, you can then run through the sorted list in
order.  Since all the related elements should be adjacent, grouping
related lines together should be much easier, and you should be able to
produce the final output:

###
Apple     A,K   5,12   Chennai,Copenhagen
Baby      B,I   1,11   Delhi
Dasheri   M     5      Mumbai
###

without too much trouble.  You can do this problem without dictionaries at
all, although you may find the dictionary approach a little easier to
implement.
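
Here is one way the sort-then-group idea might look in a current Python,
using itertools.groupby() to pick out each run of adjacent records.  The
function names are made up for illustration, and I'm assuming the rows have
already been split into four string columns with the '*' and '+' marks
stripped off.  Values within a group come out in input order.

```python
from itertools import groupby


def unique_in_order(items):
    """Drop repeated items while keeping first-seen order."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


def group_by_column(records, key_index=1):
    """Sort records on one column, then merge each run of equal keys."""
    get_key = lambda record: record[key_index]
    grouped = []
    # sorted() is stable, so rows with equal keys keep their input order.
    for key, run in groupby(sorted(records, key=get_key), key=get_key):
        rows = list(run)
        names = ",".join(row[0] for row in rows)
        counts = ",".join(row[2] for row in rows)
        places = ",".join(unique_in_order(row[3] for row in rows))
        grouped.append("\t".join([key, names, counts, places]))
    return grouped


table = [
    ("A", "Apple", "5", "Chennai"),
    ("B", "Baby", "11", "Delhi"),
    ("I", "Baby", "1", "Delhi"),
    ("M", "Dasheri", "5", "Mumbai"),
    ("K", "Apple", "12", "Copenhagen"),
]
lines = group_by_column(table)
```

groupby() only merges neighbors, which is exactly why the sort has to come
first.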

> A dictionary option does not work

A dictionary approach is also very possible.  The thing you may be stuck
on is trying to make a key associate with multiple values.  Most examples
of dictionaries in tutorials use strings as both the keys and values, but
dictionaries are more versatile: we can also make a dictionary whose
values are lists.

For example, here is a small program that groups words by their first
letters:

###
>>> def groupAlpha(words):
...     groups = {}
...     for w in words:
...         firstLetter = w[0]
...         if firstLetter not in groups:
...             groups[firstLetter] = []
...         groups[firstLetter].append(w)
...     return groups
...
>>> groupAlpha("this is a test of the emergency broadcast system".split())
{'a': ['a'],
 'b': ['broadcast'],
 'e': ['emergency'],
 'i': ['is'],
 'o': ['of'],
 's': ['system'],
 't': ['this', 'test', 'the']}
###

If you have more questions, please feel free to ask.


Re: [Tutor] Convert string to variable name

2005-01-26 Thread Danny Yoo

On Wed, 26 Jan 2005, Tony Giunta wrote:

> This is something I've been trying to figure out for some time.  Is
> there a way in Python to take a string [say something from a raw_input]
> and make that string a variable name?

Hi Tony,

Conceptually, yes: you can collect all those on-the-fly variables in a
dictionary.

> I'm sure I could accomplish something similar using a plain old
> dictionary,

Yup.  *grin*

> but I was playing around with the OOP stuff and thought it might be a
> neat thing to try out.

We can use exec() to create dynamic variable names, but it's almost always
a bad idea.  Dynamic variable names make it very difficult to isolate
where variables are coming from in a program.  It'll also prevent us from
reliably using the excellent PyChecker program:

http://pychecker.sourceforge.net/

So I'd recommend just using a plain old dictionary: it's not exciting, but
it's probably the right thing to do.

Looking back at the code:

###
def generateInstance():
    class test:
        def __init__(self, title, price):
            self.title = title
            self.price = price
        def theTitle(self, title):
            return self.title
        def thePrice(self, price):
            return self.price

    myName = raw_input("Name: ")
    myTitle= raw_input("Title: ")
    myPrice  = raw_input("Price: ")

    exec '%s = test("%s", %s)' % (myName, myTitle, myPrice)
###

In Python, we actually can avoid writing attribute getters and setters ---
the "JavaBean" interface --- because Python supports a nice feature called
"properties".  See:

http://dirtsimple.org/2004/12/python-is-not-java.html
http://www.python.org/2.2.2/descrintro.html#property

The upside is that we usually don't need methods like theTitle() or
thePrice() in Python.  Our revised code looks like:

###
class test:
    def __init__(self, title, price):
        self.title = title
        self.price = price

instances = {}

def generateInstance():
    myName  = raw_input("Name: ")
    myTitle = raw_input("Title: ")
    myPrice = raw_input("Price: ")
    instances[myName] = test(myTitle, myPrice)
###
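
If one of those attributes later needs some checking, a property lets us
add it without changing how callers read or write the attribute.  The
sketch below is only an illustration: the Item class, add_instance(), and
the "price must not be negative" rule are my own inventions, and the
@price.setter decorator form needs Python 2.6 or later.

```python
class Item(object):
    def __init__(self, title, price):
        self.title = title
        self.price = price        # goes through the property setter below

    @property
    def price(self):
        return self._price

    @price.setter
    def price(self, value):
        value = float(value)      # raw input arrives as a string
        if value < 0:
            raise ValueError("price must not be negative")
        self._price = value


instances = {}


def add_instance(name, title, price):
    """Store a new Item under a user-supplied name and return it."""
    instances[name] = Item(title, price)
    return instances[name]


book = add_instance("book", "Dune", "9.50")
```

Callers still write instances["book"].price, just as they would with a
plain attribute.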

If you have more questions, please feel free to ask.  Best of wishes to
you!


Re: [Tutor] Safely buffering user input

2005-01-27 Thread Danny Yoo

On Thu, 27 Jan 2005, Miles Stevenson wrote:

> I'm trying to practice safe coding techniques. I just want to make sure
> that a user can't supply a massive argument to my script and cause
> trouble. I'm just trying only accept about 256 bytes:
>
> buffer(sys.argv[1], 0, 256)
  ^^

Hi Miles,

Don't use buffer() in this way: it's not doing what you think it's doing.
buffer() does not "mutate" its argument: it does not truncate sys.argv[1].
Instead, it takes an existing sequence and provides a sort of "window"
view into that sequence.

You can try something like:

###
window = buffer(sys.argv[1], 0, 256)
###

in which case 'window' here should contain the first 256 characters of
sys.argv[1].

As a side note, buffer() does not appear to be used much by people: it's
hardly used by the Standard Library itself, and is probably not useful for
general Python programming.  (In fact, the only place I see buffer()
really being used is in Python's test cases!)

In the case with sys.argv[1], I'd actually leave string arguments at an
unrestricted length.  Python is very safe when it comes to dealing with
large data.  For example, array access is checked at runtime:

###
>>> "hello world"[400]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
IndexError: string index out of range
###

This is why, in Python, we're not so worried about the kind of buffer
overflows that plague unsafe languages like C.
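
If a length cap is still wanted for some other reason, plain slicing is a
simpler tool than buffer() (which no longer exists in Python 3).  This is
a small sketch; the truncate() name is made up, and the 256 limit is the
one from the question.  Note that on text strings the slice counts
characters, not bytes.

```python
import sys

MAX_ARG_LENGTH = 256  # limit taken from the question


def truncate(text, limit=MAX_ARG_LENGTH):
    """Return at most `limit` leading characters of text."""
    # Slicing never raises for an out-of-range end index; it just stops early.
    return text[:limit]


# Fall back to an empty string when no argument was supplied.
first_argument = truncate(sys.argv[1]) if len(sys.argv) > 1 else ""
```

Unlike indexing, slicing past the end of a string is not an error, so short
arguments pass through unchanged.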

If you have more questions, please feel free to ask.  Hope this helps!


Re: [Tutor] Please Help with Python Script for Xbox Media Center

2005-01-28 Thread Danny Yoo

On Fri, 28 Jan 2005, Rodney Butler wrote:

> I have written a script that is designed to load when you start your
> xbox, it gets a list of your music playlists then lets you select which
> one to load, shuffle and play, then the script exits.  What I need is
> for the script to automatically select the highlighted playlist if there
> is no user input for 10 seconds.

Hi Rodney,

It appears that you're using some kind of GUI interface called 'xbmcgui'.
Only a few of us have done xbmc stuff; you may have better luck asking
on an XBMC-specific forum.

You can set up a thread to do something in the background;  I did a quick
Google search and picked out:

http://www.ek-clan.vxcomputers.com/flashscripts/AQTbrowser.py

This appears to use threads to implement an HTTP timeout, and you can
probably adapt this to automatically select a playlist if the user hasn't
selected one in time.
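
To give a feel for the general shape of a thread-based timeout, here is a
sketch that uses only the standard threading module.  It does not touch
xbmcgui at all: wait_for_user stands in for whatever call blocks until the
user picks a playlist, and every name here is hypothetical.  I have not
been able to try any of this on an actual Xbox.

```python
import threading
import time


def choose_with_timeout(wait_for_user, default_choice, timeout=10.0):
    """Return the user's choice, or default_choice if none arrives in time."""
    result = []
    done = threading.Event()

    def worker():
        result.append(wait_for_user())   # blocks until the user decides
        done.set()

    thread = threading.Thread(target=worker)
    thread.daemon = True                 # don't keep the script alive
    thread.start()
    if done.wait(timeout):
        return result[0]
    return default_choice


def slow_user():
    time.sleep(0.5)
    return "Jazz"


quick = choose_with_timeout(lambda: "Rock", "Highlighted", timeout=1.0)
slow = choose_with_timeout(slow_user, "Highlighted", timeout=0.05)
```

In the real script the default would be the currently highlighted playlist
and the timeout would be the ten seconds from the question.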

Without having an XBOX, however, I have no way of testing this. So we
recommend you talk with someone on one of the forums.

Oh!  I see that you've just posted on:

http://www.xboxmediaplayer.de/cgi-bin/forums/ikonboard.pl?s=0d6bed0f9584b8406d5f458ffaeb187c;act=ST;f=21;t=10108

and have gotten similar advice to use threads.  Ok, good.  Use threads.
*grin*

If you have more questions, please feel free to ask.  But the folks on
that forum probably know better than us, so try them first.

Best of wishes to you!


Re: [Tutor] range function

2005-01-29 Thread Danny Yoo

On Sat, 29 Jan 2005, Srinivas Iyyer wrote:

> I have bunch of coordinates for various vectors.
>
> small vecs:
>
> name    cord. X  cord. Y  Sector no.
> smvec1  75       100      1aa
> smvec2  25       50       1aa
> smvec3  135      155      1ab
>
> large vecs:                           zone
> Lvec1   10       50       1aa         ut
> Lvec1   60       110      1aa         cd
> Lvec1   130      180      1ab         cd
>
> Now I am checking if small vecs are falling in which
> large vecs.

Hi Srinivas,

Some of the problem statement is confusing me slightly, since there's more
information here than just "vector" information.

Is it accurate to say that the essential part of the data is something
like this?

###
small_vecs = [ ("smvec1", 75, 100, "1aa"),
               ("smvec2", 25, 50, "1aa"),
               ("smvec3", 135, 155, "1ab") ]

large_vecs = [ ("Lvec1", 10, 50, "1aa", "ut"),
               ("Lvec1", 60, 110, "1aa", "cd"),
               ("Lvec1", 130, 180, "1ab", "cd") ]
###

Or do you really just care about a portion of the data?

###
small_vecs = [ ("smvec1", 75, 100),
               ("smvec2", 25, 50),
               ("smvec3", 135, 155) ]

large_vecs = [ ("Lvec1", 10, 50),
               ("Lvec1", 60, 110),
               ("Lvec1", 130, 180) ]
###

I'm just trying to digest what part of the problem we're trying to solve.
*grin*

Rather than work on text files directly as part of the algorithm, it might
be easier to split the problem into two pieces:

Part 1.  A function that takes the text file and turns it into a data
structure of Python tuples, lists, and numbers.

Part 2.  A function to do the matching against those data structures.

The reason this breakup might be useful is because you can test out the
vector-matching part of the program (Part 2) independently of the
file-reading-parsing part (Part 1).

And parsing files is sometimes really messy, and we often want to keep
that messiness localized in one place.  As a concrete example, we probably
need to do something like:

cols = line.split('\t')
(x, y) = (int(cols[1]), int(cols[2]))

where we have to sprinkle in some string-to-int stuff.
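
A minimal sketch of the two parts might look like the following.  I'm
assuming tab-delimited lines, that the sector has to match, and that
"falling in" means the small vector's X..Y range lies entirely inside the
large one's; the function names are made up, and the zone column is simply
ignored here.

```python
def parse_vectors(lines):
    """Part 1: turn tab-delimited lines into (name, x, y, sector) tuples."""
    vectors = []
    for line in lines:
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 4:
            continue  # skip blank or malformed lines
        vectors.append((cols[0], int(cols[1]), int(cols[2]), cols[3]))
    return vectors


def find_enclosing(small, large_vecs):
    """Part 2: return the large vectors whose range contains `small`."""
    name, x, y, sector = small
    return [large for large in large_vecs
            if large[3] == sector and large[1] <= x and y <= large[2]]


small_vecs = parse_vectors(["smvec1\t75\t100\t1aa\n",
                            "smvec2\t25\t50\t1aa\n",
                            "smvec3\t135\t155\t1ab\n"])
large_vecs = parse_vectors(["Lvec1\t10\t50\t1aa\tut\n",
                            "Lvec1\t60\t110\t1aa\tcd\n",
                            "Lvec1\t130\t180\t1ab\tcd\n"])
hits = [find_enclosing(small, large_vecs) for small in small_vecs]
```

Because find_enclosing() only sees tuples, it can be tested with
hand-written data long before the file parsing is finished.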

> The other way by taking tuples:
>
> for line in smallvecs:
>  cols = line.split('\t')
>  smvec_tup = zip(cols[1],cols[2])

zip() is probably not the best tool here.  We can just write out the tuple
directly, like this:

smvec_tup = (cols[1], cols[2])

zip() is meant for something else: here's an example:

###
>>> languages = ["Python", "Perl", "Java", "Ocaml", "C"]
>>> prefixes = ["py", "pl", "java", "ml", "c"]
>>> languages_and_prefixes = zip(languages, prefixes)
>>>
>>> print languages_and_prefixes
[('Python', 'py'), ('Perl', 'pl'), ('Java', 'java'), ('Ocaml', 'ml'),
 ('C', 'c')]
###

So zip() is more of a bulk-tuple constructing function: it's not really
meant to be used for making just a single tuple.  It takes in lists of
things, and pairs them up.

Please feel free to ask more questions about this.  Best of wishes to you!


Re: [Tutor] question regarding python exception handling

2005-01-30 Thread Danny Yoo

On Sat, 29 Jan 2005, Roy wrote:

> I am learning about python exception handling. I am reading "Python in a
> Nutshell". In the chapter of exception handling, it says: Note that the
> try/finally form is distinct from the try/except form: a try statement
> cannot have both except and finally clauses, as execution order might be
> ambiguous.

Hi Roy,

Caveat: what I write below might not actually be right!  According to amk:

   http://www.amk.ca/python/writing/warts.html

under the header "Catching Multiple Exceptions", the reason we can't do it
is just pure laziness from Python's implementors.  So I might be inventing
a line of reasoning that isn't the language designer's original intention.
*grin*

Let's take a look at a bit of Java code, and then compare that to Python.

/*** Java example ***/
PreparedStatement stmt = conn.prepareStatement
    ("select name from pub_article");
try {
    ResultSet rs = stmt.executeQuery();
    while (rs.next()) {
        System.out.println(rs.getString(1));
    }
} catch(SQLException e) {
    e.printStackTrace();
} finally {
    stmt.close();
}
/********************/

Here is a rough translation of that code in Python:

### Python example ###
cursor = conn.cursor()
try:
    try:
        cursor.execute("select name from pub_article")
        for (name,) in cursor.fetchall():
            print name
    except DatabaseError:
        traceback.print_exc()
finally:
    cursor.close()
###

> I don't understand the reason why except and finally clauses cannot be
> together. I know they can be together in java. how does it cause
> ambiguous execution order? An example may help me understand.

The block structure in the Java pseudocode:

/**/
try {
    // just some code
} catch(...) {
    // exception-handling block
} finally {
    // finally block
}
/**/

implies a flow-of-control that is a bit deceptive.  In Java, the visual
block structure implies that 'catch' and 'finally' behave like peers, like
the "if/else if" or "switch/case" flow-of-control statements.

We know that when an exception happens, the flow of control does this
thing where it goes into "exception-handling block", and then weaves
around into the "finally block".  But we also hit the finally block even
if no exception occurs.

And that's where a possible confusion can happen: the 'finally' block
visually looks like a peer to the other 'catch' blocks, but it behaves in
an entirely different way.  Conceptually, the 'finally' block "wraps"
around the behavior of the 'catch': it fires off regardless of whether we
hit a catch block or not.

Python's syntax makes that flow of control visible through its block
nesting:

###
try:
    try:
        ...
    except ...:
        ...
finally:
    ...
###

Here, the block structure makes it clear that the inner try/except stuff
is in the context of a try/finally block.  This is one case where Python
is more verbose than Java.
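
For anyone reading this on Python 2.5 or later: PEP 341 unified the two
forms, so a single try statement may carry both except and finally
clauses, and it behaves exactly like the nested version.  The little
run() function below is just a made-up illustration that records the
order in which the blocks execute.

```python
def run(action):
    """Call action() and record which blocks ran, in order."""
    events = []
    try:
        events.append("try")
        action()
    except ValueError:
        events.append("except")
    finally:
        # Runs whether or not the except clause fired.
        events.append("finally")
    return events


def fail():
    raise ValueError("boom")


clean = run(lambda: None)
failed = run(fail)
```

The 'finally' block shows up at the end of both lists, which is the
"wrapping" behavior described above.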

I hope this sounds reasonable.  *grin*  Best of wishes to you!


Re: [Tutor] Using exec with dict

2005-01-30 Thread Danny Yoo

On Mon, 31 Jan 2005, André Roberge wrote:

> > I have a "robot" that can do some actions like "move()" and
> > "turn_left()".  I can program this robot using python like this:
> >
> > .def move_and_turn():
> [snip]
> > The question I have is: how do I do this with an explicit dictionary.
> > I would *guess* that this is somehow equivalent to "how do I create a
> > dictionary that has access only to robot instructions [move(),
> > turn_left(), etc.] and Python's basic syntax" ... but I don't know how
> > to do this.
>
> myGlobals = { 'move':move, 'turn_left':turn_left }
> exec code in myGlobals
>
> You don't need to add built-ins to myGlobals. Add whatever of your
> symbols you want the code to have access to.

Hello!

There's one subtlety here: exec() (as well as eval()) will automagically
stuff in its own version of a '__builtins__' dictionary if a binding to
'__builtins__' doesn't exist.  Here's a snippet from the documentation
that talks about this:

"""If the globals dictionary is present and lacks '__builtins__', the
current globals are copied into globals before expression is parsed. This
means that expression normally has full access to the standard __builtin__
module and restricted environments are propagated."""

This is a bit surprising, and caught me off guard when I played with exec
myself.  There are more details here:

http://www.python.org/doc/ref/exec.html
http://www.python.org/doc/lib/built-in-funcs.html#l2h-23

So if we really want to limit the global names that the exec-ed code sees,
we may want to add one more binding, for '__builtins__':

###
>>> d = {}
>>> exec "print int('7')" in d
7
>>>
>>> d.keys()
['__builtins__']
>>>
>>>
>>> d2 = {'__builtins__' : {}}
>>>
>>> exec "print int('7')" in d2
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<string>", line 1, in ?
NameError: name 'int' is not defined
###
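
Applied to the robot, the idea might be sketched like this, using the
function form of exec() so that it also runs on a current Python.  The
run_robot_program() helper and the logging lambdas are stand-ins I made up
for the real move() and turn_left().  Emptying '__builtins__' hides names
from well-behaved code, but it should not be mistaken for a real security
sandbox.

```python
def run_robot_program(code, commands):
    """Execute `code` with only the given robot commands visible."""
    # An explicit, empty '__builtins__' stops exec() from adding the real one.
    namespace = {"__builtins__": {}}
    namespace.update(commands)
    exec(code, namespace)


log = []
commands = {
    "move": lambda: log.append("move"),
    "turn_left": lambda: log.append("turn_left"),
}
run_robot_program("move()\nturn_left()\nmove()", commands)

try:
    run_robot_program("int('7')", commands)
    builtin_visible = True
except NameError:
    builtin_visible = False
```

Python's basic syntax (loops, arithmetic, function definitions) keeps
working inside the exec-ed code; only the named built-ins go missing.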

Hope this helps!


Re: [Tutor] Append function

2005-01-31 Thread Danny Yoo

On Sun, 30 Jan 2005, kumar s wrote:

> I have a bunch of files where the first column is always the same. I
> want to collect all those files, extract the second columns by file wise
> and write the first column, followed by the other columns(extracted from
> files) next to each other.

Hi Kumar,

Can you show us an example of what you mean?

I think you're looking for something that takes a list of line elements, like

###
lines = """
1   Alpha
1   Beta
1   Gamma
1   Delta
1   Epsilon
""".split('\n')
###

and combines input lines with the same first column into something like
this:

###
1   Alpha|Beta|Gamma|Delta|Epsilon
###
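
If that guess is right, the combining step could be sketched as below.
This is only my reading of the question: the function name, the '|'
separator, and the choice to keep first-column values in first-seen order
are all assumptions.

```python
def combine_by_first_column(lines, sep="|"):
    """Merge lines that share a first column into one output line each."""
    groups = {}
    order = []                       # remembers first-seen key order
    for line in lines:
        fields = line.split(None, 1)
        if len(fields) < 2:
            continue                 # skip blank or one-column lines
        key, value = fields
        if key not in groups:
            groups[key] = []
            order.append(key)
        groups[key].append(value.strip())
    return ["%s\t%s" % (key, sep.join(groups[key])) for key in order]


lines = """
1   Alpha
1   Beta
1   Gamma
1   Delta
1   Epsilon
""".split('\n')
combined = combine_by_first_column(lines)
```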

But I'm not positive if something like this is what you're trying to do.
You mentioned multiple files: can you show us a small, simple example with
two files?

An example like this will help clarify the problem, and help us know if
we're really answering your question.

Best of wishes to you!


Re: [Tutor] files in a directory

2005-01-31 Thread Danny Yoo



> Now that I am reading many files at once, I wanted, to
> have a tab delim file op that looks like this:
>
> My_coors Int_file 1 Int_file2
> IntFile3
> 01:26   34  235
> 245.45
> 04:42  342.4452.445.5
> 02:56  45.4 34.5 557.8

[code cut]


Hi Kumar,

Ok, I think I see the source of the bug.  Let's check how the code writes
each of my_vals:

> for line in my_vals:
>     f2.write(line+'\t') => asking for tab delim..
>     f2.write('\n')

The problematic statement is the last one: between each of the values, the
code writes a newline as well as the tab character.

Push the newline-writing code outside of the loop, and you should be ok.


As a note: you can simplify the tab-writing code a bit more by using a
string's "join()" method.  For example:

###
>>> "/".join(["this", "is", "a", "test"])
'this/is/a/test'
###

So we can join a bunch of elements together without using an explicit
'for' loop.  In your program, you can use join() to create a large string
of the my_vals elements with tabs joining them.  If you'd like to see
more, the library documentation:

http://www.python.org/doc/lib/string-methods.html#l2h-192

mentions more information on the join() method and the other methods that
strings support.
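
As a small sketch of what that might look like for one row of output:
write_row() is a name I made up, and io.StringIO is standing in for the
real output file so the example runs on its own in a current Python.

```python
import io


def write_row(out, values):
    """Write the values on one line, separated by tabs."""
    out.write("\t".join(str(value) for value in values))
    out.write("\n")                  # exactly one newline, after the row


f2 = io.StringIO()
write_row(f2, ["01:26", 34, 235, 245.45])
write_row(f2, ["02:56", 45.4, 34.5, 557.8])
text = f2.getvalue()
```

join() puts the tab only between elements, so there's no stray tab at the
end of the line either.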


Please feel free to ask more questions on any of this.  (But when you
reply, try to cut down on the number of lines that you quote from the
previous message.  I almost missed your question because I couldn't see it
at first!  *grin*)


Re: [Tutor] Newbie struggling with Tkinter/canvas tags

2005-01-31 Thread Danny Yoo

On Sun, 30 Jan 2005, Glen wrote:

> As a Python/Tkinter newbie, I thought I was getting on ok...
> then I hit this problem.
>
> I have a canvas (c1)
> A group of objects are drawn on c1 and given a tag
>   c1.addtag_all('group_A')

> Another group of objects are drawn, I wish to tag these 'group_B'.

Hi Glen,

Ok, so you're looking for something like a hypothetical "find_withouttag"
to define 'group_B' then.  I don't think such a function exists directly,
but we can make it up.

> At the moment I 'get by' with...
>
> a=c1.find_withtag('group_A')
> b=c1.find_all()
> c=b[len(a):]
> for i in range(len(c)):
> c1.addtag_above('group_B',len(a)+i)

There's one problem here: the system doesn't guarantee that all the
'group_A' elements will end up at the beginning of 'b', so there's a very
strong possibility of tagging the wrong elements here.

I think that getting this right will take some more work.  Here's a
definition of a function called find_withouttag():

###
def find_withouttag(canvas, tag):
    """Returns all the objects that aren't tagged with 'tag'."""
    all_objects = canvas.find_all()
    tagged_objects = canvas.find_withtag(tag)
    return subtract_instances(all_objects, tagged_objects)

def subtract_instances(l1, l2):
    """Returns a list of all the object instances in l1 that aren't in
       l2."""
    identity_dict = {}
    for x in l1:
        identity_dict[id(x)] = x
    for x in l2:
        if id(x) in identity_dict:
            del identity_dict[id(x)]
    return identity_dict.values()
###

[Side note to others: the work that subtract_instances() does is similar to
what IdentityHashMap does in Java: does anyone know if there's already an
equivalent dictionary implementation of IdentityHashMap in Python?]

We can see how subtract_instances() works on a simple example:

###
>>> class MyClass:
...     def __init__(self, name):
...         self.name = name
...     def __repr__(self):
...         return str(self.name)
...
>>> frodo, elrond, arwen, aragorn, sam, gimli = (
...     map(MyClass, "Frodo Elrond Arwen Aragorn Sam Gimli".split()))
>>> hobbits = frodo, sam
>>> everyone = frodo, elrond, arwen, aragorn, sam, gimli
>>>
>>> subtract_instances(everyone, hobbits)
[Arwen, Aragorn, Gimli, Elrond]
###

This subtract_instances() function should do the trick with instances of
the canvas items.  There's probably a simpler way to do this --- and there
are things we can do to make it more efficient --- but I have to go to
sleep at the moment.  *grin*

Best of wishes to you!


Re: [Tutor] Newbie struggling with Tkinter/canvas tags

2005-01-31 Thread Danny Yoo

On Mon, 31 Jan 2005, Danny Yoo wrote:

> I think that getting this right will take some more work.  Here's a
> definition of a function called find_withouttag():

[Code cut]

Oh!  Never mind; this can be a lot simpler.  According to the "Gotchas"
section of:

http://tkinter.unpythonic.net/wiki/Widgets/Canvas

the items in a Canvas are actually not object instances themselves, but
integers.  I made an assumption that canvas items were object instances,
so I wrote that subtract_instances() to take care of issues with them.

But if canvas items are really integers, then we don't need
subtract_instances() at all.  We can just use sets and subtract one set
from the other.  Here is a redefinition of find_withouttag() which should
work better:

###
from sets import Set
def find_withouttag(canvas, tag):
    """Returns all the objects that aren't tagged with 'tag'."""
    all_objects = Set(canvas.find_all())
    tagged_objects = Set(canvas.find_withtag(tag))
    return all_objects - tagged_objects
###

Here's a small example with sets to make it more clear how this set
manipulation stuff will work:

###
>>> from sets import Set
>>> numbers = range(20)
>>> primes = [2, 3, 5, 7, 11, 13, 17, 19]
>>> Set(numbers) - Set(primes)
Set([0, 1, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18])
###

For more information on sets, see:

http://www.python.org/doc/lib/module-sets.html

Best of wishes to you!


Re: [Tutor] Newbie struggling with Tkinter/canvas tags

2005-01-31 Thread Danny Yoo



> Now I've just got to work out how to tag a list of id's...  There
> doesn't seem to be a way to tag a known id, it has to be tagged by
> reference from an id above or below which seems odd!

Hi Glen,

Have you tried the addtag_withtag() method?  It looks like it should be
able to do what you're thinking of.  The documentation here:

http://tkinter.unpythonic.net/pydoc/Tkinter.Canvas.html

should talk about addtag_withtag().
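
Since addtag_withtag(newtag, tagOrId) accepts an item id as its second
argument, tagging a list of known ids can be a short loop.  Here's a
sketch that puts it together with the set-based find_withouttag().  The
tag_items() helper is my own name, and FakeCanvas is a hypothetical
stand-in that mimics just the three Canvas methods used, so that the
example can run without a display; a real Tkinter Canvas would be passed
in its place.

```python
def find_withouttag(canvas, tag):
    """Return the ids of canvas items that do not carry `tag`."""
    return set(canvas.find_all()) - set(canvas.find_withtag(tag))


def tag_items(canvas, new_tag, item_ids):
    """Add `new_tag` to every item id in item_ids."""
    for item_id in item_ids:
        # addtag_withtag() accepts either a tag or an item id as its target.
        canvas.addtag_withtag(new_tag, item_id)


class FakeCanvas(object):
    """Hypothetical stand-in so the sketch runs without a display."""
    def __init__(self, tags_by_id):
        self.tags_by_id = tags_by_id

    def find_all(self):
        return tuple(sorted(self.tags_by_id))

    def find_withtag(self, tag):
        return tuple(i for i in sorted(self.tags_by_id)
                     if tag in self.tags_by_id[i])

    def addtag_withtag(self, new_tag, item_id):
        self.tags_by_id[item_id].add(new_tag)


c1 = FakeCanvas({1: {"group_A"}, 2: {"group_A"}, 3: set(), 4: set()})
tag_items(c1, "group_B", find_withouttag(c1, "group_A"))
```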


Re: [Tutor] Comparing files, Counting Value

2005-01-31 Thread Danny Yoo



Hi Michiyo,


Ok, let's take a look at the code.

> i=open("file 1") #value data
> o=open("file 2") #look-up file
> l=open("result", 'w')#result

We strongly recommend renaming these variables to something longer than a
single character.

It's difficult to tell here what 'i', 'o', and 'l' mean, outside of the
context of the assignments.  The letters 'i' and 'o' often stand for the
words "input" and "output", so using both of them as names for input files
will break the expectations of people who read the code.  Furthermore, 'l'
can easily be misread as 'i' or as the digit '1'.

In short, those names should be changed to something readable. This is a
pure style issue, but I think it's important as programmers to make the
code easy for humans to understand.



> results={}
>
> line=i.readline()
> line=o.readline()

This looks problematic.  The 'line' name here, by the end of these two
statements, is bound to the value of the look-up file's line.  I believe
you need to keep those values distinct.



> while line:
>  fields=line.split()
>  x1=fields[0, 1] in i#x,y position in file 1
>  z=fields[2] in i   #value data in file 1
>  x2=fields[0, 1] in o   #x,y position in file 2


There are several fundamental issues with this.


Conventionally, a file-loop has the following structure:

###
for line in inputFile:
    # ... do something with that line
###

and anything that tries to iterate across a file in a different way should
be looked at with some care.  The way that your code is trying to iterate
across the file won't work.



We strongly recommend you read through a tutorial like:

http://www.freenetpages.co.uk/hp/alan.gauld/tutfiles.htm

which has examples of how to write programs that work with files.



The way the program's structured also seems a bit monolithic.  I'd
recommend breaking down the problem into some phases:

Phase 1: read the value-data file into some data structure.

Phase 2: taking that data structure, read in the lookup file and
identify which positions are matching, and record matches in the
output file.

This partitioning of the problem should allow you to work on either phase
of the program without having to get it correct all at once.

The phases can be decoupled because we can easily feed in some kind of
hardcoded data structure into Phase 2, just to check that the lookup-file
matching is doing something reasonable.


For example:

###
hardcodedDataStructure = { (299, 189) : 8.543e-02,
                           (300, 189) : 0.000e+00,
                           (301, 189) : 0.000e+00,
                           (1, 188)   : 5.108e-02
                         }
###

is a very small subset of the information that your value data contains.

We can then work on the second part of the program using this
'hardcodedDataStructure' as its input.  Later on, when we do get
Phase 1 working ok, we can use the result of Phase 1 instead of the
hardcodedDataStructure, and it should just fit into place.
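
To make the two phases concrete, here is a rough sketch.  I'm assuming
whitespace-separated lines of the form "x y value" in the value-data file
and "x y" in the look-up file, which is a guess based on the comments in
the quoted code; the function names are made up.

```python
def read_values(lines):
    """Phase 1: map each (x, y) position to its value."""
    values = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue                     # skip blank or short lines
        position = (int(fields[0]), int(fields[1]))
        values[position] = float(fields[2])
    return values


def match_positions(values, lookup_lines):
    """Phase 2: return (position, value) for each lookup position found."""
    matches = []
    for line in lookup_lines:
        fields = line.split()
        if len(fields) < 2:
            continue
        position = (int(fields[0]), int(fields[1]))
        if position in values:
            matches.append((position, values[position]))
    return matches


values = read_values(["299 189 8.543e-02",
                      "300 189 0.000e+00",
                      "1 188 5.108e-02"])
matches = match_positions(values, ["1 188", "5 5", "299 189"])
```

Both functions take any iterable of lines, so open file objects and
hand-typed lists can be swapped freely while testing.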

Does this make sense?  Don't try writing the program all at once and then
start trying to make it all work.  But instead, try building small simple
programs that do work, and then put those together.



The statements:

>  fields=line.split()
>  x1=fields[0, 1] in i#x,y position in file 1

won't work.


'fields[0, 1]' does not represent the first and second elements of
'fields': it means something else, but you should get errors from it
anyway.  For example:

###
>>> values = ["hello", "world", "testing"]
>>> values[0, 1]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: list indices must be integers
###

So you should have already seen TypeErrors by this point of the program.


What you probably meant to write was:

###
fields = line.split()
x1 = (fields[0], fields[1])
###

Alternatively:

###
fields = line.split()
x1 = fields[0:2]
###

The significance of accidentally using the comma there won't make too much
sense until you learn about tuples and dictionaries, so I won't press on
this too much.



I'd recommend that you read through one of the Python tutorials before
trying to finish the program.  Some of the things that your program
contains are... well, truthfully, a little wacky.  There are several good
tutorials linked here:

http://www.python.org/moin/BeginnersGuide/NonProgrammers


If you have more questions, please feel free to ask!

