Re: [Tutor] Please look at my wordFrequency.py

2005-10-11 Thread Andrew P
If it makes you feel any better, this isn't an easy problem to get 100%
right, traditionally.  Heck, it might not even be possible. 
A series of compromises might be  the best you can hope for.

Some things to think about, however.  Can you choose the
characters you want, instead of the (many, many) characters you don't
want?  This might simplify matters.
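A minimal sketch of the "choose the characters you want" idea, written for current Python 3 rather than the 2.4 used in this thread. The pattern and the name find_words are assumptions for illustration, not code from the script under discussion: a word is taken to be a run of letters or digits, optionally joined by a single apostrophe, period, or hyphen.

```python
import re

# Assumed definition of a "word": runs of ASCII letters/digits, optionally
# joined by ONE apostrophe, period, or hyphen (so "--" always splits words).
WORD_PATTERN = re.compile(r"[A-Za-z0-9]+(?:['.\-][A-Za-z0-9]+)*")


def find_words(text):
    """Return every substring of text that matches the word pattern."""
    return WORD_PATTERN.findall(text)


sample = "A man-eating tiger--the 1st, e.g. S.P.E.C.T.R.E!"
print(find_words(sample))
```

With this rule, trailing punctuation is never part of a word, so "e.g." comes out as "e.g" and dogs' comes out as dogs. That is one of the compromises mentioned above.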

What do most words look like in English?  When is a hyphen part of
a word, what about dashes?  A dash in the middle of a sentence
means something different than one at the end of line.

As far as other special/impossible cases...what is the difference
between dogs' and 'the dogs' when you are counting words?  What
about acronyms written like S.P.E.C.T.R.E, or words that include a
number like 1st?  You can add point 5 to the list.  Which are
more common cases, which collide with other rules.  What's the
minimum amount of rules you can define to take the maximum chunk out of
the problem?

That is enough of my random rambling.

A lot of it might magically fall into place once you try what Danny
suggested.  He is a smart guy.  Doing my best to have clearly
defined, self-contained functions that do a specific task usually helps
to reduce a problem to more manageable steps, and visualize what is
happening more clearly. 

Good luck,

Andrew
On 10/10/05, Dick Moores <[EMAIL PROTECTED]> wrote:
> Script is at:
>
> Example text file for input:
> <http://www.rcblue.com/Python/wordFrequency/first3000linesOfDavidCopperfield.txt>
> (142 kb)
> (from )
>
> Example output in file:
> (40 kb)
>
> (Execution took about 30 sec. with my computer.)
>
> I worked on this a LONG time for something I expected to just be an easy
> and possibly useful exercise. Three times I started completely over with
> a new approach. Had a lot of trouble removing exactly the characters I
> didn't want to appear in the output. Wished I knew how to debug other
> than just by using a lot of print statements.
>
> Specifically, I'm hoping for comments on or help with:
> 1) How to debug. I'm using v2.4, IDLE on Win XP.
> 2) I've tried to put in remarks that will help most anyone to understand
> what the code is doing. Have I succeeded?
> 3) No modularization. Couldn't see a reason to do so. Is there one or two?
> Specifically, what sections should become modules, if any?
> 4) Variable names. I gave up on making them self-explanatory. Instead, I
> put in some remarks near the top of the script (lines 6-10) that I hope
> do the job. Do they? In the code, does the "L to newL to L to newL to L"
> kind of thing remain puzzling?
>
> (lines 6-10)
> # meaning of short variable names:
> #   S is a string
> #   c is a character of a string
> #   L, F are lists
> #   e is an element of a list
>
> 5) Ideally, abbreviations that end in a period, such as U.N., e.g., i.e.,
> viz. op. cit., Mr. (Am. E.), etc., should not be stripped of their final
> periods (whereas other words that end a sentence SHOULD be stripped). I
> tried making and using a Python list of these, but it was too tough to
> write the code to use it. Any ideas? (I can live very easily without a
> solution to point 5, because if the output shows there are 10 "e.g"s,
> I'll just assume, and I think safely, that there actually are 10 "e.g."s.
> But I am curious, Pythonically.)
>
> Thanks very much in advance, tutors.
>
> Dick Moores
> [EMAIL PROTECTED]
___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Please look at my wordFrequency.py

2005-10-11 Thread Alan Gauld
> I have been using winpdb recently, it is a pretty decent standalone Python 
> debugger for Windows. http://www.digitalpeers.com/pythondebugger/

It looks the part and I've downloaded a copy.

Thanks for the tip,

Alan g 



[Tutor] Walk a dictionary recursively

2005-10-11 Thread Negroup -
Hi tutors, in my application I found it convenient to store all the data
in a data structure based on a dictionary containing a lot of keys,
each of which holds another dictionary with lists and dictionaries
nested inside, and so on...

First of all, I'd like to know whether it is normal to use such complex data
structures to store data, or whether it is possible to organize them
in some way using smaller "organizational units".

This, instead, is the problem I should solve as soon as possible: I
need to apply a function, namely the string method decode('utf-8'),
to each key and value of the dictionary described above. Consider
that the keys are integers or strings, and if a value is a list, I need
to decode each contained element. Is there a way to walk the dictionary
recursively, or should I write my own walk function?

Thanks in advance,
negroup


Re: [Tutor] Stopping function after given time

2005-10-11 Thread Raduz
On Monday 10 of October 2005 20:10, Oliver Maunder wrote:
> > >Simple question: Is it possible to stop a running function after certain
> > >predefined time?
> > >
> > >I would like to convert this routine to Python using pycurl module, or
> >
> > maybe
> >
> > >even standard urllib.urlretrieve.
>
> If you use urllib.urlretrieve, you can pass in a callback function that
> gets called after every few blocks are downloaded. In that function, you
> could check how much time has passed since the download started, and quit
> the download if necessary. You should be able to do this without threads.
> I'm not sure how you stop a download once it's started though!
>
> Olly

Thanks for an idea, I will look into it. I'm not sure how to stop running 
download, but at least I have something to start with :-)

-- 
Raduz
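A hedged sketch of Olly's callback idea, in current Python 3 (where urlretrieve lives in urllib.request). One way to stop a download that has already started is to raise an exception from the callback; it propagates out of urlretrieve and ends the transfer. The names DownloadTimeout, make_deadline_hook and fetch_with_deadline are invented for this illustration.

```python
import time
import urllib.request


class DownloadTimeout(Exception):
    """Raised by the report hook when the time budget is used up."""


def make_deadline_hook(max_seconds, clock=time.monotonic):
    """Build a reporthook that aborts once max_seconds have elapsed."""
    start = clock()

    def hook(block_count, block_size, total_size):
        # urlretrieve calls this once at the start and after each block.
        if clock() - start > max_seconds:
            raise DownloadTimeout("gave up after %.1f seconds" % max_seconds)

    return hook


def fetch_with_deadline(url, filename, max_seconds):
    """Return True if the download finished in time, False if it was aborted."""
    try:
        urllib.request.urlretrieve(
            url, filename, reporthook=make_deadline_hook(max_seconds)
        )
    except DownloadTimeout:
        return False
    return True
```

Two caveats: the hook only runs between blocks, so a connection that stalls completely is not interrupted by it, and a partially written file may be left behind when the download is aborted.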


Re: [Tutor] Please look at my wordFrequency.py

2005-10-11 Thread Dick Moores
Thank you, Andrew, for your wise and thoughtful comments.

Andrew P wrote at 23:58 10/10/2005:
>If it makes you feel any better, this isn't an easy problem to get 100% 
>right, traditionally.  Heck, it might not even be possible.  A series of 
>compromises might be  the best you can hope for.

Yes, I gradually began to realize that. And after hearing from you, even 
more so.

>Some things to think about, however.  Can you choose the characters you 
>want, instead of the (many, many) characters you don't want?  This might 
>simplify matters.

I tried it both ways, and settled on specifying the characters I don't want.

chars = ".,!?;:&*'\"=\\+-><][/[EMAIL PROTECTED])("
and then
newWord = word.strip(chars)

This way I could first get rid of all those non-alphanumeric characters 
on the outsides of words, and leave alone the ones inside words, such as 
hyphens (man-eating), periods (S.P.E.C.T.R.E), and apostrophes (dog's). 
This seemed simpler and easier to visualize.
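A small illustration of how str.strip(chars) behaves here. The chars string above is garbled in the archive, so OUTER_PUNCTUATION below is an assumed stand-in, not the exact list from the script.

```python
# Assumed punctuation set for illustration only; the original "chars"
# string did not survive in the archive.
OUTER_PUNCTUATION = ".,!?;:&*'\"=\\+-><][/)("


def strip_outer(word):
    """Remove listed characters from both ends, leaving inner ones alone."""
    return word.strip(OUTER_PUNCTUATION)


for raw in ["'dog?!", "man-eating,", "(S.P.E.C.T.R.E)", "dog's.", "dogs'"]:
    print(raw, "->", strip_outer(raw))
```

Note that strip() treats its argument as a set of characters, not as a prefix or suffix string, and that with an apostrophe in the set a trailing possessive apostrophe (dogs') is stripped as well.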

>What do most words look like in English?

Yes, I gave that a lot of thought.

>  When is a hyphen part of a word,
>  what about dashes?  A dash in the middle of a sentence means something 
> different than one at the end of line.

"A dash in the middle gets replaced by a space--creating two 
words."  Thus "space--creating" becomes  "space creating" (two words).

A dash at the end of a line is also removed.

Now hyphens are a different problem, one I didn't solve. "man-eating 
tiger":  A hyphen in the middle of a word gets the hyphen left in place 
("man-eating tiger"). All others would be removed. I decided to not try 
to handle those cases where a hyphen is used at the end of a line to 
place the first syllable or two of a long word at the end of a line, and 
the remaining syllables at the beginning of the next line. 
Impressionistically, it seemed to me that most text in digital form 
doesn't split words this way. Take, for example, articles on newspaper 
websites. In their tree-wasting form, with narrow columns, the reverse is 
true: many word-splitting hyphens at the ends of lines.

>As far as other special/impossible cases...what is the difference 
>between dogs' and 'the dogs' when you are counting words?

Dogs' stays as dogs'; 'the dogs' becomes the two words the and dogs'. 
Although the text in my example, David Copperfield, is BE, I am actually 
aiming at AE, where most quotes are double. Thus dogs' and "the dogs". 
For these, dogs' stays as the word dogs', and "the dogs" is the two 
words, the and dogs. But now I'm not sure this is correct. I'd rather the 
possessive of dogs be treated as just another instance of dogs. But then 
possibly the plural dogs should be treated as an instance of dog. I 
realize now, thanks to you, that I didn't think this through sufficiently 
at all. Should knives be an instance of knife? And so on.

>   What about acronyms written like S.P.E.C.T.R.E

S.P.E.C.T.R.E --> s.p.e.c.t.r.e
I can live with the lower case.

>or words that include a number like 1st?

1st, 2nd remain as 1st, 2nd .

>  You can add point 5 to the list.  Which are more common cases, which 
> collide with other rules.  What's the minimum amount of rules you can 
> define to take the maximum chunk out of the problem?

I believe that's point 6. Yes, I've thought about that. But probably not 
well enough.

>That is enough of my random rambling.
>
>A lot of it might magically fall into place once you try what Danny 
>suggested.  He is a smart guy.  Doing my best to have clearly defined, 
>self-contained functions that do a specific task usually helps to reduce 
>a problem to more manageable steps, and visualize what is happening more 
>clearly.

Yes, I'm certainly going to follow Danny's advice.

Dick

>On 10/10/05, Dick Moores <[EMAIL PROTECTED]> wrote:
>Script is at:
><http://www.rcblue.com/Python/wordFrequency/wordFrequency.txt>
>
>Example text file for input:
>< 
>http://www.rcblue.com/Python/wordFrequency/first3000linesOfDavidCopperfield.txt>
>(142 kb)
>(from 
><http://www.gutenberg.org/etext/766>)
>
>Example output in file:
><http://www.rcblue.com/Python/wordFrequency/outputToFile.txt>
>(40 kb)
>
>(Execution took about 30 sec. with my computer.)
>
>I worked on this a LONG time for something I expected to just be an easy
>and possibly useful exercise. Three times I started completely over with
>a new approach. Had a lot of trouble removing exactly the characters I
>didn't want to appear in the output. Wished I knew how to debug other
>than just by using a lot of print statements.
>
>Specifically, I'm hoping for comments on or help with:
>1) How to debug. I'm using v2.4, IDLE on Win XP.
>2) I've tried to put in remarks that will help most anyone to understand
>what the code is doing. Have I succeeded?
>3) No modularization. Couldn't see a reason to do so. I

Re: [Tutor] code improvement for beginner ?

2005-10-11 Thread Roel Schroeven
Kent Johnson wrote:
> Roel Schroeven wrote:
> 
>> Danny Yoo wrote:
>> 
>> 
>>> Looking at pageimgs(): I'm not sure what 't' means in the open
>>> statement:
>>> 
>>> f = open(filename, "rt")
>> 
>> It's not a typo. 't' opens the file in text mode.
> 
> 
> Are you sure? Is that documented anywhere?

I thought it was, but actually it seems it's not. I know it from fopen
in C on Windows; it is documented in MSDN.

> Text mode is the default, you have to specify the 'b' if you want
> binary mode.

Indeed, so it seems. I just always use 't' since I do that in C too, and
Python never complained about it.
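For readers on current Python 3: there the 't' flag is documented for the built-in open() as text mode, which is the default. A quick check, assuming Python 3 and using a throwaway temporary file (the helper name read_with_mode is made up for this example):

```python
import os
import tempfile


def read_with_mode(path, mode):
    """Open path with the given mode string and return its full contents."""
    with open(path, mode) as handle:
        return handle.read()


# Create a small scratch file to read back.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, "w") as handle:
    handle.write("hello\n")

# "r" and "rt" both mean text mode; "rb" returns bytes instead of str.
print(read_with_mode(path, "r") == read_with_mode(path, "rt"))
print(type(read_with_mode(path, "rb")))
```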

-- 
If I have been able to see further, it was only because I stood
on the shoulders of giants.  -- Isaac Newton

Roel Schroeven



Re: [Tutor] Please look at my wordFrequency.py

2005-10-11 Thread Dick Moores
John Fouhy wrote at 14:47 10/10/2005:

>Some comments:
>
>
>textAsString = input.read()
>
>S = ""
>for c in textAsString:
>    if c == "\n":
>        S += ' '
>    else:
>        S += c
>
>
>You could write this more concisely as:
>
>S = textAsString.replace('\n', ' ')

Yes! Thanks. That should have occurred to me.

>
># At this point, each element ("word" in code below) of L is
># a string containing a real word such as "dog",
># where "dog" may be prefixed and/or suffixed by strings of
># non-alphanumeric characters. So, for example, word could be "'dog?!".
># The following code first strips these prefixed or suffixed
># non-alphanumeric characters and then finds any words with dashes ("--")
># or forward slashes ("/"), such as in "and/or". These then become 2 or
># more words without the dashes or slashes.
>
>
>What about using regular expressions?
>
>re.sub(r'\W+', ' ', S) will replace each run of non-alphanumeric
>characters with a single ' '.  By the looks of things, the only
>difference is that if you had something like 'foo.bar' or 'foo&bar',
>your code would leave that as one word, whereas using the regex would
>convert it into two words.

Well, I'll have to learn the re module first. But I will.
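A short sketch of what the \W+ substitution does, in Python 3. re.sub takes the pattern, the replacement, and then the string to work on; the function name words_via_regex is just for this example.

```python
import re


def words_via_regex(text):
    """Replace each run of non-word characters with a space, then split."""
    # \W matches anything that is not a letter, digit, or underscore.
    return re.sub(r"\W+", " ", text).split()


print(words_via_regex("'dog?!' and/or foo.bar man-eating"))
```

As John notes, this is blunter than stripping only the outside of each word: hyphenated words and words with apostrophes or inner periods are split apart too.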

>If you want to keep the meaning of your code intact, you could still
>use a regex to do it.  Something like (untested)
>re.sub(r'\b\W+|\W+\b|-+|/+', ' ', S) might work.
>
>
># Remove all empty elements of L, if any
>while "" in L:
>    L.remove("")
>
>for e in saveRemovedForLaterL:
>    L.append(e)
>
>F = []
>
>for word in L:
>    k = L.count(word)
>    if (k,word) not in F:
>        F.append((k,word))
>
>
>There are a lot of hidden loops in here:
>
>1. '' in L
>This will look at every element of L, until it finds "" or it gets to 
>the end.
>2. L.count(word)
>This will also look at every element of L.
>
>If you combine your loops into one, you should be able to save a lot of 
>time.
>
>eg:
>
>for e in saveRemovedForLaterL:
>    L.append(e)
>
>counts = {}
>for word in L:
>    if not word:  # This skips empty words.
>        continue
>    try:
>        counts[word] += 1
>    except KeyError:
>        counts[word] = 1
>F = [(count, word) for word, count in counts.items()]

Things there I don't understand yet, I'm afraid. But I'll get to them.
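For anyone else puzzled by the dictionary version, here is a hedged restatement as a small function (the name count_words is invented here). The dictionary maps each word to how often it has been seen so far, so the list is walked only once instead of once per word.

```python
def count_words(words):
    """Return sorted (count, word) pairs using a single pass over words."""
    counts = {}
    for word in words:
        if not word:
            continue  # skip empty strings instead of removing them first
        # get() returns 0 the first time a word is seen, so no try/except
        # is needed; the effect is the same as John's KeyError version.
        counts[word] = counts.get(word, 0) + 1
    return sorted((count, word) for word, count in counts.items())


print(count_words(["dog", "", "cat", "dog"]))
```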

Thanks for pushing me, John.

Dick




Re: [Tutor] Please look at my wordFrequency.py

2005-10-11 Thread Dick Moores
Kent Johnson wrote at 15:25 10/10/2005:
>Dick Moores wrote:
>  > Specifically, I'm hoping for comments on or help with:
> > 1) How to debug. I'm using v2.4, IDLE on Win XP.
>
>I have been using winpdb recently, it is a pretty decent standalone 
>Python debugger for Windows. Start it with the -t switch so you don't 
>need the Python Cryptographic Toolkit.
>http://www.digitalpeers.com/pythondebugger/
>
>Kent

Thanks, Kent. I'm looking forward to trying this out.

Dick




Re: [Tutor] Please look at my wordFrequency.py

2005-10-11 Thread Dick Moores
Danny Yoo wrote at 15:47 10/10/2005:

> > 3) No modularization. Couldn't see a reason to do so. Is there one or
> > two? Specifically, what sections should become modules, if any?
>
>
>Hi Dick,
>
>My recommendation is to practice using and writing functions.

I certainly will, Danny. You've convinced me. And thanks very much for 
the detailed examples.

Dick

>   For
>example, this block here is a good candidate to box off as a named
>function:
>
>###
>F = []
>for word in L:
>    k = L.count(word)
>    if (k,word) not in F:
>        F.append((k,word))
># F is a list of duples (k, word), where k is the frequency of word
>F.sort()
>###
>
>
>That block can be treated as a function that takes in a list of words, and
>returns a list of duples.
>
>##
>def doFrequencyCalculation(L):
>    # F is a list of duples (k, word), where k is the frequency of word
>    F = []
>    for word in L:
>        k = L.count(word)
>        if (k,word) not in F:
>            F.append((k,word))
>    F.sort()
>    return F
>##
>
>Here, we explicitly say that frequency calculation depends on having a
>list L of words, and that the result should be a list of duples.
>
>Functions allow us to "scope" variables into different kinds of roles.
>We'd say in doFrequencyCalculation that 'word' and 'k' are "temporary"
>variables.  If we write programs using functions, we can express that
>"temporary"  role as a real part of the program.
>
>As it stands, in the original code, we have to watch that 'word' or 'k'
>isn't being used by anyone else: like all the other variables in your
>program, they're all global and given equal status.  You're using
>lowercase as an informal way to remind yourself that those variables are
>meant to be temporary, but that approach can be error prone.
>
>
>Conceptually, your program appears to work in four stages:
>
>  1.  Read lines of a file
>  2.  Extract list of words from the lines
>  3.  Do histogram frequency count on those lines
>  4.  Report frequency results
>
>and it would be great if we could see this program flow as an explicit
>sequence of function calls:
>
>##
>## Pseudocode
>def program():
>    L = readLinesInFile()
>    W = extractWordsFromLines(L)
>    F = doFrequencyCalculation(W)
>    reportFrequencyResults(F)
>##
>
>
>This allows one to see the overall program plan without having to read the
>whole program.  It also allows you to do some experimentation later on by
>letting you plug in different functions in the overall program() plan.
>
>As a concrete example: once we have a program in this form, it's easy to
>swap in a different algorithm for doing frequency calculation:
>
>##
>def program():
>    L = readLinesInFile()
>    W = extractWordsFromLines(L)
>    ## F = doFrequencyCalculation(W)
>    F = doFrequencyCalculationWithDifferentAlgorithm(W)
>    reportFrequencyResults(F)
>##
>
>
>Does this make sense?

Sure does.
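A runnable sketch of Danny's plan in Python 3, to show the swap in action. The function names and the very simple word extraction are assumptions for illustration; the frequency step is passed in as a parameter so a different algorithm can be plugged in without touching the rest.

```python
from collections import Counter


def extract_words(lines):
    """Split each line on whitespace and lowercase the pieces."""
    words = []
    for line in lines:
        words.extend(line.lower().split())
    return words


def frequency_with_list_count(words):
    """Danny's version: one list.count() call per word."""
    pairs = []
    for word in words:
        pair = (words.count(word), word)
        if pair not in pairs:
            pairs.append(pair)
    pairs.sort()
    return pairs


def frequency_with_counter(words):
    """A different algorithm with the same result, using collections.Counter."""
    return sorted((count, word) for word, count in Counter(words).items())


def report(pairs):
    """Format the (count, word) pairs, one per line."""
    return "\n".join("%5d  %s" % pair for pair in pairs)


def program(lines, frequency=frequency_with_list_count):
    """The overall plan: extract, count, report."""
    words = extract_words(lines)
    pairs = frequency(words)
    return report(pairs)


lines = ["the dog saw the cat\n", "the cat ran\n"]
print(program(lines))
print(program(lines, frequency=frequency_with_counter))
```

Both calls print the same report; only the counting step differs.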

>  Please feel free to ask more questions about this.

None right now, thanks. But I can see I need to study the difference 
between local and global variables.

Dick




Re: [Tutor] Tutor Digest, Vol 20, Issue 37

2005-10-11 Thread Michael Lange
On Mon, 10 Oct 2005 17:07:47 -0600
Joseph Quigley <[EMAIL PROTECTED]> wrote:


> Hi,
> 
> Unfortunately I get this error:
> 
> Exception in Tkinter callback
> Traceback (most recent call last):
>   File "/usr/lib/python2.3/lib-tk/Tkinter.py", line 1345, in __call__
>     return self.func(*args)
>   File "Programming/Gacor/tmp.py", line 83, in newPic
>     imgPrep.configure(file=os.path.join(imgDir, pics[pic]))
> AttributeError: PhotoImage instance has no attribute 'configure'
> 
> I think I should mention that I'm completely in the dark when it comes 
> to GUI programming (and I don't have much time to learn it... or do I?)

Oops, sorry for that; in fact it looks like you will have to define a new 
PhotoImage instance, like:

def newPic():
    global pic, imgPrep
    pic = pic + 1
    # build a fresh PhotoImage for the next picture and show it
    imgPrep = ImageTk.PhotoImage(file=os.path.join(imgDir, pics[pic]))
    imgShow.configure(image=imgPrep)

The configure() method works for Tkinter.PhotoImage, but obviously not for 
ImageTk.PhotoImage, so I got trapped here.

Regards

Michael


[Tutor] Please look at my wordFrequency.py (Alan Darwish)

2005-10-11 Thread iPlanetLDAP
Exceptional Python Ladies and Gents

How about making this great wordFrequency.py script support
Unicode and locales? For example, Arabic on Windows uses
code page 1256.

I guess it is a matter of character encoding. Try
executing the following command before running
wordFrequency.py (in the same DOS console):
chcp 1256

Also, this is a sample from some other code sent to me for
another script to support Arabic, written by Lion (a
close friend who was the first to Arabize the Linux OS)

import sys
import codecs
import re

TYPE_MULTILINE=0x1

class ReaderBase:
    def __init__(self):
        self.callbacks={}

    def addCommand(self,command,callback,flags):
        self.callbacks[command]={'callback':callback,'command':command,'flags':flags}

    def findCallback(self,command):
        try:
            return self.callbacks[command[0]]
        except KeyError:
            return None

    def processFile(self,path):
        #f=open(path, 'r')
        f = codecs.open(path,encoding='CP1256',  mode='r')
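A hedged sketch of how the same idea could be applied to word counting in current Python 3, where open() accepts an encoding directly and \w in a str pattern matches Unicode letters, Arabic included. The function name count_words_in_file is an assumption for this example and is not part of wordFrequency.py.

```python
import re
from collections import Counter


def count_words_in_file(path, encoding="cp1256"):
    """Decode the file with the given code page and count its words."""
    with open(path, encoding=encoding) as handle:
        text = handle.read()
    # In Python 3, \w matches Unicode letters and digits, so Arabic and
    # English words are both picked up by the same pattern.
    return Counter(re.findall(r"\w+", text))
```

Calling count_words_in_file("sample.txt") on a file saved in code page 1256 (the file name is hypothetical) returns a Counter keyed by word; most_common() then gives the frequency list.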


This is arabic numbers 1234567890

This is arabic (letters) Text let us add it to the
English text and process wordFrequency.py


|B الأدب
المفرد
للبخاري|A  
البخاري dod
### |V 2-3
لعن الله
من سرق
منار
الأرض
لعن الله
من لعن
والديه
لعن الله
من آوى
محدثا
باب يبر
والديه
ما لم يكن
معصية
حدثنا
محمد بن
عبد
العزيز
قال
حدثنا
عبد
الملك بن
الخطاب
بن عبيد
الله بن
أبى بكرة
البصري
لقيته
بالرملة
قال
حدثني
راشد أبو
محمد عن
شهر بن
حوشب عن
أم
الدرداء
عن أبى
الدرداء
قال
أوصانى
رسول
الله
بتسع لا
تشرك
بالله
شيئا وإن
قطعت أو
حرقت ولا
تتركن
الصلاة
المكتوبة
متعمدا
ومن
تركها
متعمدا
برئت منه
الذمة
ولا
تشربن
الخمر
فإنها
مفتاح كل
شر وأطع
والديك
وإن
أمراك أن
تخرج من
دنياك
فاخرج
لهما ولا
تنازعن
ولاة
الأمر
وإن رأيت
أنك أنت
ولا تفرر
من الزحف
وإن هلكت
وفر
أصحابك
وأنفق من
طولك على
أهلك ولا
ترفع
عصاك على
أهلك
وأخفهم
في
|B الأدب
المفرد
للبخاري|A  
البخاري dod
### |V 3-3
وصاحبهما
في
الدنيا
معروفا
والثانية
إني كنت
أخذت
سيفا
أعجبنى
فقلت يا
رسول
الله هب
لي هذا
فنزلت
يسألونك
عن
الأنفال
والثالثة
إني مرضت
فأتاني
رسول
الله
فقلت يا
رسول
الله إني
أريد أن
أقسم
مالي
أفأوصى
بالنصف
فقال لا
فقلت
الثلث
فسكت
فكان
الثلث
بعده
جائزا
والرابعة
إني شربت
الخمر مع
قوم من
الأنصار
فضرب رجل
منهم
أنفى
بلحيى
جمل
فأتيت
النبي
فأنزل
تحريم
الخمر
حدثنا
الحميدي
قال
حدثنا بن
عيينة
قال
حدثنا
هشام بن
عروة قال
أخبرني
أبى قال
أخبرتنى

|B الأدب
المفرد
للبخاري|A  
البخاري dod
### |V 3-3 هذا
فنزلت
يسألونك
عن
الأنفال
والثالثة
إني مرضت
فأتاني
رسول
الله
فقلت يا
رسول
الله إني
أريد أن
أقسم
مالي
أفأوصى
بالنصف
فقال لا
فقلت
الثلث
فسكت
فكان
الثلث
بعده
جائزا
والرابعة
إني شربت
الخمر مع
قوم من
الأنصار
فضرب رجل
منهم
أنفى
بلحيى
جمل
فأتيت
النبي
فأنزل
تحريم
الخمر
حدثنا
الحميدي
قال
حدثنا بن
عيينة
قال
حدثنا
هشام بن
عروة قال
أخبرني
أبى قال
أخبرتنى

Have fun
Alan Darwish
www.Muhammad.com



Re: [Tutor] Walk a dictionary recursively

2005-10-11 Thread Alan Gauld
> First of all I'd like to know if it is normal to use so complex data
> structures in which store data, or if it possible in some way to
> organize them using smaller "organizational units".

It's not uncommon, but often it can be simplified by using classes.
In particular the classes can expose an interface that makes the 
structures appear simpler by providing the access mechanisms 
needed so that the outer code is ignorant of the structural complexity.

> to decode each contained element. Is there a way to walk recursively
> the dictionary, or should I write my own walk function?

Not for an arbitrarily complex structure such as you have defined. 
You will need to write your own I fear.
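A minimal sketch of such a hand-written walk, assuming current Python 3, where the undecoded data would be bytes objects (in the Python 2.4 of this thread it would be str.decode on byte strings). The name decode_nested is invented for this example.

```python
def decode_nested(obj, encoding="utf-8"):
    """Return a copy of obj with every bytes key or value decoded."""
    if isinstance(obj, bytes):
        return obj.decode(encoding)
    if isinstance(obj, dict):
        # Decode keys and values; integer keys pass through unchanged.
        return {
            decode_nested(key, encoding): decode_nested(value, encoding)
            for key, value in obj.items()
        }
    if isinstance(obj, (list, tuple)):
        # Rebuild the same kind of sequence with decoded elements.
        return type(obj)(decode_nested(item, encoding) for item in obj)
    return obj  # integers and anything else are left alone


data = {b"name": [b"caf\xc3\xa9", 1, {2: b"x"}], 3: (b"a",)}
print(decode_nested(data))
```

The function recurses wherever it finds a container and applies decode() only at the leaves, so the depth of nesting does not matter (up to Python's recursion limit).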

HTH

Alan G
Author of the learn to program web tutor
http://www.freenetpages.co.uk/hp/alan.gauld




Re: [Tutor] Walk a dictionary recursively

2005-10-11 Thread paul brian
Firstly, are you using this to store or alter data regarding Microsoft
Active Directory? If so, I suggest you look at some of their ADSI /
WMI interfaces for COM (if you use win32com from Mark Hammond or
ActiveState, life becomes a lot easier. Well, the Windows programming
part of it does).

As for the particulars of your question, you might find life simpler
if you created a "parentID" (and/or childID) for each unique entry in
the tree.

As you are going to be growing the data list in size one other area to
look at is generators - this will enable  you to walk arbitrarily
large trees but with a far lower memory footprint and hence a lot
faster.  A generator class returns an object that will walk through an
iteration set (like a for loop) but at the end of every step will
"disappear"  from the stack and when it is called again it starts
exactly where it left off.
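A small sketch of the generator idea in current Python 3, using a generator function rather than a class. The name walk and the (path, value) output format are assumptions for this example. Each leaf is produced only when the caller asks for the next item, so the flattened tree never has to be held in memory all at once.

```python
def walk(node, path=()):
    """Yield (path, leaf) pairs for every leaf in nested dicts and lists."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from walk(child, path + (key,))
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from walk(child, path + (index,))
    else:
        yield path, node


tree = {"a": {"b": 1}, "c": [2, 3]}
for path, leaf in walk(tree):
    print(path, leaf)
```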

So I would suggest you create generator based classes to store your
data, using an explicit parent/child relationship rather than relying
on the implicit relationships of which dictionary is stored inside
which dictionary.

It is still a fair chunk of work.  I suggest you start on the parent
child thing first.
Think about having a single point of entry that creates a new object
and then "hangs" it on the tree.

I hope that helps and do please come back to the list with how you are
getting on.



On 10/11/05, Negroup - <[EMAIL PROTECTED]> wrote:
> Hi tutors, in my application I found convenient to store all the data
> in a data structure based on a dictionary containing a lot of keys,
> and each of them host other dictionary with lists and dictionaries
> nested inside and so on...
>
> First of all I'd like to know if it is normal to use so complex data
> structures in which store data, or if it possible in some way to
> organize them using smaller "organizational units".
>
> This is instead the problem I should solve as soon as possible: I
> should apply a function, exactly the string's method decode('utf-8'),
> to  each key and value of the above descripted dictionary. Consider
> that the keys are integers or strings, and if a key is a list, I need
> to decode each contained element. Is there a way to walk recursively
> the dictionary, or should I write my own walk function?
>
> Thanks in advance,
> negroup


--
--
Paul Brian
m. 07875 074 534
t. 0208 352 1741


Re: [Tutor] Walk a dictionary recursively

2005-10-11 Thread Negroup -
2005/10/11, paul brian <[EMAIL PROTECTED]>:
> Firstly are you using this to store or alter data regarding Microsoft
> Active Directory?. If so I suggest you look at some of their ADSI /
> WMI interfaces for COM (if you use win32com from Mark Hammond or
> activeState life becomes a lot easier. Well the Windows programming
> part of it does)

Hi Paul, this is not my particular case, but your answer has still
been valuable!

Dictionaries really might be a poor solution in some cases. I
think that I will tackle my problem using a dedicated class (thanks Alan
for this suggestion), implementing an "expandable" tree structure
where items will be related via a parent/child relation. The idea of
using generators is appealing too.

Right now I have no time (hrm.. call it skills?) to put all these
things together! However, considering that in the near future I shall
rewrite this application for a bunch of good reasons, I'll ask for
help again, count on it ;-)

Thanks.


Re: [Tutor] Please look at my wordFrequency.py

2005-10-11 Thread Kent Johnson
Dick Moores wrote:
> (Execution took about 30 sec. with my computer.)

That's way too long
> 
> Specifically, I'm hoping for comments on or help with:
> 2) I've tried to put in remarks that will help most anyone to understand 
> what the code is doing. Have I succeeded?

Yes, I think so

> 3) No modularization. Couldn't see a reason to do so. Is there one or two?
> Specifically, what sections should become modules, if any?

As Danny says, breaking it up into functions makes it easier to understand and 
test

> 4) Variable names. I gave up on making them self-explanatory. Instead, I 
> put in some remarks near the top of the script (lines 6-10) that I hope 
> do the job. Do they? In the code, does the "L to newL to L to newL to L" 
> kind of thing remain puzzling?

Some of your variables seem unnecessary. For example
newWord = word.strip(chars)
word = newWord
could be just
word = word.strip(chars)

> 5) Ideally, abbreviations that end in a period, such as U.N., e.g., i.e., 
> viz. op. cit., Mr. (Am. E.), etc., should not be stripped of their final 
> periods (whereas other words that end a sentence SHOULD be stripped). I 
> tried making and using a Python list of these, but it was too tough to 
> write the code to use it. Any ideas?

You should be able to do this with regular expressions or by searching in the 
word. You want to test for a word that ends with a period but doesn't include 
any other periods. Something like
if word.endswith('.') and '.' not in word[:-1]:
  word = word[:-1]
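A hedged, runnable version of that test in Python 3. The rule above still strips single-period abbreviations such as "Mr.", so this sketch adds a small lookup set for those; KNOWN_ABBREVIATIONS and the function name are assumptions for illustration, with entries taken from the examples in Dick's point 5.

```python
# Assumed, deliberately incomplete set of single-period abbreviations.
KNOWN_ABBREVIATIONS = {"mr.", "viz.", "op.", "cit.", "etc."}


def strip_sentence_period(word, keep=KNOWN_ABBREVIATIONS):
    """Drop a final period unless the word looks like an abbreviation."""
    if word.lower() in keep:
        return word
    # A period that is the ONLY period in the word is treated as the end
    # of a sentence; words like "e.g." or "U.N." keep theirs.
    if word.endswith(".") and "." not in word[:-1]:
        return word[:-1]
    return word


for raw in ["dog.", "e.g.", "U.N.", "Mr.", "Dr."]:
    print(raw, "->", strip_sentence_period(raw))
```

"Dr." is not in the set, so it loses its period; any list like this is one of the compromises the thread talks about.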

Other notes:
Use re.split() to do all the splits at once. Something like
  L = re.split(r'\s+|--|/', textAsString)

#remove empty elements in L
while "" in L:
    L.remove("")
The above iterates L twice for each empty word! The remove() calls are 
expensive too because the remaining elements of L must be shifted down. Do the 
whole thing in one pass over L with
L = [ w for w in L if w ]

You only need to remove empty elements once, when the rest of the processing is 
done.
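The two suggestions above can be combined into one short function. This is a sketch in Python 3; the name split_words is made up for the example.

```python
import re


def split_words(text):
    """Split on whitespace, double dashes, and slashes, dropping empties."""
    pieces = re.split(r"\s+|--|/", text)
    # One pass over the list removes every empty string at once.
    return [piece for piece in pieces if piece]


print(split_words("space--creating and/or  man-eating tigers\n"))
```

A single hyphen is not in the pattern, so "man-eating" stays one word while "space--creating" becomes two.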

for e in saveRemovedForLaterL:
    L.append(e)
could be
L.extend(saveRemovedForLaterL)




[Tutor] Where is the error

2005-10-11 Thread enas khalil


hello,

when i run the following code to Read and tokenize data from a tagged text as follows :

from nltk.corpus import brown
from nltk.tagger import TaggedTokenizer
from nltk.tokenizer import *
tagged_txt_str=open('corpus.txt' ).read()
tagged_txt_token=Token(TEXT=tagged_txt_str)
TaggedTokenizer.tokenize(tagged_txt_token)
print tagged_txt_token

i got the following error :
Traceback (most recent call last):
  File "C:\My Documents\TAGGING.PY", line 3, in -toplevel-
    from nltk.tagger import TaggedTokenizer
ImportError: cannot import name TaggedTokenizer

could anyone help me
thanks in advance
enas


Re: [Tutor] Where is the error

2005-10-11 Thread Kent Johnson
What version of NLTK are you using? From a look at the API docs, 
nltk.tagger.TaggedTokenizer seems to have been removed in v1.4.
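One generic way to check whether an installed module still provides a name, sketched in Python 3. The helper module_has_name is invented for this example, and it is shown against a standard-library module because NLTK may not be installed; with NLTK present, the same call could be made with "nltk.tagger" and "TaggedTokenizer".

```python
import importlib


def module_has_name(module_name, attribute):
    """Return True if the module imports cleanly and defines attribute."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False  # the module itself is missing
    return hasattr(module, attribute)


print(module_has_name("collections", "OrderedDict"))
print(module_has_name("collections", "TaggedTokenizer"))
```

"cannot import name X" means the module was found but does not define X, which is what this check distinguishes from a missing module.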

Kent

enas khalil wrote:
> hello,
> 
> 
> when i run the following code to Read and tokenize data from a
> tagged text as follows :
> 
> 
> from nltk.corpus import brown
> from nltk.tagger import TaggedTokenizer
> from nltk.tokenizer import *
> tagged_txt_str=open('corpus.txt' ).read()
> tagged_txt_token=Token(TEXT=tagged_txt_str)
> TaggedTokenizer.tokenize(tagged_txt_token)
> print tagged_txt_token
> 
>  
> 
> 
> i got the following error :
> Traceback (most recent call last):
> File "C:\My Documents\TAGGING.PY", line 3, in -toplevel-
> from nltk.tagger import TaggedTokenizer
> ImportError: cannot import name TaggedTokenizer
> 
>  
> 
>  
> 
>  
> 
> could anyone help me
> 
> thanks in advance
> 
> enas


Re: [Tutor] AttributeError: 'str' object has no attribute, 'geturl'

2005-10-11 Thread Joseph Quigley
>  You are welcome. This time I would like to help you but the code is
> incomplete (import error for Image) and I have never used urllib2 so I
> don't really know what to do.
>  Again, try to debug it with pdb. Place "import pdb; pdb.set_trace()" where
> you want the break point and see what's going on.
>
> Javier

Hi,

I'm embarrassed to say that I was trying to download an image that doesn't exist 
(I had the url wrong).
Thanks for the tip on pdb,
Joe



Re: [Tutor] Where is the error

2005-10-11 Thread Alberto Troiano








Hi

from nltk.tagger import TaggedTokenizer

I'm not familiar with this module, but looking at the error it
looks like the name TaggedTokenizer may be misspelled or may not exist, or
the same with tagger.

Maybe an uppercase letter or something like that.

Check your library for the correct name... If it's correct then I'm
screwed *grin*. Perhaps other
tutor 

Best Regards,

Alberto

From:
[EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On behalf of enas khalil
Sent: Tuesday, October 11, 2005 08:28
To: python-list@python.org; tutor@python.org
Subject: [Tutor] Where is the error



 







hello,


when i run the following code to Read and tokenize data from a tagged text as
follows :


from nltk.corpus import brown
from nltk.tagger import TaggedTokenizer
from nltk.tokenizer import *
tagged_txt_str=open('corpus.txt' ).read()
tagged_txt_token=Token(TEXT=tagged_txt_str)
TaggedTokenizer.tokenize(tagged_txt_token)
print tagged_txt_token

 


i got the following error :
Traceback (most recent call last):
File "C:\My Documents\TAGGING.PY", line 3, in -toplevel-
from nltk.tagger import TaggedTokenizer
ImportError: cannot import name TaggedTokenizer

 

 

 

could anyone help me 

thanks in advance

enas





















Re: [Tutor] Please look at my wordFrequency.py

2005-10-11 Thread Dick Moores
Kent Johnson wrote at 03:24 10/11/2005:
>Dick Moores wrote:
> > (Execution took about 30 sec. with my computer.)
>
>That's way too long

How long would you expect? I've already made some changes but haven't 
seen the time change much.

> >
> > Specifically, I'm hoping for comments on or help with:
> > 2) I've tried to put in remarks that will help most anyone to understand
> > what the code is doing. Have I succeeded?
>
>Yes, i think so

Good.

> > 3) No modularization. Couldn't see a reason to do so. Is there one or two?
> > Specifically, what sections should become modules, if any?
>
>As Danny says, breaking it up into functions makes it easier to 
>understand and test

OK.

> > 4) Variable names. I gave up on making them self-explanatory. Instead, I
> > put in some remarks near the top of the script (lines 6-10) that I hope
> > do the job. Do they? In the code, does the "L to newL to L to newL to L"
> > kind of thing remain puzzling?
>
>Some of your variables seem unnecessary. For example
> newWord = word.strip(chars)
> word = newWord
>could be just
> word = word.strip(chars)

Yes, I'll have to get that kind of thing straightened out. In my mind, 
first of all.


> > 5) Ideally, abbreviations that end in a period, such as U.N., e.g., i.e.,
> > viz. op. cit., Mr. (Am. E.), etc., should not be stripped of their final
> > periods (whereas other words that end a sentence SHOULD be stripped). I
> > tried making and using a Python list of these, but it was too tough to
> > write the code to use it. Any ideas?
>
>You should be able to do this with regular expressions or searching in 
>the word. You want to test for a word that ends with a period but 
>doesn't include any periods. Something like
>if word.endswith('.') and '.' not in word[:-1]:
>   word = word[:-1]

Nice! That takes care of U.N., e.g., i.e., but not viz., op. cit., or Mr.

>Other notes:
>Use re.split() to do all the splits at once. Something like
>   L = re.split(r'\s+|--|/', textAsString)

Don't understand this yet. I'll work on it.

>#remove empty elements in L
>while "" in L:
> L.remove("")
>The above iterates L twice for each empty word!

I don't get the twice. Could you spell it out, please?

>The remove() calls are expensive too because the remaining elements of L 
>must be shifted down. Do the whole thing in one pass over L with
> L = [ w for w in L if w ]
>You only need to remove empty elements once, when the rest of the 
>processing is done.

Got it. But using this doesn't seem to make much difference in the time.

Also, I'm puzzled that whether or not psyco is employed makes no 
difference in the time. Can you explain why?

>for e in saveRemovedForLaterL:
> L.append(e)
>could be
>L.extend(e)

Are you recommending L.extend(e), or is it just another way to do it?

Thanks very much for your help, Kent.

Dick 



Re: [Tutor] Please look at my wordFrequency.py

2005-10-11 Thread Kent Johnson
Dick Moores wrote:
> Kent Johnson wrote at 03:24 10/11/2005:
> 
>>Dick Moores wrote:
>>
>>>(Execution took about 30 sec. with my computer.)
>>
>>That's way too long
> 
> 
> How long would you expect? I've already made some changes but haven't 
> seen the time change much.

A couple of seconds at most, unless you are running it on some dog computer. 
It's just not that much text and you should be able to process it in a couple 
of passes at most.

What changes have you made? Several changes already posted should have a 
noticeable effect, I think. What is your current code?

>>>5) Ideally, abbreviations that end in a period, such as U.N., e.g., i.e.,
>>>viz. op. cit., Mr. (Am. E.), etc., should not be stripped of their final
>>>periods (whereas other words that end a sentence SHOULD be stripped). I
>>>tried making and using a Python list of these, but it was too tough to
>>>write the code to use it. Any ideas?
>>
>>You should be able to do this with regular expressions or searching in 
>>the word. You want to test for a word that ends with a period but 
>>doesn't include any periods. Something like
>>if word.endswith('.') and '.' not in word[:-1]:
>>  word = word[:-1]
> 
> 
> Nice! That takes care of U.N., e.g., i.e., but not viz., op. cit., or Mr.

Ah, right. I don't know how you could handle that except with a dictionary. At 
least they will only appear in the word list once, without the trailing period.
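
A minimal sketch of that lookup, combined with the endswith() test quoted
above. It uses a set rather than a dictionary, since only membership matters
here. The contents of ABBREVIATIONS and the name strip_final_period are
illustrative assumptions, not taken from wordFrequency.py.

```python
# Hypothetical, deliberately short list; extend it to suit the text.
ABBREVIATIONS = {"viz.", "op.", "cit.", "Mr.", "etc."}


def strip_final_period(word):
    """Drop a sentence-ending period, but keep it on known abbreviations."""
    if word in ABBREVIATIONS:
        return word
    # The quoted test: ends with a period and has no other period inside.
    if word.endswith(".") and "." not in word[:-1]:
        return word[:-1]
    return word


for w in ["U.N.", "e.g.", "Mr.", "viz.", "house.", "dog"]:
    print(w, "->", strip_final_period(w))
# U.N. -> U.N.
# e.g. -> e.g.
# Mr. -> Mr.
# viz. -> viz.
# house. -> house
# dog -> dog
```

A word such as "cit." at the end of a sentence still keeps its period, so this
is one of the compromises rather than a complete answer.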

>>Other notes:
>>Use re.split() to do all the splits at once. Something like
>>  L = re.split(r'\s+|--|/', textAsString)
> 
> 
> Don't understand this yet. I'll work on it.

OK, it's a regular expression that will match any one of
 \s+ one or more white space characters, e.g. space, tab, newline
 -- two hyphens in a row (a dash typed as --)
 / a slash

re.split() then splits the string on each match.
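
A short sketch of what that split produces. The sample sentence is made up for
illustration; the pattern is the one quoted above.

```python
import re

text = "one two--three/four  five -- six"

# Split on runs of whitespace, on a double hyphen, or on a slash.
pieces = re.split(r"\s+|--|/", text)
print(pieces)
# ['one', 'two', 'three', 'four', 'five', '', '', 'six']

# " -- " matches three times in a row (space, --, space), which leaves
# empty strings behind; one pass over the list drops them.
words = [w for w in pieces if w]
print(words)
# ['one', 'two', 'three', 'four', 'five', 'six']
```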
> 
> 
>>#remove empty elements in L
>>while "" in L:
>>    L.remove("")
>>The above iterates L twice for each empty word!
> 
> 
> I don't get the twice. Could you spell it out, please?

The test /"" in L/ searches the list for an empty string - that's one.
L.remove("") searches the list again for the empty string, then removes it -
that's two.
> 
> 
>>The remove() calls are expensive too because the remaining elements of L 
>>must be shifted down. Do the whole thing in one pass over L with
>>L = [ w for w in L if w ]
>>You only need to remove empty elements once, when the rest of the 
>>processing is done.
> 
> 
> Got it. But using this doesn't seem to make much difference in the time.
> 
> Also, I'm puzzled that whether or not psyco is employed makes no 
> difference in the time. Can you explain why?

My guess is it's because you have so many O(n^2) elements in the code. You have 
to get your algorithm to be O(n).

> 
> 
>>for e in saveRemovedForLaterL:
>>    L.append(e)
>>could be
>>L.extend(e)
> 
> 
> Are you recommending L.extend(e), or is it just another way to do it?

Recommending. Look for ways to eliminate loops. If you can't eliminate them, 
move them into C code in the runtime, which is what this one does.

> 
> Thanks very much for your help, Kent.

No problem!

Kent
> 
> Dick 
> 



Re: [Tutor] Please look at my wordFrequency.py

2005-10-11 Thread Kent Johnson
Kent Johnson wrote:
> Dick Moores wrote:
> 
>> Kent Johnson wrote at 03:24 10/11/2005:
>>
>>> Dick Moores wrote:
>>>
>>>> (Execution took about 30 sec. with my computer.)
>>>
>>>
>>> That's way too long
>>
>>
>>
>> How long would you expect? I've already made some changes but haven't 
>> seen the time change much.
> 
> 
> A couple of seconds at most, unless you are running it on some dog 
> computer. It's just not that much text and you should be able to process 
> it in a couple of passes at most.

OK I couldn't resist. I took your program and ran it on my computer - took 
about 38 seconds and got the same results as you. Then I made the changes I 
outlined, and a few other similar ones, and got it down to 34 secs. Finally I 
made the change suggested by John Fouhy - to accumulate the counts in a dict - 
and the time went down to 0.23 seconds.

>> Also, I'm puzzled that whether or not psyco is employed makes no 
>> difference in the time. Can you explain why?
> 
> 
> My guess is it's because you have so many O(n^2) elements in the code. 
> You have to get your algorithm to be O(n).

In particular this code:
for word in L:
    k = L.count(word)
    if (k,word) not in F:
        F.append((k,word))

L.count() has to scan through the entire list (L) looking for a match with each 
word. So for each word, you are making 26140 string compares. The total number 
of compares is 26140*26140 or 683,299,600. That's a lot!
Then, for each word, you scan F for a match. Now you are doing tuple compares. 
The number of compares will increase as the length of F, but overall it will be 
about 26140*3700/2 or 48,359,000 compares.

Compare this to the dictionary version which just iterates L once, doing a 
dictionary lookup and write for each word.

The reason psyco doesn't make much difference is because all the time is spent 
in list.count() which is already C code.
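
For comparison, here is a minimal sketch of the two approaches side by side.
John Fouhy's actual code is not reproduced in this message, so count_fast is
only an assumption about what a dictionary version could look like; the
function names and the sample sentence are made up, and the code is written
for current Python 3 rather than the 2.4 used in this thread.

```python
def count_slow(words):
    """The list-based version: one full scan of `words` per word."""
    F = []
    for word in words:
        k = words.count(word)          # scans the whole list every time
        if (k, word) not in F:         # scans F every time
            F.append((k, word))
    return F


def count_fast(words):
    """Dictionary version: a single pass, one lookup and write per word."""
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return [(k, word) for word, k in counts.items()]


words = "the cat and the dog and the bird".split()
print(sorted(count_fast(words), reverse=True))
# [(3, 'the'), (2, 'and'), (1, 'dog'), (1, 'cat'), (1, 'bird')]
```

Both functions produce the same (count, word) pairs; only the amount of
scanning differs.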

Kent



Re: [Tutor] code improvement for beginner ? (fwd)

2005-10-11 Thread Danny Yoo


-- Forwarded message --
Date: Tue, 11 Oct 2005 20:55:24 +0200
From: lmac <[EMAIL PROTECTED]>
To: Danny Yoo <[EMAIL PROTECTED]>
Subject: Re:code improvement for beginner ?

Danny Yoo wrote:
>
>>>The point of this restructuring is to allow you to add more image
>>>types without too much pain, since there's no more hardcoded array
>>>indexing against r1.  It also simplifies to calls to imgreg from:
>>>
>>>if imgreg(r1[0],a) == 1:
>>>    continue
>>>if imgreg(r1[1],a) == 1:
>>>    continue
>>>imgreg(r1[2],a)
>>>
>>>to the simpler:
>>>
>>>imgreg(a)
>>
>>Yes. Thats good.
>
>
>
> Check your code: pageimgs() is calling imgreg three times.
>
>
>
>
>>The problem with downloading the images is this:
>>
>>-
>>http://images.nfl.com/images/globalnav-shadow-gray.gif
>>Traceback (most recent call last):
>>  File "/home/internet/bin/nflgrab.py", line 167, in ?
>>urllib.urlretrieve(img,img[f:])
>>  File "/usr/lib/python2.3/urllib.py", line 83, in urlretrieve
>>return _urlopener.retrieve(url, filename, reporthook, data)
>>  File "/usr/lib/python2.3/urllib.py", line 216, in retrieve
>>tfp = open(filename, 'wb')
>>IOError: [Errno 13] Permission denied: '/globalnav-shadow-gray.gif'
>
>
> One bug is that Python is trying to write those image files to the root
> directory.  Your operating system's file system is saying that it won't
> allow you to write files to that location.
>
> urllib.urlretrieve saves those files with the path given in the second
> parameter:
>
>     urllib.urlretrieve(img, img[f:])
>                             ^^^^^^^
>
> You may want to change this so that it stores those files in a particular
> directory.  Something like:
>
> urllib.urlretrieve(img, os.path.join("/tmp",
>  img[f+1:]))
>
> may work better.  The main idea is that you should explicitly control
> where the files are being downloaded to.
>
>
> Another bug: you will probably still run into problems because 'img' must
> be a URL, and each 'img' is instead a line that contains a URL.  The
> difference is between having a string like:
>
> http://python.org.
>
> and:
>
> This line contains a url to the python web site: http://python.org.
>
> and images, as far as I can tell, is storing a list of the lines, not the
> image urls.  So you may want to make appropriate changes to imgreg() so
> that it maintains a list of image urls.
>
>

Ok. Version 3. Images are now downloaded
Thank you. This was a double-bug ;-)
I added a second argument for an optional target-directory.
And of course added a line about your help.
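
A minimal sketch of how such a target-directory argument might be turned into
a local file name. nflgrab.py itself is not shown in this thread, so
local_path and its default are hypothetical, and the code is written for
current Python 3 (where urlretrieve lives in urllib.request).

```python
import os
import posixpath
from urllib.parse import urlsplit


def local_path(url, target_dir="."):
    """Build a path inside target_dir from the last component of the URL."""
    name = posixpath.basename(urlsplit(url).path)
    return os.path.join(target_dir, name)


url = "http://images.nfl.com/images/globalnav-shadow-gray.gif"
print(local_path(url, "/tmp"))

# The download itself would then be something like:
#   urllib.request.urlretrieve(url, local_path(url, "/tmp"))
```

Taking only the base name keeps the leading slash of the URL path out of the
file name, which is what sent the earlier version to the root directory.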

Thanks.




Re: [Tutor] Please look at my wordFrequency.py

2005-10-11 Thread Dick Moores
Kent Johnson wrote at 10:37 10/11/2005:
>Kent Johnson wrote:
> > Dick Moores wrote:
> >
> >> Kent Johnson wrote at 03:24 10/11/2005:
> >>
> >>> Dick Moores wrote:
> >>>
> >>>> (Execution took about 30 sec. with my computer.)
> >>>
> >>>
> >>> That's way too long
> >>
> >>
> >>
> >> How long would you expect? I've already made some changes but haven't
> >> seen the time change much.
> >
> >
> > A couple of seconds at most, unless you are running it on some dog
> > computer. It's just not that much text and you should be able to process
> > it in a couple of passes at most.
>
>OK I couldn't resist. I took your program and ran it on my computer - 
>took about 38 seconds and got the same results as you. Then I made the 
>changes I outlined, and a few other similar ones, and got it down to 34 secs.

Yes, that's about the difference I was seeing. Thanks for taking the 
trouble. I went from 30 to 27. With no regex use (don't understand it yet).

>  Finally I made the change suggested by John Fouhy - to accumulate the 
> counts in a dict - and the time went down to 0.23 seconds.

WOW! I didn't implement John's change because I didn't understand it. 
Haven't dealt with dictionaries yet.

> >> Also, I'm puzzled that whether or not psyco is employed makes no
> >> difference in the time. Can you explain why?
> >
> >
> > My guess is it's because you have so many O(n^2) elements in the code.
> > You have to get your algorithm to be O(n).

OK, I finally bit the bullet, googled O(n^2), and read about Big O 
notation at  and 

I think I've now got at least the basic idea.

>In particular this code:
>for word in L:
>     k = L.count(word)
>     if (k,word) not in F:
>         F.append((k,word))
>
>L.count() has to scan through the entire list (L) looking for a match 
>with each word. So for each word, you are making 26140 string compares. 
>The total number of compares is 26140*26140 or 683,299,600. That's a lot!
>Then, for each word, you scan F for a match. Now you are doing tuple 
>compares. The number of compares will increase as the length of F, but 
>overall it will be about 26140*3700/2 or 48,359,000 compares.

Kent, you're beginning to get thru to me. Thanks for the details and the 
numbers.

>Compare this to the dictionary version which just iterates L once, doing 
>a dictionary lookup and write for each word.
>
>The reason psyco doesn't make much difference is because all the time is 
>spent in list.count() which is already C code.

Ah. But how can I know what is in C code and what isn't? For example, in 
a previous post you say that L.extend(e) is in C, and imply that 
L.append(e) isn't, and that therefore L.extend(e) should be used.

Well, back to Hetland's Beginning Python.

Dick  



Re: [Tutor] Please look at my wordFrequency.py

2005-10-11 Thread Kent Johnson
Dick Moores wrote:
> Kent Johnson wrote at 10:37 10/11/2005:
>>The reason psyco doesn't make much difference is because all the time is 
>>spent in list.count() which is already C code.
> 
> 
> Ah. But how can I know what is in C code and what isn't? For example, in 
> a previous post you say that L.extend(e) is in C, and imply that 
> L.append(e) isn't, and that therefore L.extend(e) should be used.

In general, if you wrote it, it's in Python. If it is built-in - either as part 
of the Python syntax or pretty much anything in chapter 2 of the Python Library 
Reference - it's in C. If it is in the standard lib (anything you import) you 
have to look at the lib module to see how it is implemented - some are in 
Python, some are in C. Both extend() and append() are part of the built-in type
'list' so they are in C. The difference is that to use append() you have to put
a loop around it, and that loop is in your Python code. If you use extend() the
loop is implicit: it runs inside extend() itself, in C.
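
A rough way to see the difference is to time both forms. This is only a
sketch: with_append and with_extend are made-up names, the list size is
arbitrary, and the numbers printed will vary from machine to machine.

```python
import timeit


def with_append(src):
    out = []
    for item in src:       # the loop runs in Python bytecode
        out.append(item)
    return out


def with_extend(src):
    out = []
    out.extend(src)        # the loop runs inside the list type itself
    return out


src = list(range(100000))
assert with_append(src) == with_extend(src)   # same result either way

print("append loop:", timeit.timeit(lambda: with_append(src), number=20))
print("extend:     ", timeit.timeit(lambda: with_extend(src), number=20))
```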

Kent



Re: [Tutor] Please look at my wordFrequency.py

2005-10-11 Thread Andrew P
Just want to add a little something here, because reading over this thread, I think there may have been some confusion: 

Kent wrote:

for e in saveRemovedForLaterL:
    L.append(e)
could be
L.extend(e)

I think he might have meant:

for e in saveRemovedForLaterL:
    L.append(e)
could be
L.extend(saveRemovedForLaterL)

The difference between these is that one is explicitly looping with
Python, accessing each element of the first list one at a time and
appending it to the other one at a time.  Whereas if you call
extend() instead, it will be doing that looping for you inside the
extend() method, written in C, and very quickly indeed.  I would
worry less about what is written in C and more about what is built in
and does the looping for you implicitly.  Any time you can avoid
stepping over a list one element at a time, that is usually the
faster way.

>>> lst = [1,2,3,4]
>>> lst2 = [5,6,7,8]
>>> lst.extend(lst2)
>>> print lst
[1, 2, 3, 4, 5, 6, 7, 8]

Compare this to:

>>> lst = [1,2,3,4]
>>> lst2 = [5,6,7,8]
>>> for num in lst2:
...     lst.append(num)
...
>>> print lst
[1, 2, 3, 4, 5, 6, 7, 8]


FWIW, append is generally used to generate a list from something calculated on the fly:

>>> lst = []
>>> for i in range(10):
...     lst.append(i)
...     
>>> print lst
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


Ah. But how can I know what is in C code and what isn't? For example, in
a previous post you say that L.extend(e) is in C, and imply that
L.append(e) isn't, and that therefore L.extend(e) should be used.

Well, back to Hetlands' Beginning Python.

Dick




Re: [Tutor] Please look at my wordFrequency.py

2005-10-11 Thread Kent Johnson
Andrew P wrote:
> Just want to add a little something here, because reading over this 
> thread, I think there may have been some confusion:
> 
> Kent wrote:
> 
> for e in saveRemovedForLaterL:
>L.append(e)
> could be
> L.extend(e)
> 
> I think he might have meant:
> 
> for e in saveRemovedForLaterL:
>L.append(e)
> could be
> L.extend(saveRemovedForLaterL)

Right you are, thanks for the catch!

Kent

> 
> The difference between these is that one is explicitly looping with 
> Python, accessing each element of the first list one at a time, 
> appending it to the other one at a time.  Whereas if you call extend() 
> instead, it will be doing that looping for you with the extend() method, 
> written in C, and very quickly indeed.  I would worry less about what is 
> written in C vs what is built-in, and does implicit looping for you.  
> Any time you can avoid stepping over a list one element at a time, is 
> usually the faster way.
> 
>  >>> lst = [1,2,3,4]
>  >>> lst2 = [5,6,7,8]
>  >>> lst.extend(lst2)
>  >>> print lst
> [1, 2, 3, 4, 5, 6, 7, 8]
> 
> Compare this to:
> 
>  >>> lst = [1,2,3,4]
>  >>> lst2 = [5,6,7,8]
>  >>> for num in lst2:
> ... lst.append(num)  
>  >>> print lst
> [1, 2, 3, 4, 5, 6, 7, 8]
> 
> 
> FWIW, append is generally used to generate a list from something 
> calculated on the fly:
> 
>  >>> lst = []
>  >>> for i in range(10):
> ... lst.append(i)
> ...
>  >>> print lst
> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
> 
> 
> 
> Ah. But how can I know what is in C code and what isn't? For example, in
> a previous post you say that L.extend(e) is in C, and imply that
> L.append(e) isn't, and that therefore L.extend(e) should be used.
> 
> Well, back to Hetlands' Beginning Python.
> 
> Dick
> 
> 
> 
> 



Re: [Tutor] code improvement for beginner ?

2005-10-11 Thread Scott Oertel
lmac wrote:

> ---
>
>The problem with downloading the images is this:
>
>-
>http://images.nfl.com/images/globalnav-shadow-gray.gif
>Traceback (most recent call last):
>  File "/home/internet/bin/nflgrab.py", line 167, in ?
>urllib.urlretrieve(img,img[f:])
>  File "/usr/lib/python2.3/urllib.py", line 83, in urlretrieve
>return _urlopener.retrieve(url, filename, reporthook, data)
>  File "/usr/lib/python2.3/urllib.py", line 216, in retrieve
>tfp = open(filename, 'wb')
>IOError: [Errno 13] Permission denied: '/globalnav-shadow-gray.gif'
>-
>
>Is there any solution to know if i can download the image ?
>
>Thanks.
>
>
>
I used this in my previous program while having problems writing files.

from sys import argv
from os.path import dirname, join

filename = 'globalnav-shadow-gray.gif'  # example name; use your own

try:
    # open the file next to the running script rather than in '/'
    tfp = open(join(dirname(argv[0]), filename), 'wb')
except IOError:
    print("Unable to write file, pick another directory")

This will save it into the directory that your program resides in.

You might want to check out the current library docs:
http://www.python.org/doc/current/lib/os-file-dir.html
The os module has a number of functions there that are good for testing
whether a directory is writable.
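
As a sketch of one such test, os.access() can be asked whether the current
user may write to a directory. can_write_to is a hypothetical helper name, and
the check is only advisory (most meaningful on POSIX systems): permissions can
change between the check and the write, so the try/except around open() is
still worth keeping.

```python
import os
import tempfile


def can_write_to(directory):
    """True if `directory` exists and appears writable by this user."""
    # W_OK: may create files there; X_OK: may enter the directory.
    return os.path.isdir(directory) and os.access(directory, os.W_OK | os.X_OK)


print(can_write_to(tempfile.gettempdir()))     # normally True
print(can_write_to("/no/such/directory/xyz"))  # False: it does not exist
```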






