Re: [Tutor] Error in Game
On 7/2/2013 4:22 PM, Jack Little wrote:
> I know the code is correct

As Joel said, how could it be, since you do not get the desired results?

When posting questions, tell us:
- what version of Python?
- what operating system?
- what you use to edit (write) your code
- what you do to run your code

Also copy and paste the execution.

> , but it doesn't send the player to the shop. Here is the code:
>
> def lvl3_2():
>     print "You beat level 3!"
>     print "Congratulations!"
>     print "You have liberated the Bristol Channel!"
>     print "[Y] to go to the shop or [N] to advance."
>     final1=raw_input(">>")
>     if final1.lower()=="y":
>         shop2()
>     elif final1.lower()=="n":
>         lvl4()

It is a good idea to add an else clause to handle the case where the user's entry does not match the if or elif tests.

It is not a good idea to use recursion to navigate a game structure. It is better to have each function return to a main program, and have the main program determine the next step and invoke it.

> Help?

Since we are volunteers, the more you tell us the easier it is for us to do that.

--
Bob Gailer
919-636-4239
Chapel Hill NC
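For reference, a minimal sketch of the "return to a main loop" structure Bob describes. The level and shop names echo the original post; the dispatch table and everything else are hypothetical:

    # Each scene function returns the *name* of the next scene instead of
    # calling it directly, so there is no recursion.
    def lvl3_2():
        print "You beat level 3!"
        print "[Y] to go to the shop or [N] to advance."
        choice = raw_input(">>").lower()
        if choice == "y":
            return "shop2"      # tell the main loop where to go next
        elif choice == "n":
            return "lvl4"
        else:
            print "Please enter Y or N."
            return "lvl3_2"     # replay this scene instead of recursing

    def shop2():
        print "Welcome to the shop."
        return "lvl4"

    def lvl4():
        print "Level 4 begins..."
        return None             # None ends the game

    scenes = {"lvl3_2": lvl3_2, "shop2": shop2, "lvl4": lvl4}

    current = "lvl3_2"
    while current is not None:  # the main loop decides what runs next
        current = scenes[current]()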
Re: [Tutor] memory consumption
On 04/07/13 04:17, Andre' Walker-Loud wrote:
> Hi All, I wrote some code that is running out of memory.

How do you know? What are the symptoms? Do you get an exception? Computer crashes? Something else?

> It involves a set of three nested loops, manipulating a data file (array) of dimension ~ 300 x 256 x 1 x 2.

Is it a data file, or an array? They're different things.

> It uses some third party software, but my guess is I am just not aware of how to use proper memory management and it is not the 3rd party software that is the culprit.

As a general rule, you shouldn't need to worry about such things, at least 99% of the time.

> Memory management is new to me, and so I am looking for some general guidance. I had assumed that reusing a variable name in a loop would automatically flush the memory by just overwriting it. But this is probably wrong. Below is a very generic version of what I am doing. I hope there is something obvious I am doing wrong or not doing which I can do to dump the memory in each cycle of the innermost loop. Hopefully, what I have below is meaningful enough, but again, I am new to this, so we shall see.

Completely non-meaningful.

> # generic code skeleton
> # import a class I wrote to utilize the 3rd party software
> import my_class

Looking at the context here, "my_class" is a misleading name, since it's actually a module, not a class.

> # instantiate the function do_stuff
> my_func = my_class.do_stuff()

This is getting confusing. Either you've oversimplified your pseudo-code, or you're using words in ways that do not agree with standard terminology. Or both. You don't instantiate functions, you instantiate a class, which gives you an instance (an object), not a function. So I'm lost here -- I have no idea what my_class is (possibly a module?), or do_stuff (possibly a class?), or my_func (possibly an instance?).

> # I am manipulating a data array of size ~ 300 x 256 x 1 x 2
> data = my_data  # my_data is imported just once and has the size above

Where, and how, is my_data imported from? What is it? You say it is "a data array" (what sort of data array?) of size 300 x 256 x 1 x 2 -- that's a four-dimensional array with 153600 entries. What sort of entries? Is that 153600 bytes (about 150K), or 153600 64-bit floats (about 1.2 MB)? Or 153600 data structures, each one holding 1 MB of data (about 153 GB)?

> # instantiate a 3d array of size 20 x 10 x 10 and fill it with all zeros
> my_array = numpy.zeros([20,10,10])

At last, we finally see something concrete! A numpy array. Is this the same sort of array used above?

> # loop over parameters and fill array with desired output
> for i in range(loop_1):
>     for j in range(loop_2):
>         for k in range(loop_3):

How big are loop_1, loop_2 and loop_3? You should consider using xrange() rather than range(). If the number is very large, xrange will be more memory efficient.

> # create tmp_data that has a shape which is the same as data, except the first
> # dimension can range from 1 - 1024 instead of being fixed at 300
> ''' Is the next line where I am causing memory problems? '''
> tmp_data = my_class.chop_data(data,i,j,k)

How can we possibly tell if chop_data is causing memory problems when you don't show us what chop_data does?

> my_func(tmp_data)
> my_func.third_party_function()

Again, no idea what they do.

> my_array([i,j,k]) = my_func.results() # this is just a floating point number
> ''' should I do something to flush tmp_data? '''

No. Python will automatically garbage collect it as needed.

Well, that's not quite true. It depends on what tmp_data actually is. So, *probably* no. But without seeing the code behind tmp_data, I cannot be sure.

--
Steven
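To make the range()/xrange() point concrete, here is a small Python 2 sketch (not from the original post; the printed sizes are rough and platform-dependent):

    import sys

    n = 10 ** 7
    big_list = range(n)       # a real list: its pointer array alone is tens of MB
                              # on a 64-bit build, plus the int objects themselves
    lazy = xrange(n)          # a small, constant-size object that yields indices

    print sys.getsizeof(big_list)   # large
    print sys.getsizeof(lazy)       # a few dozen bytes

    total = 0
    for i in xrange(n):       # iterates without ever materialising a big list
        total += i
    print total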
Re: [Tutor] memory consumption
On 07/03/2013 02:17 PM, Andre' Walker-Loud wrote:
> Hi All, I wrote some code that is running out of memory.

And you know this how? What OS are you using, and specifically how is it telling you that you've run out of memory? And while you're at it, what version of Python? And are the OS and Python 32 and 32 bit, or 64 and 64, or mixed?

> It involves a set of three nested loops, manipulating a data file (array) of dimension ~ 300 x 256 x 1 x 2. It uses some third party software, but my guess is I am just not aware of how to use proper memory management and it is not the 3rd party software that is the culprit.

In particular you're using numpy, and I don't know its quirks. So I'll just speak of Python in general, and let someone else address numpy.

> Memory management is new to me, and so I am looking for some general guidance. I had assumed that reusing a variable name in a loop would automatically flush the memory by just overwriting it.

It could be useful to learn how Python memory is manipulated. To start with, the 'variable' doesn't take a noticeable amount of space. It's the object it's bound to that might take up lots of space, directly or indirectly. When you bind a new object to it, you free up the last one, unless something else is also bound to it.

By "indirectly", I refer to something like a list, which is one object, but which generally is bound to dozens or millions of others, and any of those may be bound to lots of others. Unbinding the list will usually free up all that stuff.

The other thing that can happen is an object may indirectly be bound to itself. Trivial example:

>>> mylist = [1,2]
>>> mylist.append(mylist)
>>> mylist
[1, 2, [...]]
>>>

Fortunately for us, the repr() display of mylist doesn't descend infinitely into the guts of the elements, or it would be still printing next week (or until the printing logic ran out of memory). Anyway, once you have such a binding loop, the simple memory-freeing logic (refcount) has to defer to the slower and less frequently run gc (garbage collector).

> But this is probably wrong. Below is a very generic version of what I am doing. I hope there is something obvious I am doing wrong or not doing which I can do to dump the memory in each cycle of the innermost loop. Hopefully, what I have below is meaningful enough, but again, I am new to this, so we shall see.
>
> # generic code skeleton
> # import a class I wrote to utilize the 3rd party software
> import my_class
>
> # instantiate the function do_stuff
> my_func = my_class.do_stuff()

So this is a class-static method which returns a callable object? One with methods of its own?

> # I am manipulating a data array of size ~ 300 x 256 x 1 x 2
> data = my_data  # my_data is imported just once and has the size above
>
> # instantiate a 3d array of size 20 x 10 x 10 and fill it with all zeros
> my_array = numpy.zeros([20,10,10])
>
> # loop over parameters and fill array with desired output
> for i in range(loop_1):
>     for j in range(loop_2):
>         for k in range(loop_3):
>             # create tmp_data that has a shape which is the same as data,
>             # except the first dimension can range from 1 - 1024 instead
>             # of being fixed at 300
>             ''' Is the next line where I am causing memory problems? '''

Hard to tell. Is chop_data() a trivial function you could have posted? It's a class method, not an instance method. Is it keeping references to the data it's returning? Perhaps for caching purposes?

>             tmp_data = my_class.chop_data(data,i,j,k)
>             my_func(tmp_data)
>             my_func.third_party_function()
>             my_array([i,j,k]) = my_func.results() # this is just a floating point number
>             ''' should I do something to flush tmp_data? '''

You don't show us any code that would cause me to suspect tmp_data.

You leave out so much that it's hard to know what parts to ask you to post. If data is a numpy array, and my_class.chop_data is a class method, perhaps you could post that class method.

Do you have a tool for your OS that lets you examine memory usage dynamically? If you do, sometimes it's instructive to watch while a program is running to see what the dynamics are.

Note that Python, like nearly any other program written with the C library, will not necessarily free memory all the way back to the OS at any particular moment in time. If you (a C programmer) were to malloc() a megabyte block and immediately free it, you might not see the free externally, but new allocations would instead be carved out of that freed block. Those specifics vary with OS and with C compiler, and may very well vary with the size of the block. Thus individual blocks over a certain size may be allocated directly from the OS and freed immediately when done, while smaller blocks are coalesced in the library and reused over and over.
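As an illustration of the refcount-versus-cycle distinction described above, a small Python 2 sketch (the data is purely hypothetical):

    import gc

    # Rebinding a name frees the old object immediately (refcounting),
    # as long as nothing else refers to it.
    data = range(10 ** 6)       # a large list
    data = range(10)            # the million-entry list is freed right here

    # A reference cycle is different: refcounts never reach zero, so the
    # memory waits for the cyclic garbage collector.
    a = []
    a.append(a)                 # the list refers to itself
    del a                       # the name is gone, but the cycle keeps the list alive
    unreachable = gc.collect()  # the collector finds and frees the cycle
    print "objects collected:", unreachable   # typically >= 1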
Re: [Tutor] memory consumption
On 03/07/13 19:17, Andre' Walker-Loud wrote:

Your terminology is all mixed up and therefore does not make sense. We definitely need to know more about the my_class module and do_stuff.

> # generic code skeleton
> # import a class I wrote to utilize the 3rd party software
> import my_class

This imports a module, which may contain some code that you wrote, but...

> # instantiate the function do_stuff
> my_func = my_class.do_stuff()

You don't instantiate functions, you call them. You are setting my_func to be the return value of do_stuff(). What is that return value? What does my_func actually refer to?

> my_array = numpy.zeros([20,10,10])
>
> # loop over parameters and fill array with desired output
> for i in range(loop_1):
>     for j in range(loop_2):
>         for k in range(loop_3):
>             # create tmp_data that has a shape which is the same as data,
>             # except the first dimension can range from 1 - 1024 instead
>             # of being fixed at 300
>             ''' Is the next line where I am causing memory problems? '''
>             tmp_data = my_class.chop_data(data,i,j,k)

Again we must guess what the chop_data function is returning. Some sample data would be useful here.

>             my_func(tmp_data)

Here you call a function but do not store any return values. Or are you using global variables somewhere?

>             my_func.third_party_function()

But now you are accessing an attribute of my_func. What is my_func? Is it a function or an object? We cannot begin to guess what is going on without knowing that.

>             my_array([i,j,k]) = my_func.results() # this is just a floating point number
>             ''' should I do something to flush tmp_data? '''

No idea, you haven't begun to give us enough information.

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
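To illustrate the distinction Alan is drawing between instantiating a class and calling a function, a hypothetical Python 2 sketch (none of these names come from the original post):

    class DoStuff(object):
        """Instantiating this class gives an object with its own methods."""
        def __call__(self, data):
            self.last = sum(data)
        def results(self):
            return self.last

    my_func = DoStuff()        # an instance: callable, and it has .results()
    my_func([1, 2, 3])
    print my_func.results()    # 6

    def do_stuff():
        """A plain function: calling it only gives you the return value."""
        return 42

    value = do_stuff()         # value is just 42; it has no .results() method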
[Tutor] Cleaning up output
I've written my first program to take a given directory and look in all directories below it for duplicate files (duplicate being defined as having the same MD5 hash, which I know isn't a perfect solution, but for what I'm doing is good enough).

My problem now is that my output file is a rather confusing jumble of paths and I'm not sure of the best way to make it more user readable. My gut reaction would be to go through and list by first directory, but is there a logical way to do it so that all the groupings that have files in the same two directories would be grouped together? So I'm thinking I'd have:

First File Dir /some/directory/
  Duplicate directories:
    some/other/directory/
      Original file 1, duplicate file 1
      Original file 2, duplicate file 2
    some/third directory/
      Original file 3, duplicate file 3

and so forth, where the original file would be the file name in the first directory, so that side of each grouping is always the same.

I fear I'm not explaining this well, but I'm hoping someone can either ask questions to help get out of my head what I'm trying to do, or can decipher this enough to help me. Here's a git repo of my code if it helps: https://github.com/CyberCowboy/FindDuplicates
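One possible reading of the grouping being asked for, as a hedged Python 2 sketch (the paths and the group_by_directory helper are invented for illustration, not taken from the linked repo): given groups of identical files, regroup them by the set of directories each group spans, so groups that involve the same two directories are reported together.

    import os
    from collections import defaultdict

    def group_by_directory(dupe_groups):
        """Regroup duplicate-file groups by the set of directories they span.

        dupe_groups: list of lists of paths, each inner list being one set
        of identical files (e.g. the values of a hash -> paths dict).
        """
        by_dirs = defaultdict(list)
        for paths in dupe_groups:
            dirs = tuple(sorted(set(os.path.dirname(p) for p in paths)))
            by_dirs[dirs].append(sorted(paths))
        return by_dirs

    # Hypothetical input: two duplicate pairs spanning the same two
    # directories, and one pair spanning a third directory.
    groups = [
        ["/some/directory/a.txt", "/some/other/directory/a.txt"],
        ["/some/directory/b.txt", "/some/other/directory/b.txt"],
        ["/some/directory/c.txt", "/some/third directory/c.txt"],
    ]

    for dirs, pairs in sorted(group_by_directory(groups).items()):
        print "Directories:", ", ".join(dirs)
        for paths in pairs:
            print "   ", ", ".join(os.path.basename(p) for p in paths)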
Re: [Tutor] Cleaning up output
On 07/03/2013 03:51 PM, bja...@jamesgang.dyndns.org wrote:
> I've written my first program to take a given directory and look in all directories below it for duplicate files (duplicate being defined as having the same MD5 hash, which I know isn't a perfect solution, but for what I'm doing is good enough).

This is a great first project for learning Python. It's a utility which doesn't write any data to the disk (other than the result file), and therefore bugs won't cause much havoc. Trust me, you will have bugs; we all do. One of the things experience teaches you is how to isolate the damage that bugs do before they're discovered.

> My problem now is that my output file is a rather confusing jumble of paths and I'm not sure the best way to make it more user readable. My gut reaction would be to go through and list by first directory, but is there a logical way to do it so that all the groupings that have files in the same two directories would be grouped together?

I've run into the same "presentation problem" with my own similar utilities. Be assured, there's no one "right answer."

First question: have you considered what you want when there are MORE than two copies of one of those files? When you know what you'd like to see if there are four identical files, you might have a better idea what you should do even for two. Additionally, consider that two identical files may be in the same directory, with different names. Anyway, if you can explain why you want a particular grouping, we might better understand how to accomplish it.

> So I'm thinking I'd have:
>
> First File Dir /some/directory/
>   Duplicate directories:
>     some/other/directory/
>       Original file 1, duplicate file 1
>       Original file 2, duplicate file 2
>     some/third directory/
>       Original file 3, duplicate file 3

At present, this First File Dir could be any of the directories involved; without some effort, os.walk doesn't promise you any order of processing. But if you want them to appear in sorted order, you can do sorts at key points inside your os.walk code, and they'll at least come out in an order that's recognizable. (Some OS's may also sort things they feed to os.walk, but you'd do better not to count on it.) You also could sort each list in itervalues() of hashdict, after the dict is fully populated.

Even with sorting, you run into the problem that there may be duplicates between some/other/directory and some/third/directory that are not in /some/directory. So in the sample you show above, they won't be listed with the ones that are in /some/directory.

> and so forth, where the Original file would be the file name in the First File Dir, so that all the ones are the same there. I fear I'm not explaining this well but I'm hoping someone can either ask questions to help get out of my head what I'm trying to do or can decipher this enough to help me. Here's a git repo of my code if it helps: https://github.com/CyberCowboy/FindDuplicates

At 40 lines, you should have just included it. It's usually much better to include the code inline if you want any comments on it. Think of what the archives are going to show in a year, when you've removed that repo or thoroughly updated it. Somebody at that time will not be able to make sense of comments directed at the current version of the code. BTW, thanks for posting as text, since that'll mean that when you do post code, it shouldn't get mangled.

So I'll comment on the code. You never call the dupe() function, so presumably this is a module intended to be used from some place else. But if that's the case, I would have expected it to be factored better, at least to separate the input processing from the output file formatting. That way you could re-use the dups logic and provide a new save formatting without duplicating anything. The first function could return the hashdict, and the second one could analyze it to produce a particular formatted output.

The hashdict and dups variables should be initialized within the function, since they are not going to be used outside. Avoid non-const globals. And of course once you factor it, dups will be in the second function only.

You do have an if __name__ == "__main__": line, but it's inside the function. Probably you meant it to be at the left margin. And importing inside a conditional is seldom a good idea, though it doesn't matter here since you're not using the import. Normally you want all your imports at the top, so they're easy to spot. You also probably want a call to dupe() inside the conditional. And perhaps some parsing of argv to get rootdir.

You don't mention your OS, but many OS's have symbolic links or the equivalent. There's no code here to handle that possibility. Symlinks are a pain to do right. You could just add it to your docs, that no symlink is allowed under the rootdir.

Your open() call has no mode switch. If you want the MD5 to be "correct", it should open the file in binary mode ('rb').
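A rough sketch of the factoring described above (the names dupe, hashdict and dups echo the discussion; the implementation itself is hypothetical, and it reads each file whole, which is fine only for a sketch): one function builds the hash-to-paths dict, a second formats the report, and the __main__ guard at the left margin wires them together.

    import hashlib
    import os
    import sys

    def find_dupes(rootdir):
        """Build and return the hash -> list-of-paths dict (no globals)."""
        hashdict = {}
        for dirpath, dirnames, filenames in os.walk(rootdir):
            for name in filenames:
                path = os.path.join(dirpath, name)
                with open(path, 'rb') as f:       # binary mode, so the MD5 is byte-exact
                    digest = hashlib.md5(f.read()).hexdigest()
                hashdict.setdefault(digest, []).append(path)
        return hashdict

    def format_report(hashdict):
        """Separate concern: turn the dict into the text of the report."""
        lines = []
        for digest, paths in sorted(hashdict.items()):
            if len(paths) > 1:                    # only report real duplicates
                lines.append(digest)
                lines.extend("    " + p for p in sorted(paths))
        return "\n".join(lines)

    if __name__ == "__main__":                    # at the left margin
        rootdir = sys.argv[1] if len(sys.argv) > 1 else "."
        print format_report(find_dupes(rootdir))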
Re: [Tutor] memory consumption
On 03/07/13 20:50, Andre' Walker-Loud wrote:
>> # loop over parameters and fill array with desired output
>> for i in range(loop_1):
>>     for j in range(loop_2):
>>         for k in range(loop_3):
>>
>> How big are loop_1, loop_2, loop_3?
>
> The sizes of the loops are not big
>
> len(loop_1) = 20
> len(loop_2) = 10
> len(loop_3) = 10

This is confusing. The fact that you are getting values for the len() of these variables suggests they are some kind of collection? But you are using them in range(), which expects numbers.

What kind of things are loop_1 etc.? What happens at the >>> prompt if you try:

>>> print range(loop_1)

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
Re: [Tutor] memory consumption
On 04/07/13 08:11, Andre' Walker-Loud wrote:
> Yes, I was being sloppy. My later post clarified what I meant. The loops are really lists, and I was really using enumerate() to get both the iter and the element.
>
> loop_2 = [1,2,4,8,16,32,64,128,256,512,1024]
> for i,n in enumerate(loop_2):
>     ...

Please be careful about portraying yourself as less experienced/more naive than you really are. Otherwise we end up wasting both our time and yours telling you to do things that you're already doing.

Have you googled for "Python memory leak os-x"? When I do, I find a link to this numpy bug: https://github.com/numpy/numpy/issues/2969

--
Steven
Re: [Tutor] memory consumption
On 04/07/13 09:24, Oscar Benjamin wrote:
> On 3 July 2013 23:37, Andre' Walker-Loud wrote:
>> Hi Oscar,
>
> Hi Andre',
>
> (your name shows in my email client with an apostrophe ' after it; I'm not sure if I'm supposed to include that when I write it).

I expect that it's meant to be André, since that is the "correct" spelling even in English (however, a lot of non-French Andrés drop the accent).

--
Steven