Re: [Tutor] Do you use unit testing?

2009-11-17 Thread Alan Gauld


"Modulok"  wrote


How many of you guys use unit testing as a development model, or at
all for that matter?


Personally I use unit testing for anything that I'm getting paid for.
Occasionally I use it for something I'm doing for myself, but only
if it's quite big - several modules and over, say, 500 lines of code.
So for small single-file utilities etc. that I only use myself I wouldn't
bother.



I just started messing around with it and it seems painfully slow to
have to write a test for everything you do.


It's a lot more painful and slow to have to debug subtle faults in
code you wrote 6 months ago and that don't throw exceptions
but just quietly corrupt your data...

The real benefits of unit tests are:
a) You test stuff as you write it so you know where the bug must lie
b) You can go back and run the tests again 6 months later,
c) if you have to add a new test it's easy to make sure you don't break
what used to work (regression testing).
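
As an illustration of points a) to c), here is a minimal sketch using the standard unittest module. The average() function and the test names are made up for this example; they do not come from the thread. Written for Python 3.

```python
import unittest


def average(values):
    """Return the arithmetic mean of a non-empty sequence (example function)."""
    return sum(values) / len(values)


class AverageTests(unittest.TestCase):
    def test_simple_mean(self):
        self.assertEqual(average([1, 2, 3]), 2)

    def test_fractional_mean(self):
        # Catches the "quietly corrupt your data" kind of bug, e.g. a
        # later change that reintroduces integer division.
        self.assertAlmostEqual(average([1, 2]), 1.5)

    def test_empty_input_is_an_error(self):
        with self.assertRaises(ZeroDivisionError):
            average([])

# Run with:  python -m unittest <module name>
```

Keeping the file around means the same three checks can be re-run unchanged after any later edit to average(), which is all that regression testing amounts to at this scale.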

HTH,


--
Alan Gauld
Author of the Learn to Program web site
http://www.alan-g.me.uk/ 



___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Do you use unit testing?

2009-11-17 Thread Alan Gauld


"Stephen Nelson-Smith"  wrote 


As a discipline - work out what we want to test, write the test, watch
it fail, make it pass - I find this a very productive way to think and
work.


Warning:
It can be seductively addictive and lead to very bad code structure.
You start focussing on just passing the test rather than the overall 
structure and design of the solution. Make sure that once it passes 
the tests you go back and refactor the code to be clean and maintainable.


Then test it again of course!

HTH,


--
Alan Gauld
Author of the Learn to Program web site
http://www.alan-g.me.uk/



Re: [Tutor] pyce, scripts complete too fast

2009-11-17 Thread Alan Gauld


"rick"  wrote


Now I'm back to the problem that a Windows newbie would have, the
scripts complete before I can see what happened!  I can get right to the
interactive mode, or I can run a script, but not sure where I'd find the
terminal.


Have a look at the "Note for Windows Users" in the topic "Add a little style"
in my tutorial. It offers several ways of dealing with this issue.


--
Alan Gauld
Author of the Learn to Program web site
http://www.alan-g.me.uk/ 





Re: [Tutor] Do you use unit testing?

2009-11-17 Thread spir
On Mon, 16 Nov 2009 13:54:26 -0700,
Modulok  stated:

> List,
> 
> A general question:
> 
> How many of you guys use unit testing as a development model, or at
> all for that matter?


I do. Systematically. It has become a reflex.
I would rather ask the opposite question: how do you know a part of your code 
actually works if you don't test it?

What I do not do is write the tests before developing. Probably because my 
style of programming is rather "exploratory", or incremental: I discover and 
expand while developing, so that I cannot predict what a relevant test may be. 
Test-driven development relies on a full, accurate, detailed specification even 
before design; actually, pre-development test code forms a kind of specification.

>  I just started messing around with it and it seems painfully slow to
> have to write a test for everything you do. Thoughts, experiences,
> pros, cons?

Yes, you're right: "it seems painfully slow". It's an illusion. Just like the 
illusion that writing "user_account" rather than "usr_acnt" will slow your 
development pace. (I read once that developers spend less than 1% of their 
time actually typing.) There's something wrong in human brains ;-)

Writing a test func for a bit of code lets you:
* understand your code better
* find flaws and bugs before even testing
* find remaining flaws and bugs (but not all)
* validate your code (mentally, if only for yourself)
* validate changes (possibly in other code parts, related to this one)

It's much faster and better than leaving unfinished code with "bombs" 
everywhere that will randomly explode. Especially because numerous bugs won't 
look like they are related to their actual source.

> Just looking for input and different angles on the matter, from the
> Python community.
> -Modulok-

Good question, imo!

Denis

* la vita e estrany *

http://spir.wikidot.com/





Re: [Tutor] search the folder for the 2 files and extract out only theversion numbers 1.0 and 2.0.

2009-11-17 Thread Alan Gauld


"MARCUS NG"  wrote


what is the best way to look into the allFiles directory, and only search
the folder for the 2 files myTest1.0.zip and myTest2.0.zip to extract out
only the version numbers 1.0 and 2.0.


You probably want to use the glob.glob() function to search using
a pattern like myTest*.zip.

Then you can iterate over the list of names returned and examine the
version number using normal string functions (or regular expressions).

so I did think of listing the file names in an array only to realise that I
wouldn't know which index to query if this script is going to be automated.


In Python you hardly ever need to query the index to extract items
from a list, instead you use a for loop to look at each item in turn.

for filename in glob.glob(pattern):
    version = extractVersion(filename)
    # etc...
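
A possible way to fill in the loop above, as a sketch only: the extract_version() and find_versions() helpers and the regular expression are assumptions made for this example, based on the two file names given in the question (myTest1.0.zip and myTest2.0.zip). Written for Python 3.

```python
import glob
import os
import re

# Assumed naming scheme: "myTest" + dotted version number + ".zip"
VERSION_RE = re.compile(r"myTest(\d+(?:\.\d+)*)\.zip$")


def extract_version(filename):
    """Return the version embedded in a name like myTest1.0.zip, else None."""
    match = VERSION_RE.search(os.path.basename(filename))
    return match.group(1) if match else None


def find_versions(directory):
    """Collect the version strings of all myTest*.zip files in directory."""
    pattern = os.path.join(directory, "myTest*.zip")
    versions = []
    for filename in sorted(glob.glob(pattern)):
        version = extract_version(filename)
        if version is not None:      # skip names that match the glob only
            versions.append(version)
    return versions
```

Calling find_versions("allFiles") would then give a list of strings with no index bookkeeping needed; names that match the glob but carry no version number are simply skipped.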

HTH,


--
Alan Gauld
Author of the Learn to Program web site
http://www.alan-g.me.uk/ 





Re: [Tutor] Unexpected Result in Test Sequence

2009-11-17 Thread Alan Gauld


 wrote in message

I'm running a test to find what the experimental average of a d20 is, and 
came across a strange bug in my code.

import random
list1 = []
def p():
    d = 0
    for number in range(1,1000):
        t = random.randrange(1,19)
        list1.append(t)
    for value in list1:
        d+=value
    print d/1000
    d = 0
for value in range(1,100):
    p()

It works, but I have a logic error somewhere. It runs, and the results 
have a pattern :


It just adds 10, and every second result, subtracts 1, till it gets to 0, 
and then starts again with 9 in singles, and whatever in the 10's, etc.


I'm not exactly sure what you expect but given you are finding the
average of a set of numbers between 1,19 I'd expect the result to be 9.5.
In other words the integer value will vary between 9 and 10, which is what
is happening.

Can you show us a segment of data that manifests the problem
more clearly than the set you have included?
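
One thing worth checking in the posted code (an observation on the listing, not part of the reply above): list1 is created once at module level and never emptied, so every call to p() sums all the rolls made so far while still dividing by 1000. That alone would make the printed value grow by roughly 9.5 per call. In addition, range(1,1000) yields 999 values, randrange(1,19) never returns 19 or 20, and Python 2's / truncates integers. A minimal Python 3 sketch that avoids these points; average_roll is a hypothetical name:

```python
import random


def average_roll(rolls=1000, sides=20, rng=random):
    """Return the mean of `rolls` throws of a `sides`-sided die."""
    # A fresh list on every call, so earlier experiments cannot leak in.
    results = [rng.randint(1, sides) for _ in range(rolls)]  # 1..sides inclusive
    return sum(results) / rolls      # true division keeps the fraction


def run_experiments(count=5):
    """Repeat the experiment `count` times and return the averages."""
    return [average_roll() for _ in range(count)]
```

With a fair d20 the averages should hover around 10.5 rather than climbing from call to call.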

HTH,


--
Alan Gauld
Author of the Learn to Program web site
http://www.alan-g.me.uk/ 





Re: [Tutor] Do you use unit testing?

2009-11-17 Thread spir
On Tue, 17 Nov 2009 08:01:32 -,
"Alan Gauld"  stated:

Hello, Alan,

> The real  benefits of unit tests are:
> a) You test stuff as you write it so you know where the bug must lie

Yes! (tried to say the same, much less clearly ;-)

> b) You can go back and run the tests again 6 months later,
> c) if you have to add a new test its easy to make sure you don't break
>  what used to work (regression testing).

Do not understand this point. Would you be kind and expand on this topic? Have 
never heard of regression testing (will google it, but would appreciate your 
words, too.) (I guess it may be relevant for incremental development.)

Denis

* la vita e estrany *

http://spir.wikidot.com/





Re: [Tutor] Do you use unit testing? PS

2009-11-17 Thread spir
On Tue, 17 Nov 2009 09:56:23 +0100,
spir  stated:


> > b) You can go back and run the tests again 6 months later,
> > c) if you have to add a new test its easy to make sure you don't break
> >  what used to work (regression testing).
> 
> Do not understand this point. Would you be kind and expand on this topic?
> Have never heard of regression testing (will google it, but would
> appreciate your words, too.) (I guess it may be relevant for incremental
> development.)

Sorry, sent my post too fast:
http://en.wikipedia.org/wiki/Regression_testing 

Denis

* la vita e estrany *

http://spir.wikidot.com/





Re: [Tutor] pyce, scripts complete too fast

2009-11-17 Thread Dave Angel

rick wrote:

Perhaps the wrong list, but I'll ask anyway.  So I'm in the middle of
starting, yet again, to learn some programming.   I have 2.6.4 and 3.1
installed on the desktop (Ubuntu 9.10), 2.6.4 installed on the netbook
(UNR 9.10).   


I was thinking to be ultra portable, I'd put python on the pocket pc.
Now I'm back to the problem that a Windows newbie would have, the
scripts complete before I can see what happened!  I can get right to the
interactive mode, or I can run a script, but not sure where I'd find the
terminal.

Thanks Alan, I was going through your old tutorial, the pointer to the
new was appreciated.

Rick


  
On Windows, you can get a terminal window (aka DOS box)  by running 
"Command Prompt" from your Start menu (In recent versions it seems to be 
in Start->Accessories).   You can copy that shortcut elsewhere, of 
course.  Or you can run CMD.EXE from anywhere, like the Run menu or 
another DOS box.


Once you have a DOS box, you can set local environment variables, run 
batch files, or do other preparations before beginning your python 
program.  And of course when the python program finishes you can still 
see the results (with some amount of scrolling; buffer size can be 
adjusted).  And once you're in a DOS box, you can get two independent 
ones by running Start from the first.   The new one inherits environment 
variables and current directory from the first, but does not block the 
first one from taking new commands.



You can also create a shortcut that starts a new DOS box, AND runs your 
program.  I haven't done this in so long that I forget the syntax.  But 
it's something like

   cmd.exe  /k  python.exe   script.py

By using the right switch on cmd.exe, the DOS box doesn't go away when 
the python program exits.
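
A different approach, handled inside the script instead of through cmd.exe switches, is to wait for a keypress before exiting. This is only a sketch under assumptions: run_and_pause and PROMPT are made-up names, and whether input() is usable depends on the console the platform provides. Written for Python 3.

```python
import traceback

PROMPT = "Press Enter to close this window..."


def run_and_pause(main, reader=input):
    """Run main(), then wait for Enter so the console window stays open.

    `reader` defaults to the built-in input(); it is a parameter only so
    the pause can be replaced when no interactive console is available.
    """
    try:
        main()
    except Exception:
        # Show the error instead of letting the window vanish with it.
        traceback.print_exc()
    finally:
        reader(PROMPT)

# Typical use at the bottom of a script:
#     run_and_pause(my_main_function)
```

The try/finally means the pause also happens when the script fails, which is usually when seeing the output matters most.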


By the way, you can get rid of a DOS box by typing EXIT at its prompt, 
or by clicking the X on the upper right corner of the window.


DaveA



[Tutor] Use of 'or'

2009-11-17 Thread Stephen Nelson-Smith
A friend of mine mentioned what he called the 'pythonic' idiom of:

print a or b

Isn't this a 'clever' kind of ternary - an if / else kind of thing?

I don't warm to it... should I?

S.


Re: [Tutor] Use of 'or'

2009-11-17 Thread Tim Golden

Stephen Nelson-Smith wrote:

A friend of mine mentioned what he called the 'pythonic' idiom of:

print a or b

Isn't this a 'clever' kind of ternary - an if / else kind of thing?


I would say it's perfectly idiomatic in Python, but
not as a ternary. If you want a ternary use the
(relatively) recently-introduced:

a if <condition> else b

eg:

things = [1, 2 ,3]
print "%d thing%s found" % (len (things), "" if len (things) == 1 else "s")

Before this operator came along, people used to use
more-or-less clever-clever tricks with boolean operators
or using the fact that bool / int are interchangeable:

['s', ''][len (things) == 1]

The "a or b" idiom is most commonly used for default-type
situations:

def f (a, b=None):
 print "a", b or ""



I don't warm to it... should I?


Up to you :) But I regard it as a useful idiom.
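
To make the difference concrete, a small sketch in Python 3 syntax. The function names are invented for this example. Note that "a or b" falls back on any false value (None, "", 0, an empty list), so when 0 or "" are legitimate values an explicit test is safer.

```python
def greet(name=None):
    # "or" returns the first true operand: None and "" both fall back.
    return "Hello, " + (name or "stranger")


def plural(count):
    # The conditional expression chooses on an explicit condition.
    return "%d thing%s found" % (count, "" if count == 1 else "s")


def pick(value, default):
    # Explicit None check: keeps legitimate false values such as 0.
    return default if value is None else value
```

So greet() and greet("") both yield "Hello, stranger", while pick(0, 5) keeps the 0 that "0 or 5" would have thrown away.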

TJG


[Tutor] Introduction - log exercise

2009-11-17 Thread Antonio de la Fuente
Hi everybody,

This is my first post here. I have started learning python and I am new to
programming, just some bash scripting, not much.
Thank you for the kind support and help that you provide in this list.

This is my problem: I've got a log file that is filling up very quickly, this
log file is made of blocks separated by a blank line, inside these blocks there
is a line "foo", I want to discard blocks with that line inside it, and create a
new log file, without those blocks, that will reduce drastically the size of the
log file. 

The log file is gzipped, so I am going to use gzip module, and I am going to pass
the log file as an argument, so sys module is required as well.

I will read lines from file, with the 'for loop', and then I will check them for
'foo' matches with a 'while loop', if matches I (somehow) re-initialise the
list, and if there is no matches for foo, I will append line to the list. When I
get to a blank line (end of block), write myList to an external file. And start
with another line.

I am stuck with defining 'blank line', I don't manage to get through the while
loop, any hint here I will really appreciate.
I don't expect the solution, as I think this is a great exercise to get my feet
wet with python, but if anyone thinks that this is the wrong way of solving the
problem, please let me know.


#!/usr/bin/python

import sys
import gzip

myList = []

# At the moment not bother with argument part as I am testing it with a
# testing log file
#fileIn = gzip.open(sys.argv[1])

fileIn = gzip.open('big_log_file.gz', 'r')
fileOut = open('outputFile', 'a')

for line in fileIn:
    while line != 'blank_line':
        if line == 'foo':
            Somehow re-initialise myList
            break
        else:
            myList.append(line)
    fileOut.writelines(myList)


Somehow rename outputFile with big_log_file.gz

fileIn.close()
fileOut.close()

-

The log file will be filled with:


Tue Nov 17 16:11:47 GMT 2009
	bladi bladi bla
	tarila ri la
	patatin pataton
	tatati tatata

Tue Nov 17 16:12:58 GMT 2009
	bladi bladi bla
	tarila ri la
	patatin pataton
	foo
	tatati tatata

Tue Nov 17 16:13:42 GMT 2009
	bladi bladi bla
	tarila ri la
	patatin pataton
	tatati tatata


etc, etc ,etc
..

Again, thank you.

-- 
-
Antonio de la Fuente Martínez
E-mail: t...@muybien.org
-



Re: [Tutor] Introduction - log exercise

2009-11-17 Thread Wayne Werner
On Tue, Nov 17, 2009 at 10:58 AM, Antonio de la Fuente wrote:

> Hi everybody,
>
> This is my first post here. I have started learning python and I am new to
> programing, just some bash scripting, no much.
> Thank you for the kind support and help that you provide in this list.
>

You're welcome!


>
> This is my problem: I've got a log file that is filling up very quickly,
> this
> log file is made of blocks separated by a blank line, inside these blocks
> there
> is a line "foo", I want to discard blocks with that line inside it, and
> create a
> new log file, without those blocks, that will reduce drastically the size
> of the
> log file.
>
> The log file is gziped, so I am going to use gzip module, and I am going to
> pass
> the log file as an argument, so sys module is required as well.
>
> I will read lines from file, with the 'for loop', and then I will check
> them for
> 'foo' matches with a 'while loop', if matches I (somehow) re-initialise the
> list, and if there is no matches for foo, I will append line to the list.
> When I
> get to a blank line (end of block), write myList to an external file. And
> start
> with another line.
>
> I am stuck with defining 'blank line', I don't manage to get throught the
> while
> loop, any hint here I will really appreciate it.
>

Let me start out by saying this; I'm very impressed with the thought you've
put into the problem and the way you've presented it.

The first thing that pops into my mind is to simply strip whitespace from
the line and check if the line == ''. Upon further experiment there's the
"isspace" method:

In [24]: x = '   \n\n\r\t\t'

In [25]: x.isspace()
Out[25]: True

x contains a bunch of spaces, newlines, and tab chars. From the docs:

" Return True if all characters in S are whitespace
and there is at least one character in S, False otherwise."
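
The two tests mentioned above (strip and compare with '', or isspace) differ in one case, the empty string. A small sketch; is_blank is a made-up helper name:

```python
def is_blank(line):
    """True for an empty string or a string holding only whitespace."""
    return line.strip() == ""


# A line read from a file keeps its trailing newline, so a "blank" line
# is normally "\n", for which both tests agree.
newline_only = "\n"
both_agree = is_blank(newline_only) and newline_only.isspace()

# The empty string is the one case where they differ.
empty_differs = is_blank("") and not "".isspace()
```

The same strip() call is also handy for content lines: a line such as "\tfoo\n" is not equal to 'foo' until it has been stripped.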



> I don't expect the solution, as I think this is a great exercise to get wet
> with python, but if anyone thinks that this is the wrong way of solving the
> problem, please let me know.
> 
> for line in fileIn:
>     while line != 'blank_line':
>         if line == 'foo':
>             Somehow re-initialise myList
>             break
>         else:
>             myList.append(line)
>     fileOut.writelines(myList)

 


Rather than using a while, you can use an if statement with the isspace method
(and join your conditions with an 'and'):

if not line.isspace() and not line == 'foo':
    fileOut.write(line)

then you can get rid of the whole myList. Based on what you've written,
there's really no need to have a list, it's more efficient to just write the
line straight to the file.

for the renaming part, take a look at the shutil module.

HTH,
Wayne


Re: [Tutor] Introduction - log exercise

2009-11-17 Thread Nick Stinemates
> I will read lines from file, with the 'for loop', and then I will check them 
> for
> 'foo' matches with a 'while loop', if matches I (somehow) re-initialise the
> list, and if there is no matches for foo, I will append line to the list. 
> When I
> get to a blank line (end of block), write myList to an external file. And 
> start

Can you please explain what you mean by _re-initialize the list_ ?

> with another line.
> 
> I am stuck with defining 'blank line', I don't manage to get throught the 
> while
> loop, any hint here I will really appreciate it.
> I don't expect the solution, as I think this is a great exercise to get wet
> with python, but if anyone thinks that this is the wrong way of solving the
> problem, please let me know.
> 
> 

Can you explain why the following won't work?


#!/usr/bin/python
 
import sys
import gzip
# At the moment not bother with argument part as I am testing it with a
# testing log file
#fileIn = gzip.open(sys.argv[1])

fileIn = gzip.open('big_log_file.gz', 'r')
fileOut = open('outputFile', 'a')

for line in fileIn:
    while line != 'blank_line':
        if 'foo' not in line:
            fileOut.write(line)
> 
> 
> Somehow rename outputFile with big_log_file.gz
> 
> fileIn.close()
> fileOut.close()
> 
> -
> 
> The log file will be fill with:
> 
> 
> Tue Nov 17 16:11:47 GMT 2009
>   bladi bladi bla
>   tarila ri la
>   patatin pataton
>   tatati tatata
> 
> Tue Nov 17 16:12:58 GMT 2009
>   bladi bladi bla
>   tarila ri la
>   patatin pataton
>   foo
>   tatati tatata
> 
> Tue Nov 17 16:13:42 GMT 2009
>   bladi bladi bla
>   tarila ri la
>   patatin pataton
>   tatati tatata
> 
> 
> etc, etc ,etc
> ..
> 
> Again, thank you.
> 
> -- 
> -
> Antonio de la Fuente Martínez
> E-mail: t...@muybien.org
> -
> 


[Tutor] Processing rows from CSV

2009-11-17 Thread mhw
Dear Tutors,

A rather general question, I'm afraid. I have found myself writing some python 
code to handle some CSV data, using the csv.DictReader that generates a dict 
for each row with the key as the column heading and the value in the file as 
the item. Most operations involve code of the form:

for row in reader:
    if row['foo'] == 'something':
        do this etc.

Typically I'm checking some form of consistency, or adding an element to the 
row based on something in the row.

I know about the existence of awk/sed etc., which could do some of this, but I 
ran into some problems with date manipulation etc. that they don't seem to 
handle very well.

I wanted to ask if anyone knew of anything similar, as I'm worried about 
re-inventing the wheel. One option would be to use (e.g.) sqlite and then use 
select/ insert commands, but personally I'd much rather write conditions in 
python than sql.

My other question was about how I do this more efficiently: I don't want to 
read the whole thing into memory at once, but (AFAIK) csv.DictReader doesn't 
support something like readline() to deliver it one at a time.

Any comments gratefully received.

Matt


Re: [Tutor] opening a file directly from memory

2009-11-17 Thread mjekl

Alan Gauld wrote:

 wrote
Yes. My program knows. A database column stores the complete file  
name  (including extension), and I can be certain the applications  
will be  available to run the file.


You still haven't answered the question.

We have established that
a) The OS knows what program is associated with the file
b) your program knows the file name and location

But does your program - or more specifically you as the author -  
know what program the OS will use to open the file?


Yes. My program knows which program to use (that will be installed).
If you do you can call it explicitly, but if you do not then you  
need to find a way of getting the OS to tell you, or to leave it to  
the OS.
I'm interested in this for the sake of generalizing (which is
better). How can I get the OS to tell me which program to use? Or
alternatively, how do I tell the OS to open it, assuming that since the
OS knows which program to use it will just use it (perhaps too big an
assumption ;-)


For example a txt file could be opened by any of hundreds of text  
editors, depending on what the user selected, but a Photoshop file  
will typically only be opened by photoshop.
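
On the "leave it to the OS" option, a rough sketch. It relies on os.startfile (available on Windows only), the macOS "open" command and the "xdg-open" command found on most Linux desktops. The function names launcher_argv and open_with_default_app are made up for this example, and whether a handler is actually registered for a given file type is up to the machine. Written for Python 3.

```python
import os
import subprocess
import sys


def launcher_argv(path, platform=sys.platform):
    """Return the command asking the desktop to open path, or None on Windows."""
    if platform.startswith("win"):
        return None                  # Windows: use os.startfile instead
    if platform == "darwin":
        return ["open", path]        # macOS
    return ["xdg-open", path]        # most Linux desktops (assumption)


def open_with_default_app(path):
    """Open path with whatever application the OS associates with it."""
    argv = launcher_argv(path)
    if argv is None:
        os.startfile(path)           # only exists on Windows
    else:
        subprocess.call(argv)
```

In all three cases the program never learns which application was chosen; it only hands the file name to the OS.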


HTH,


Txs,
Miguel





Re: [Tutor] Introduction - log exercise

2009-11-17 Thread Antonio de la Fuente
* Wayne Werner  [2009-11-17 11:41:25 -0600]:

> On Tue, Nov 17, 2009 at 10:58 AM, Antonio de la Fuente 
> wrote:
> 
> > Hi everybody,
> >
> > This is my first post here. I have started learning python and I am new to
> > programing, just some bash scripting, no much.
> > Thank you for the kind support and help that you provide in this list.
> >
> 
> You're welcome!
> 
> 
> >
> > This is my problem: I've got a log file that is filling up very quickly,
> > this
> > log file is made of blocks separated by a blank line, inside these blocks
> > there
> > is a line "foo", I want to discard blocks with that line inside it, and
> > create a
> > new log file, without those blocks, that will reduce drastically the size
> > of the
> > log file.
> >
[...] 
> 
> Let me start out by saying this; I'm very impressed with the thought you've
> put into the problem and the way you've presented it.
> 

Thank you, it took me some time, but it did help me to understand
better the problem.

> The first thing that pops into my mind is to simply strip whitespace from
> the line and check if the line == ''. Upon further experiment there's the
> "isspace" method:
> 
> In [24]: x = '   \n\n\r\t\t'
> 
> In [25]: x.isspace()
> Out[25]: True
> 
> x contains a bunch of spaces, newlines, and tab chars. From the docs:
> 
> " Return True if all characters in S are whitespace
> and there is at least one character in S, False otherwise."
> 

So, I could use it like this:
while not line.isspace():
    if line == 'foo':
        Somehow re-initialise myList
        break
    [and the rest]

> 
> 
> > I don't expect the solution, as I think this is a great exercise to get wet
> > with python, but if anyone thinks that this is the wrong way of solving the
> > problem, please let me know.
> > 
> > for line in fileIn:
> >     while line != 'blank_line':
> >         if line == 'foo':
> >             Somehow re-initialise myList
> >             break
> >         else:
> >             myList.append(line)
> >     fileOut.writelines(myList)
> 
>  
> 
> 
> Rather than using a while, you can use an if statement with the isspace method
> (and join your conditions with an 'and'):
> 
> if not line.isspace() and not line == 'foo':
>     fileOut.write(line)
> 
But then, the new log file will have all the blocks, even the ones that
had 'foo' in them, even if the foo lines weren't there anymore. No? Or
is there anything that I don't get?

> then you can get rid of the whole myList. Based on what you've written,
> there's really no need to have a list, it's more efficient to just write the
> line straight to the file.
My idea of using a list was that I can put the block in the list, and
if the block does not contain a line with 'foo' then write it to file,
otherwise discard the block.
> 
> for the renaming part, take a look at the shutil module.
> 
> HTH,
> Wayne
Thank you Wayne.
-- 
-
Antonio de la Fuente Martínez
E-mail: t...@muybien.org
-

God looks at clean hands, not full ones.
-- Publilius Syrus.


Re: [Tutor] Introduction - log exercise

2009-11-17 Thread bob gailer

Antonio de la Fuente wrote:

Hi everybody,

This is my first post here. I have started learning python and I am new to
programing, just some bash scripting, no much. 
Thank you for the kind support and help that you provide in this list.


This is my problem: I've got a log file that is filling up very quickly, this
log file is made of blocks separated by a blank line, inside these blocks there
is a line "foo", I want to discard blocks with that line inside it, and create a
new log file, without those blocks, that will reduce drastically the size of the
log file. 


The log file is gziped, so I am going to use gzip module, and I am going to pass
the log file as an argument, so sys module is required as well.

I will read lines from file, with the 'for loop', and then I will check them for
'foo' matches with a 'while loop', if matches I (somehow) re-initialise the
list, and if there is no matches for foo, I will append line to the list. When I
get to a blank line (end of block), write myList to an external file. And start
with another line.

I am stuck with defining 'blank line', I don't manage to get throught the while
loop, any hint here I will really appreciate it.
I don't expect the solution, as I think this is a great exercise to get wet
with python, but if anyone thinks that this is the wrong way of solving the
problem, please let me know.


#!/usr/bin/python

import sys
import gzip

myList = []

# At the moment not bother with argument part as I am testing it with a
# testing log file
#fileIn = gzip.open(sys.argv[1])

fileIn = gzip.open('big_log_file.gz', 'r')
fileOut = open('outputFile', 'a')

for line in fileIn:
    while line != 'blank_line':
        if line == 'foo':
            Somehow re-initialise myList
            break
        else:
            myList.append(line)
    fileOut.writelines(myList)
  

Observations:
0 - The other responses did not understand your desire to drop any  
paragraph containing 'foo'.

1 - The while loop will run forever, as it keeps processing the same line.
2 - In your sample log file the line with 'foo' starts with a tab. line 
== 'foo' will always be false.

3 - Is the first line in the file Tue Nov 17 16:11:47 GMT 2009 or blank?
4 - Is the last line blank?

Better logic:

# open files
paragraph = []
keep = True
for line in fileIn:
    if line.isspace():  # end of paragraph
        if keep:
            fileOut.writelines(paragraph)
        paragraph = []
        keep = True
    else:
        if keep:
            # strip() removes the leading tab and the trailing newline
            if line.strip() == 'foo':
                keep = False
            else:
                paragraph.append(line)
# anticipating last line not blank, write last paragraph
if keep:
    fileOut.writelines(paragraph)

# use shutil to rename
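
The logic above can be fleshed out into a complete Python 3 sketch. Several things here are assumptions and not taken from the thread: the function names, writing the result back as a gzipped file, keeping the blank separator lines of the surviving blocks, and using a temporary file that replaces the original only once the filtering has finished.

```python
import gzip
import os
import shutil
import tempfile


def keep_paragraphs(lines, marker="foo"):
    """Yield the lines of every blank-line-separated block lacking `marker`."""
    block, keep = [], True
    for line in lines:
        if line.isspace():               # a blank line closes the block
            if keep:
                yield from block
                yield line               # keep the separator too
            block, keep = [], True
        else:
            block.append(line)
            if line.strip() == marker:   # ignore leading tab and newline
                keep = False
    if keep:                             # last block may lack a blank line
        yield from block


def filter_log(path, marker="foo"):
    """Rewrite the gzipped log at `path` without the blocks holding `marker`."""
    fd, tmp_path = tempfile.mkstemp(suffix=".gz",
                                    dir=os.path.dirname(path) or ".")
    os.close(fd)
    with gzip.open(path, "rt") as src, gzip.open(tmp_path, "wt") as dst:
        dst.writelines(keep_paragraphs(src, marker))
    shutil.move(tmp_path, path)          # replace the original at the end
```

Splitting the block filter from the file handling makes the filter easy to try on a plain list of strings before pointing it at the real log.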


--
Bob Gailer
Chapel Hill NC
919-636-4239


Re: [Tutor] Introduction - log exercise

2009-11-17 Thread Wayne Werner
On Tue, Nov 17, 2009 at 2:23 PM, Antonio de la Fuente wrote:

> But then, the new log file will have all the blocks, even the ones that
> had 'foo' on it, even if the foo lines weren't there anymore. No? or
> is there anything that I don't get?
>

Ah yes, I forgot about that part.

So you should probably keep the list.

-Wayne

-- 
To be considered stupid and to be told so is more painful than being called
gluttonous, mendacious, violent, lascivious, lazy, cowardly: every weakness,
every vice, has found its defenders, its rhetoric, its ennoblement and
exaltation, but stupidity hasn’t. - Primo Levi


Re: [Tutor] Introduction - log exercise

2009-11-17 Thread Antonio de la Fuente
* Nick Stinemates  [2009-11-17 13:30:51 -0500]:

> > I will read lines from file, with the 'for loop', and then I will check 
> > them for
> > 'foo' matches with a 'while loop', if matches I (somehow) re-initialise the
> > list, and if there is no matches for foo, I will append line to the list. 
> > When I
> > get to a blank line (end of block), write myList to an external file. And 
> > start
> 
> Can you please explain what you mean by _re-initialize the list_ ?
>

By re-initialising the list I meant getting rid of the lines that
have already been put into it once a line with 'foo' is found; if not,
they will contaminate the new log file with lines from blocks that had
'foo' lines.

> > with another line.
> > 
> > I am stuck with defining 'blank line', I don't manage to get throught the 
> > while
> > loop, any hint here I will really appreciate it.
> > I don't expect the solution, as I think this is a great exercise to get wet
> > with python, but if anyone thinks that this is the wrong way of solving the
> > problem, please let me know.
> > 
> > 
> 
> Can you explain why the following won't work?
>

Because I don't know how to define a blank line, that will allow me to
differentiate between blocks.

> 
> #!/usr/bin/python
>  
> import sys
> import gzip
> # At the moment not bother with argument part as I am testing it with a
> # testing log file
> #fileIn = gzip.open(sys.argv[1])
> 
> fileIn = gzip.open('big_log_file.gz', 'r')
> fileOut = open('outputFile', 'a')
> 
> for line in fileIn:
>     while line != 'blank_line':
>         if 'foo' not in line:
>             fileOut.write(line)
[...] 

-- 
-
Antonio de la Fuente Martínez
E-mail: t...@muybien.org
-

He who lies down with children gets up wet.


Re: [Tutor] Processing rows from CSV

2009-11-17 Thread Kent Johnson
On Tue, Nov 17, 2009 at 1:57 PM,   wrote:

> A rather general question, I'm afraid. I have found myself writing some 
> python code to handle some CSV data, using the csv.DictReader that generates 
> a dict for each row with the key as the column heading and the value in the 
> file as the item. Most operations involve code of the form: (Apologies for 
> incorrect caps)
>
> For row in reader:
>    If row['foo'] == 'something' :
>        do this etc.

> My other question was about how I do this more efficiently: I don't want to 
> read the whole thing into memory at once, but (AFAIK) csv.DictReader doesn't 
> support something like readline() to deliver it one at a time.

The loop above does deliver one line worth of data at a time.
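To illustrate the point, here is a minimal sketch (Python 3) showing that csv.DictReader is an iterator that hands back one row at a time. The column names and the in-memory sample data are made up for the example; a real script would pass an open file instead of io.StringIO.

```python
import csv
import io

# Made-up sample data standing in for a real CSV file.
sample = "foo,bar\nsomething,1\nother,2\nsomething,3\n"

# Looping over the reader pulls one row from the file per iteration.
reader = csv.DictReader(io.StringIO(sample))
matches = []
for row in reader:
    if row["foo"] == "something":
        matches.append(row["bar"])

# next() asks for exactly one row, much like readline() on a file.
reader2 = csv.DictReader(io.StringIO(sample))
first = next(reader2)
```

Only the current row is held in memory unless the code itself stores rows, as the matches list does here.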

Kent


Re: [Tutor] Introduction - log exercise

2009-11-17 Thread Dave Angel

Antonio de la Fuente wrote:

Hi everybody,

This is my first post here. I have started learning python and I am new to
programing, just some bash scripting, no much. 
Thank you for the kind support and help that you provide in this list.


This is my problem: I've got a log file that is filling up very quickly, this
log file is made of blocks separated by a blank line, inside these blocks there
is a line "foo", I want to discard blocks with that line inside it, and create a
new log file, without those blocks, that will reduce drastically the size of the
log file. 


The log file is gziped, so I am going to use gzip module, and I am going to pass
the log file as an argument, so sys module is required as well.

I will read lines from file, with the 'for loop', and then I will check them for
'foo' matches with a 'while loop', if matches I (somehow) re-initialise the
list, and if there is no matches for foo, I will append line to the list. When I
get to a blank line (end of block), write myList to an external file. And start
with another line.

I am stuck with defining 'blank line', I don't manage to get throught the while
loop, any hint here I will really appreciate it.
I don't expect the solution, as I think this is a great exercise to get wet
with python, but if anyone thinks that this is the wrong way of solving the
problem, please let me know.


#!/usr/bin/python

import sys
import gzip

myList = []

# At the moment not bother with argument part as I am testing it with a
# testing log file
#fileIn = gzip.open(sys.argv[1])

fileIn = gzip.open('big_log_file.gz', 'r')
fileOut = open('outputFile', 'a')

for line in fileIn:
    while line != 'blank_line':
        if line == 'foo':
            Somehow re-initialise myList
            break
        else:
            myList.append(line)
    fileOut.writelines(myList)


Somehow rename outputFile with big_log_file.gz

fileIn.close()
fileOut.close()

-

The log file will be fill with:


Tue Nov 17 16:11:47 GMT 2009
bladi bladi bla
tarila ri la
patatin pataton
tatati tatata

Tue Nov 17 16:12:58 GMT 2009
bladi bladi bla
tarila ri la
patatin pataton
foo
tatati tatata

Tue Nov 17 16:13:42 GMT 2009
bladi bladi bla
tarila ri la
patatin pataton
tatati tatata


etc, etc ,etc
..

Again, thank you.

  
You've got some good ideas, and I'm going to give you hints, rather than 
just writing it for you, as you suggested.


First, let me point out that there are advanced features in Python that 
could make a simple program that'd be very hard for a beginner to 
understand.  I'll give you the words, but recommend that you not try it 
at this time.  If you were to wrap the file in a generator that returned 
you a "paragraph" at a time, the same way as it's now returning a line 
at a time, then the loop would be simply a for-loop on that generator, 
checking each paragraph for whether it contained "foo" and, if not, 
writing it to the output.
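For readers who do want to see the generator idea, a minimal sketch follows (Python 3). The names paragraphs and drop_blocks are invented for this illustration, and it assumes blocks are separated by whitespace-only lines as described in the original post.

```python
def paragraphs(lines):
    """Yield lists of lines, one list per blank-line-separated block."""
    block = []
    for line in lines:
        if line.isspace():      # blank line: the current block is complete
            if block:
                yield block
            block = []
        else:
            block.append(line)
    if block:                   # input did not end with a blank line
        yield block


def drop_blocks(lines, marker="foo"):
    """Yield the lines of every block that has no line equal to marker."""
    for block in paragraphs(lines):
        if not any(line.strip() == marker for line in block):
            for line in block:
                yield line
            yield "\n"          # put the separator back after each kept block
```

Any iterable of lines works as input, so an open file object can be passed directly and the result handed to writelines() on the output file.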



But you can also do it without using advanced features, and that's what 
I'm going to try to outline.


Two things you'll be testing each line for:  is it blank, and is it "foo".
  if line.isspace()  will test if a line is whitespace only, as Wayne 
pointed out.
  if line == "foo" will test if a line has exactly "foo" in it.  But 
since you apparently have leading whitespace and trailing newlines, and 
if they're irrelevant, then you might want
  if line.strip() == "foo"
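A small sketch (Python 3) of how these tests behave on the kinds of lines in the sample log. The function name classify and the sample strings are invented for the example.

```python
def classify(line):
    """Return 'blank', 'foo' or 'other' for one line read from the log."""
    if line.isspace():              # only spaces, tabs and/or a newline
        return "blank"
    if line.strip() == "foo":       # ignore leading tab and trailing newline
        return "foo"
    return "other"


samples = ["\n", "   \t\n", "\tfoo\n", "foo", "\tfoo bar\n"]
results = [classify(s) for s in samples]

# A plain comparison fails on a line as it comes from the file:
plain_match = ("\tfoo\n" == "foo")
```

The plain comparison is False because the line still carries its tab and newline, which is why strip() is needed here.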

I would start by just testing for blank lines.   Try replacing all blank 
lines with "* blank line "  and print each line. See whether the output 
makes sense.  If it does, go on to the next step.

  for line in 
      if line-is-blank
          line-is-fancy-replacement
      print line

Now, instead of just printing the line, add it to a list object.  Create 
an object called paragraph (rather than a file) as an empty list object, 
before the for loop.
Inside the for loop, if the line is non-empty, add it to the paragraph.  
If the line is empty, then print the paragraph (with something before 
and after it, so you can see what came from each print stmt).  Then 
blank it (paragraph = []).

Check whether this result looks good, and if so, continue on.

Next version of the code: whenever you have a non-blank line, in 
addition to adding it to the list, also check it for whether it's 
equal-foo.
If so, set a flag.  When printing the paragraph, skip the printing if the 
flag is set.  Remember that you'll have to clear this flag each time you 
blank the paragraph, both before the loop, and in the middle of the loop.
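Put together, the steps so far might look like the sketch below (Python 3). It is one possible reading of the hints, not the only way to do it; the names filter_log and has_marker are invented here, and the result is collected in a list so it can be inspected before any file is written.

```python
def filter_log(lines, marker="foo"):
    """Return the lines of all paragraphs that contain no marker line."""
    kept = []
    paragraph = []
    has_marker = False              # the flag, cleared before the loop
    for line in lines:
        if line.isspace():          # blank line ends the current paragraph
            if not has_marker:
                kept.extend(paragraph)
                kept.append(line)   # keep the separating blank line too
            paragraph = []
            has_marker = False      # cleared again for the next paragraph
        else:
            paragraph.append(line)
            if line.strip() == marker:
                has_marker = True
    if paragraph and not has_marker:    # last paragraph without blank line
        kept.extend(paragraph)
    return kept
```

Once the returned list looks right when printed, it can be passed to writelines() on the real output file.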

Once this makes sense, you can worry about actually writing the output 
to a real file, maybe compressing it, maybe doing deletes and renames
as appropriate.   You probably don't need shutil module,  os module 
probably has enough functions for this.
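For that last step, a minimal sketch (Python 3) of writing a compressed copy and then renaming it over the original using only gzip and os. The function name rewrite_gzip, the ".tmp" suffix and the demo file are assumptions made for the example, and for brevity it filters single lines rather than whole blocks.

```python
import gzip
import os
import tempfile


def rewrite_gzip(path, keep_line):
    """Rewrite the gzip file at path, keeping lines where keep_line is true."""
    tmp_path = path + ".tmp"
    with gzip.open(path, "rt") as src, gzip.open(tmp_path, "wt") as dst:
        for line in src:
            if keep_line(line):
                dst.write(line)
    os.replace(tmp_path, path)      # rename the new file over the old one


# Demo on a throwaway file in a temporary directory.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "big_log_file.gz")
with gzip.open(demo_path, "wt") as f:
    f.write("keep me\n\tfoo\nkeep me too\n")

rewrite_gzip(demo_path, lambda line: line.strip() != "foo")

with gzip.open(demo_path, "rt") as f:
    result = f.read()
```

Writing to a temporary name first means the original log is only replaced after the new file has been written completely.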

Re: [Tutor] Do you use unit testing?

2009-11-17 Thread Kent Johnson
On Mon, Nov 16, 2009 at 3:54 PM, Modulok  wrote:
> List,
>
> A general question:
>
> How many of you guys use unit testing as a development model, or at
> all for that matter?
>
>  I just starting messing around with it and it seems painfully slow to
> have to write a test for everything you do. Thoughts, experiences,
> pros, cons?

In my opinion TDD is awesome :-) Why?

- It gives you high confidence that your code does what you think it
does, and that you can rely on it
- It can be a lot of fun to write a test, then make it pass,
repeat...each time you make the test pass is a little pat on the back
- good work!
- It forces you to think as a client of the code that you are writing
- It puts pressure on your design, to write code that is
  - testable (duh)
  - loosely coupled
  - reusable (to some extent), because you start with two clients

That's just the benefit when you initially write the code. Just as
important is the benefit downstream, when you want to make a change.
Having a good test suite allows you to make changes to the code (for
example refactoring) with confidence that if you break anything, you
will find out quickly.

This is a profound shift. On projects without tests, there tends to be
code whose function is not clear, or whose structure is tangled, which
no one wants to touch for fear of breaking something. Generally there
is an undercurrent of fear of breakage that promotes code rot.

On projects with tests, fear is replaced with confidence. Messes can
be cleaned up, dead code stripped out, etc. So tests promote a healthy
codebase over time.

There is a cost but it is smaller than it looks from the outside. Yes,
you have to learn to use a test framework, but that is a one-time
cost. I use unittest because it follows the xUnit pattern so it is
familiar across Python, Java, C#, etc. There are alternatives such as
doctest, nose and py.test that some people prefer. I have written a
brief intro to unittest here, with links to comparison articles:
http://personalpages.tds.net/~kent37/kk/00014.html

You will occasionally have to stop and figure out how to test
something new and perhaps write some test scaffolding. That time will
pay off hugely as you write tests.

You have to write the tests, but what were you doing before? How do
you know your code works? You must be doing some kind of tests. If you
write your tests as automated unit tests they are preserved and useful
in the future, and probably more comprehensive than what you would do
by hand. If you test by hand, your tests are lost when you finish.
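To make the cost concrete, here is a minimal sketch of an automated test using the standard unittest module. The function is_blank and the test names are hypothetical, invented for illustration; the point is only that a check you might otherwise do by hand at the prompt is written down once and can be re-run at any time.

```python
import unittest


def is_blank(line):
    """Hypothetical function under test: true for whitespace-only lines."""
    return line.isspace()


class IsBlankTest(unittest.TestCase):
    def test_newline_only(self):
        self.assertTrue(is_blank("\n"))

    def test_tabs_and_spaces(self):
        self.assertTrue(is_blank(" \t \n"))

    def test_text_is_not_blank(self):
        self.assertFalse(is_blank("\tfoo\n"))


def run_tests():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(IsBlankTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

In a standalone test file the usual idiom is to call unittest.main() under an `if __name__ == "__main__":` guard instead of a helper like run_tests.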

Finally I confess that GUI unit tests are very difficult. I have often
omitted them, instead trying to write a thin GUI over a testable layer
of functionality. In my current (WinForms, C#) project we have found a
workable way to write GUI tests using NUnitForms so in the future
maybe I will be writing more GUI unit tests.

More advocacy:
http://c2.com/cgi/wiki?TestDrivenDevelopment
http://c2.com/cgi/wiki?CodeUnitTestFirst

Kent


Re: [Tutor] opening a file directly from memory

2009-11-17 Thread Alan Gauld
 wrote 

If you do you can call it explicitly, but if you do not then you  
need to find a way of getting the OS to tell you, or to leave it to  
the OS.


I'm interested in this for the sake of generalizing (which is  
better). How can I get the OS to tell me which program to use? 
Alternatively, how do I tell the OS to open it - assuming that since the  
OS knows which program to use it will just use it?


This is where it gets messy.

The solution depends on the OS. If it's Windows you can use os.startfile().
If it's MacOS you can interrogate the package manifest.
If it's another Unix you can use any of several options depending on the 
flavour. The simplest is, I think, the file command; then, if it's a 
text file, you can check the shebang line at the top of the file. But 
modern Unices, like Linux, have file association tables - but these 
are often associated with the desktop environment - KDE, Gnome etc.
Finally for text files you should check the EDITOR and VISUAL 
environment variables - although these are increasingly not 
used or respected nowadays.


So you could write a program that checked the OS and then tried 
all of these options to identify the host application. But even then 
you are not guaranteed to succeed!


Finally you can try just running the file via os.system or the 
subprocess module and see what happens!
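A rough sketch (Python 3) of the "check the OS, then hand the file over" approach. The function names are invented for the example. It assumes the `open` command on MacOS and `xdg-open` on other Unix desktops, which are common but not guaranteed to be installed, and os.startfile only exists on Windows.

```python
import os
import subprocess
import sys


def opener_command(path, platform=sys.platform):
    """Return the command that asks the OS to open path, or None on Windows."""
    if platform.startswith("win"):
        return None                     # Windows uses os.startfile instead
    if platform == "darwin":
        return ["open", path]           # MacOS
    return ["xdg-open", path]           # assumption: a freedesktop-style desktop


def open_with_default_app(path):
    """Ask the OS to open path with whatever program it associates with it."""
    command = opener_command(path)
    if command is None:
        os.startfile(path)              # only available on Windows
    else:
        subprocess.call(command)
```

Keeping the decision in opener_command makes it easy to check what would be run without actually launching anything.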


But there is no foolproof way of doing it on every OS. That's why 
it's easier if you either know what app to use or create a 
config file such that the user can specify the app at install time.
On Unix that would traditionally be called .myapprc and be 
stored in each user's home directory.


On Windows it would either be a registry entry or a myapp.ini file,
usually stored in the Windows directory or (better IMHO but against 
MS guidelines) in the app directory.


HTH,

--
Alan Gauld
Author of the Learn to Program web site
http://www.alan-g.me.uk/



Re: [Tutor] opening a file directly from memory

2009-11-17 Thread Alan Plum
Dammit. Meant to send this over the list. Sorry, Alan.

On Di, 2009-11-17 at 21:33 +, Alan Gauld wrote:
> Unices, like Linux have file association tables - but these 
> are often associated with the desktop environment - KDE, Gnome etc.
> Finally for text files you should check the EDITOR and VISUAL 
> environment variables - although these are increasingly not 
> used or respected nowadays.

I find mimemagic to be quite reliable. File can be silly at times (it
routinely diagnoses my python scripts as Java source files). In my
gopher server I rely on mimemagic to make an educated guess and only use
file(1) as a last resort to find out whether it's a binary file or ASCII
text (some gopher clients don't support UTF8).
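As a portable point of comparison, the standard library's mimetypes module can make a similar educated guess. Note that this is a different technique from mimemagic: it looks only at the file name's extension, never at the content. A minimal sketch (Python 3), with the function name guess_kind invented for the example:

```python
import mimetypes


def guess_kind(filename):
    """Classify a file name as 'text', 'other' or 'unknown' by extension."""
    mime, _encoding = mimetypes.guess_type(filename)
    if mime is None:
        return "unknown"                # no extension, or one it does not know
    return "text" if mime.startswith("text/") else "other"
```

A content-based check would still be needed as a fallback for the 'unknown' case.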

I'm not sure whether mimemagic is also available on other OSes, tho.


Cheers

Alan




Re: [Tutor] opening a file directly from memory

2009-11-17 Thread Kent Johnson
On Tue, Nov 17, 2009 at 1:37 PM,   wrote:
>> Alan Gauld wrote:
>>>
>>>  wrote

>>> If you do you can call it explicitly, but if you do not then you need to
>>> find a way of getting the OS to tell you, or to leave it to the OS.
>
> I'm interested in this for the sake of generalizing (which is better). How
> can I get the OS to tell me which program to use. Or alternatively, how to
> tell the OS to open it - assuming that since the os knows which program to
> use it will just use it (perhaps too big an assumption ;-)

The "desktop" package provides a portable open() function:
http://pypi.python.org/pypi/desktop

It came out of this discussion (extensive, with links):
http://bugs.python.org/issue1301512

Kent


Re: [Tutor] Introduction - log exercise

2009-11-17 Thread Alan Gauld


"Antonio de la Fuente"  wrote 


> if not line.isspace() and not line == 'foo':
>     fileOut.write(line)

But then, the new log file will have all the blocks, even the ones that
had 'foo' on it, even if the foo lines weren't there anymore. No? or
is there anything that I don't get?


I think the test should be:

if not line.isspace() and 'foo' not in line:
    fileOut.write(line)

HTH,


--
Alan Gauld
Author of the Learn to Program web site
http://www.alan-g.me.uk/



Re: [Tutor] Introduction - log exercise

2009-11-17 Thread Antonio de la Fuente
* bob gailer  [2009-11-17 15:26:20 -0500]:

> Date: Tue, 17 Nov 2009 15:26:20 -0500
> From: bob gailer 
> To: Antonio de la Fuente 
> CC: Python Tutor mailing list 
> Subject: Re: [Tutor] Introduction - log exercise
> User-Agent: Thunderbird 2.0.0.23 (Windows/20090812)
> Message-ID: <4b0306ec.8000...@gmail.com>
> 
> Antonio de la Fuente wrote:
> >Hi everybody,
> >
> >This is my first post here. I have started learning python and I am new to
> >programing, just some bash scripting, no much. Thank you for the
> >kind support and help that you provide in this list.
> >
> >This is my problem: I've got a log file that is filling up very quickly, this
> >log file is made of blocks separated by a blank line, inside these blocks 
> >there
> >is a line "foo", I want to discard blocks with that line inside it, and 
> >create a
> >new log file, without those blocks, that will reduce drastically the size of 
> >the
> >log file.
> >
> >The log file is gziped, so I am going to use gzip module, and I am going to 
> >pass
> >the log file as an argument, so sys module is required as well.
> >
> >I will read lines from file, with the 'for loop', and then I will check them 
> >for
> >'foo' matches with a 'while loop', if matches I (somehow) re-initialise the
> >list, and if there is no matches for foo, I will append line to the list. 
> >When I
> >get to a blank line (end of block), write myList to an external file. And 
> >start
> >with another line.
> >
> >I am stuck with defining 'blank line', I don't manage to get throught the 
> >while
> >loop, any hint here I will really appreciate it.
> >I don't expect the solution, as I think this is a great exercise to get wet
> >with python, but if anyone thinks that this is the wrong way of solving the
> >problem, please let me know.
> >
> >
> >#!/usr/bin/python
> >
> >import sys
> >import gzip
> >
> >myList = []
> >
> ># At the moment not bother with argument part as I am testing it with a
> ># testing log file
> >#fileIn = gzip.open(sys.argv[1])
> >
> >fileIn = gzip.open('big_log_file.gz', 'r')
> >fileOut = open('outputFile', 'a')
> >
> >for line in fileIn:
> >    while line != 'blank_line':
> >        if line == 'foo':
> >            Somehow re-initialise myList
> >            break
> >        else:
> >            myList.append(line)
> >    fileOut.writelines(myList)
> Observations:
> 0 - The other responses did not understand your desire to drop any
> paragraph containing 'foo'.

Yes, paragraph == block, that's it

> 1 - The while loop will run forever, as it keeps processing the same line.

Because the tabs in the line with foo?!

> 2 - In your sample log file the line with 'foo' starts with a tab.
> line == 'foo' will always be false.

So I need first to get rid of those tabs, right? I can do that with
line.strip(), but then I need the same formatting for the fileOut.

> 3 - Is the first line in the file Tue Nov 17 16:11:47 GMT 2009 or blank?

First line is Tue Nov 17 16:11:47 GMT 2009

> 4 - Is the last line blank?

last line is blank.

> 
> Better logic:
> 
I would have never thought this way of solving the problem. Interesting.
> # open files
> paragraph = []
> keep = True
> for line in fileIn:
>   if line.isspace(): # end of paragraph

Aha! finding the blank line

>     if keep:
>       outFile.writelines(paragraph)
>     paragraph = []

This is what I called re-initialising the list.

>     keep = True
>   else:
>     if keep:
>       if line.strip() == 'foo':
>         keep = False
>       else:
>         paragraph.append(line)
> # anticipating last line not blank, write last paragraph
> if keep:
>   outFile.writelines(paragraph)
> 
> # use shutil to rename
> 
Thank you.

> 
> -- 
> Bob Gailer
> Chapel Hill NC
> 919-636-4239

-- 
-
Antonio de la Fuente Martínez
E-mail: t...@muybien.org
-

The problem with people who have no vices is that generally you can
be pretty sure they're going to have some pretty annoying virtues.
-- Elizabeth Taylor


Re: [Tutor] Introduction - log exercise

2009-11-17 Thread Antonio de la Fuente
* Dave Angel  [2009-11-17 16:30:43 -0500]:

> Date: Tue, 17 Nov 2009 16:30:43 -0500
> From: Dave Angel 
> To: Antonio de la Fuente 
> CC: Python Tutor mailing list 
> Subject: Re: [Tutor] Introduction - log exercise
> User-Agent: Thunderbird 2.0.0.23 (Windows/20090812)
> Message-ID: <4b031603.4000...@ieee.org>
> 
> Antonio de la Fuente wrote:
> >Hi everybody,
> >

[...] 

> >This is my problem: I've got a log file that is filling up very quickly, this
> >log file is made of blocks separated by a blank line, inside these blocks 
> >there
> >is a line "foo", I want to discard blocks with that line inside it, and 
> >create a
> >new log file, without those blocks, that will reduce drastically the size of 
> >the
> >log file.
> >

[...] 

> You've got some good ideas, and I'm going to give you hints, rather
> than just writing it for you, as you suggested.
> 
Much appreciated, really.

> First, let me point out that there are advanced features in Python
> that could make a simple program that'd be very hard for a beginner
> to understand.  I'll give you the words, but recommend that you not
> try it at this time.  If you were to wrap the file in a generator
> that returned you a "paragraph" at a time, the same way as it's now
> returning a line at a time, then the loop would be simply a for-loop
> on that generator, checking each paragraph for whether it contained
> "foo" and if so, writing it to the output.
> 
> 
> But you can also do it without using advanced features, and that's
> what I'm going to try to outline.
> 

[...] 

> Inside the for loop, if the line is non-empty, add it to the
> paragraph.  If the line is empty, then print the paragraph (with
> something before and after it,
> so you can see what came from each print stmt).  Then blank it
> (outlist = []).
> Check whether this result looks good, and if so, continue on.

for line in fileIn:
    if line.isspace():
        print "* blank line "
        print myList
        print "* fancy blank line "
        myList = []
    else:
        myList.append(line)

I think this is what I expected, but it confuses me that it is in this format:

['Tue Nov 17 16:11:47 GMT 2009\n'], '\tbladi bladi bla', '\ttarila ri la\n', 
'\tpatatin pataton\n', '\ttatati tatata\n', '\tfoo\n']
* fancy blank line 
* blank line 

with linefeeds and tabs all over. I see why everybody calls it a
paragraph.
Once I write the list to a file, will it come back in the initial
format of the file?

> 
> Next version of the code: whenever you have a non-blank line, in
> addition to adding it to the list, also check it for whether it's
> equal-foo.
> If so, set a flag.  When printing the outlist, skip the printing if
> the flag is set.  Remember that you'll have to clear this flag each
> time you blank
> the mylist, both before the loop, and in the middle of the loop.

I am a bit lost with the flags, is it what Bob Gailer was calling keep =
True, keep = False, right?

> 
> Once this makes sense, you can worry about actually writing the
> output to a real file, maybe compressing it, maybe doing deletes and
> renames
> as appropriate.   You probably don't need shutil module,  os module
> probably has enough functions for this.
> 
> At any of these stages, if you get stuck, call for help.  But your
> code will be only as complex as that stage needs, so we can find one
> bug at a time.
> 
> DaveA
> 
Thank you, it has been very helpful.

-- 
-
Antonio de la Fuente Martínez
E-mail: t...@muybien.org
-

Power, like a desolating pestilence,
Pollutes whate'er it touches...
-- Percy Bysshe Shelley


Re: [Tutor] Introduction - log exercise

2009-11-17 Thread bob gailer

Alan Gauld wrote:


"Antonio de la Fuente"  wrote

> if not line.isspace() and not line == 'foo':
>     fileOut.write(line)

But then, the new log file will have all the blocks, even the ones that
had 'foo' on it, even if the foo lines weren't there anymore. No? or
is there anything that I don't get?


I think the test should be:

if not line.isspace() and 'foo' not in line:
    fileOut.write(line)



No - that misses the objective of eliminating blocks containing 'foo'


--
Bob Gailer
Chapel Hill NC
919-636-4239


Re: [Tutor] Introduction - log exercise

2009-11-17 Thread bob gailer

Antonio de la Fuente wrote:

* bob gailer  [2009-11-17 15:26:20 -0500]:

  

Date: Tue, 17 Nov 2009 15:26:20 -0500
From: bob gailer 
To: Antonio de la Fuente 
CC: Python Tutor mailing list 
Subject: Re: [Tutor] Introduction - log exercise
User-Agent: Thunderbird 2.0.0.23 (Windows/20090812)
Message-ID: <4b0306ec.8000...@gmail.com>

Antonio de la Fuente wrote:


Hi everybody,

This is my first post here. I have started learning python and I am new to
programing, just some bash scripting, no much. Thank you for the
kind support and help that you provide in this list.

This is my problem: I've got a log file that is filling up very quickly, this
log file is made of blocks separated by a blank line, inside these blocks there
is a line "foo", I want to discard blocks with that line inside it, and create a
new log file, without those blocks, that will reduce drastically the size of the
log file.

The log file is gziped, so I am going to use gzip module, and I am going to pass
the log file as an argument, so sys module is required as well.

I will read lines from file, with the 'for loop', and then I will check them for
'foo' matches with a 'while loop', if matches I (somehow) re-initialise the
list, and if there is no matches for foo, I will append line to the list. When I
get to a blank line (end of block), write myList to an external file. And start
with another line.

I am stuck with defining 'blank line', I don't manage to get throught the while
loop, any hint here I will really appreciate it.
I don't expect the solution, as I think this is a great exercise to get wet
with python, but if anyone thinks that this is the wrong way of solving the
problem, please let me know.


#!/usr/bin/python

import sys
import gzip

myList = []

# At the moment not bother with argument part as I am testing it with a
# testing log file
#fileIn = gzip.open(sys.argv[1])

fileIn = gzip.open('big_log_file.gz', 'r')
fileOut = open('outputFile', 'a')

for line in fileIn:
    while line != 'blank_line':
        if line == 'foo':
            Somehow re-initialise myList
            break
        else:
            myList.append(line)
    fileOut.writelines(myList)
  

Observations:
0 - The other responses did not understand your desire to drop any
paragraph containing 'foo'.



Yes, paragraph == block, that's it

  

1 - The while loop will run forever, as it keeps processing the same line.



Because the tabs in the line with foo?!
  


No - because within the loop there is nothing reading the next line of 
the file!
  

2 - In your sample log file the line with 'foo' starts with a tab.
line == 'foo' will always be false.



So I need first to get rid of those tabs, right? I can do that with
line.strip(), but then I need the same formatting for the fileOut.

  

3 - Is the first line in the file Tue Nov 17 16:11:47 GMT 2009 or blank?



First line is Tue Nov 17 16:11:47 GMT 2009

  

4 - Is the last line blank?



last line is blank.

  

Better logic:



I would have never thought this way of solving the problem. Interesting.
  

# open files
paragraph = []
keep = True
for line in fileIn:
  if line.isspace(): # end of paragraph



Aha! finding the blank line

  

    if keep:
      outFile.writelines(paragraph)
    paragraph = []



This is what I called re-initialising the list.

  

    keep = True
  else:
    if keep:
      if line.strip() == 'foo':
        keep = False
      else:
        paragraph.append(line)
# anticipating last line not blank, write last paragraph
if keep:
  outFile.writelines(paragraph)

# use shutil to rename



Thank you.

  

--
Bob Gailer
Chapel Hill NC
919-636-4239



  



--
Bob Gailer
Chapel Hill NC
919-636-4239


Re: [Tutor] Introduction - log exercise

2009-11-17 Thread Dave Angel



Antonio de la Fuente wrote:

* Dave Angel  [2009-11-17 16:30:43 -0500]:

  



for line in fileIn:
    if line.isspace():
        print "* blank line "
        print myList
        print "* fancy blank line "
        myList = []
    else:
        myList.append(line)


I think this is what I expected, but it confuses me that it is in this format:

['Tue Nov 17 16:11:47 GMT 2009\n'], '\tbladi bladi bla', '\ttarila ri la\n', 
'\tpatatin pataton\n', '\ttatati tatata\n', '\tfoo\n']
* fancy blank line 
* blank line 

with linefeeds and tabs all over, I see why everybody calls it
paragraph.
Once I write the list to a file, will it come back in the initial
format of the file?

  
No, when you want to write it to the file, you'll need to loop through 
the list.  There is a shortcut, however.  If you simply do:

   sys.stdout.write("".join(myList))

it will join all the strings in the list together into one big string 
and then send it to stdout.  You can do that because each line still has 
its newline at the end.
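A quick sketch (Python 3) of why this works, using an in-memory io.StringIO as a stand-in for the real output file. The sample lines are taken from the log format shown earlier in the thread.

```python
import io

my_list = [
    "Tue Nov 17 16:11:47 GMT 2009\n",
    "\tbladi bladi bla\n",
    "\ttarila ri la\n",
]

# Each string keeps its own tab and newline, so gluing them together
# with an empty separator reproduces the original text exactly.
joined = "".join(my_list)

# writelines() does the same job without building the big string first.
out = io.StringIO()
out.writelines(my_list)
```

The '\t' and '\n' seen when printing the list are only how Python displays tabs and newlines inside strings; written to a file they become real tabs and line breaks again.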


BTW, the list you quoted above is malformed.  You really need to 
copy&paste code and data into the message, so information isn't lost.  
You have an extra right bracket and a missing \n in there.




Next version of the code: whenever you have a non-blank line, in
addition to adding it to the list, also check it for whether it's
equal-foo.
If so, set a flag.  When printing the outlist, skip the printing if
the flag is set.  Remember that you'll have to clear this flag each
time you blank
the mylist, both before the loop, and in the middle of the loop.



I am a bit lost with the flags, is it what Bob Gailer was calling keep =
True, keep = False, right?

  
You can certainly call it 'keep'.  The point is, it'll tell you whether 
to output a particular paragraph or not.

Once this makes sense, you can worry about actually writing the
output to a real file, maybe compressing it, maybe doing deletes and
renames
as appropriate.   You probably don't need shutil module,  os module
probably has enough functions for this.

At any of these stages, if you get stuck, call for help.  But your
code will be only as complex as that stage needs, so we can find one
bug at a time.

DaveA



Thank you, it has been very helpful.

  

You're very welcome.

DaveA


Re: [Tutor] proxy switcher - was Re: I love python / you guys :)

2009-11-17 Thread bibi midi
Hi guys!

Thank you all for the brainstorming. As much as I would love to follow all your
tips, I'm sorry to say I got lost in the sea of discussion :-) So I thought I
would start rolling my own, and I hope you guys can chime in.

The print lines in my code are just for debugging. Of course they can be
omitted, but for a newbie like me it helps to see that what I'm doing is what
I'm expected to do.

I would like to improve my line search with the re module but can't find
anything to get me started. I would appreciate it if you could show me how. The
'proxy' search works, but I'm sure that with the re module the search will be
more precise; I mean it won't yield 'false positive' results, sort of.

http://pastebin.ca/1675865


-- 
Best Regards,
bibimidi

Sent from Dhahran, 04, Saudi Arabia