[Tutor] Compute data usage from log
Hi all!

I have a file with a data structure derived from the wvdial log:

    Oct 14 11:03:45 cc02695 pppd[3092]: Sent 3489538 bytes, received 43317854 bytes.

I want to get the 10th field of each line and sum it over all lines (to total my net data usage). In awk you can easily pick it out using field variables, e.g. $10. Since I'm learning Python on my own I thought this would be a good way to learn, but I'm stumped on how to achieve this goal.

I used open() and the read line method to store the contents in a list, but it is a list of strings. I need it to be numbers. Iterating through the list of strings is definitely unwieldy. Can you guys advise on the best way to do this?

FWIW I'm a self-learner, relying on tutorial materials and this list to learn Python. Point me in the right direction please :-)

My IDE is vim and ipython on Debian Linux.

--
Best Regards,
bibimidi
___
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Compute data usage from log
bibi midi wrote:
> I want to get the 10th field of each line and get the sum for all lines (total my net data usage).

    fInput = open('/path/to/log.file', 'rb')
    total_usage = 0
    for line in fInput:
        total_usage += int(line.split(' ')[9].strip())
    print total_usage

That would be the simple no-frills version. What it does is iterate through the file; on each line it splits it into columns delimited by spaces, takes the 10th element (you count from 0 up and not 1 up), converts it into an integer and adds it to your total_usage counter.

Of course this has no error checking or niceties, but I will leave that up to you.

Hope that helps.

--
Kind Regards,
Christian Witts
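As a rough sketch of what the "error checking" left to the reader might look like: the version below is written for Python 3 (the thread itself uses Python 2), the function name total_received and the second sample line are made up for illustration, and it skips any line that is too short or whose 10th field is not a number. It uses split() with no argument, which also copes with runs of several spaces.

```python
def total_received(lines, field_index=9):
    """Sum the integer at field_index of each whitespace-separated line."""
    total = 0
    for line in lines:
        fields = line.split()
        try:
            total += int(fields[field_index])
        except (IndexError, ValueError):
            # Line is too short, or the field is not a number: skip it.
            continue
    return total


# The first line is the sample from the question; the second is a made-up
# line without byte counts, to show that it is ignored.
sample = [
    "Oct 14 11:03:45 cc02695 pppd[3092]: Sent 3489538 bytes, received 43317854 bytes.",
    "Oct 14 11:04:00 cc02695 pppd[3092]: Connection terminated.",
]
print(total_received(sample))
```

Passing an open file object instead of the list works the same way, since both yield one line at a time.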
Re: [Tutor] A question about the self and other stuff
"Khalid Al-Ghamdi" wrote

> class Robot:
>     population = 0
>
>     def __init__(self, name):
>         self.name=name
>         print ('initializing {0}'.format(self.name))
>         Robot.population+=1
>
>     def __del__(self):
>         '''I'm dying'''
>         print ('{0} is being destroyed!'.format(self.name))
>         Robot.population-=1
>
> droid1 = Robot('D23')
>
> 1- In the class definition above, I don't get the "self.name=name". I understand that the self is supposed to refer to the actual object, but since it is an initialization method, there is no way to enter a name in place of the arguments.

When you instantiate a class you create a new object and initialise it. The way you do that in code is

    object = ClassName(arguments)

What then happens is that the ClassName object is created and its __init__ method is called with arguments passed to it.

So in your case when you do

    droid1 = Robot("D23")

you are NOT assigning the class to droid1, you are creating a new instance of Robot and passing the argument "D23" to the init method as its name parameter. The __init__ method then assigns the name to self.name, i.e. to droid1.name.

The __xxx__ methods are all special methods that Python calls indirectly. __del__ is called when an object is deleted, so it is the complement of __init__. __eq__ is called when we do an equality test:

    if droid1 == droid2

for example will actually call

    droid1.__eq__(droid2)

These methods allow us to change how operators work for our classes.

You might find it useful to read the OOP topic in my tutorial as a secondary source to the book/course you are currently reading.

HTH,

--
Alan Gauld
Author of the Learn to Program web site
http://www.alan-g.me.uk/
Re: [Tutor] Compute data usage from log
On Mon, Oct 26, 2009 at 3:20 AM, Christian Witts wrote:
> fInput = open('/path/to/log.file', 'rb')
> total_usage = 0
> for line in fInput:
>     total_usage += int(line.split(' ')[9].strip())
> print total_usage

It's actually bad to assign the file object to a variable in this case (fInput = ...), because Python will automatically close a file after you're done with it if you iterate over it directly, but if you keep a reference it will stay open until the Python program ends or you explicitly call fInput.close(). It doesn't matter much in this example, but in general it is good practice to either
1) call foo.close() immediately after you're done using a file object, or
2) don't alias the file object and just iterate over it directly so Python will auto-close it.

Therefore a better (and simpler) way to do the above would be:

    total_usage = 0
    for line in open('/path/to/log.file'):
        total_usage += int(line.split(' ')[9])

Also note you don't need to strip the input, because int() coercion ignores surrounding whitespace anyway. And additionally you shouldn't be opening this in binary mode unless you're sure you want to, and I'm guessing the log file is ascii so there's no need for the 'rb'. (Reading is the default so we don't specify an 'r'.)

And since I like list comprehensions a lot, I'd probably do it like this instead:

    total_usage = sum([int(line.split(' ')[9]) for line in open('/path/to/log.file')])

Which incidentally is even shorter, but may be less readable if you don't use list comprehensions often.

Also, the list comprehension version is likely to be more efficient, both because of the use of sum rather than repeated addition (sum is implemented in C) and because list comprehensions in general are a tad faster than explicit iteration, if I recall correctly (don't hold me to that though, I may be wrong).

> Of course this has no error checking and or niceties, but I will leave that
> up to you.

The same applies to my modifications.

Good luck, and let us know if you need anything else!

-Luke
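One more option worth knowing, offered as a sketch rather than as what anyone in the thread proposed: a with block closes the file as soon as the block ends, so the timing does not depend on when the interpreter happens to discard an unreferenced file object (that timing is a detail of CPython's reference counting). The example is written for Python 3, builds its own throwaway log file, and the second log line is invented for the demo.

```python
import os
import tempfile

# Build a small throwaway log file so the example is self-contained.
sample_lines = [
    "Oct 14 11:03:45 cc02695 pppd[3092]: Sent 3489538 bytes, received 43317854 bytes.\n",
    "Oct 15 09:12:01 cc02695 pppd[3141]: Sent 1000 bytes, received 2000 bytes.\n",
]
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
    tmp.writelines(sample_lines)
    log_path = tmp.name

# The file is closed when the with block ends, even if int() raises.
with open(log_path) as log_file:
    # A generator expression feeds sum() without building a list first.
    total_usage = sum(int(line.split(' ')[9]) for line in log_file)

os.remove(log_path)
print(total_usage)
```

The generator expression is the same idea as the list comprehension above, minus the square brackets; whether it is faster in practice would need measuring.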
Re: [Tutor] how to use lxml and win32com?
Alan Gauld wrote:
>
> "elca" wrote
>
>> i want to use IE.navigate function with beautifulsoup or lxml..
>> if anyone know about this or sample.
>
> the parsers both effectively replace the browser so you
> can't really use any functions of IE to modify soup or lxml.
>
> Why do you want to use navigate()? What are you trying to do?
> There is likely to be another way to do it from Python.
>
> HTH,
>
> --
> Alan Gauld
> Author of the Learn to Program web site
> http://www.alan-g.me.uk/

Hello,

Actually I'm making a web scraper, and for the scraping itself JavaScript is no problem. After the scraper is made I will add some other functions, and at that point I will run into a lot of JavaScript, which is why I'm trying to use PAMIE or IE.

http://elca.pastebin.com/m52e7d8e0

I have attached the current scraper script source. In particular I want to change 'thepage = urllib.urlopen(theurl).read()' to the PAMIE method. If possible, could you check it and correct me?

Thanks in advance.
Re: [Tutor] Reading information from a text file into a list of lists (WinXP/py2.6.2)
On Sun, Oct 25, 2009 at 6:53 PM, Katt wrote:
> Hello all,
>
> Currently I am working on a program that reads text from a text file. I
> would like it to place the information into a list and inside the information
> would have sublists of information.
>
> The text file looks like this:
>
> "Old Test","2009_10_20"
> "Current Test","2009_10_25"
> "Future Test","2009_11_01"
>
> I am trying to get the list of lists to look like the following after the
> information is read from the text file:
>
> important_dates = [["Old Test","2009_10_20"],["Current Test","2009_10_25"],["Future Test","2009_11_01"]]

Take a look at the csv module in the standard lib. You can configure it to use comma as the delimiter and it will handle the quotes for you.

Kent
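A minimal sketch of that suggestion, written for Python 3 and using io.StringIO as a stand-in for the text file so it runs on its own. With a real file you would pass the object returned by open(path, newline='') instead; the variable name text is just for the demo.

```python
import csv
import io

# Stand-in for the text file from the question; StringIO behaves like an open file.
text = (
    '"Old Test","2009_10_20"\n'
    '"Current Test","2009_10_25"\n'
    '"Future Test","2009_11_01"\n'
)

# Comma as delimiter and double-quote handling are the csv module defaults.
reader = csv.reader(io.StringIO(text))
important_dates = [row for row in reader]
print(important_dates)
```

Each row comes back as a list of strings with the surrounding quotes already removed, which gives the list of lists asked for.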
Re: [Tutor] Compute data usage from log
On Mon, Oct 26, 2009 at 2:12 PM, Luke Paireepinart wrote:
> Therefore a better (and simpler) way to do the above would be:
>
> total_usage = 0
> for line in open('/path/to/log.file'):
>     total_usage += int(line.split(' ')[9])

Hi Luke,

Your modification seems cleaner; you called the open function directly in the for loop.

> Also note you don't need to strip the input because int() coercion ignores
> whitespace anyway. And additionally you shouldn't be opening this in binary
> mode unless you're sure you want to, and I'm guessing the log file is ascii
> so there's no need for the 'rb'. (reading is default so we don't specify an
> 'r'.)

The file is normal ascii text. I open it with no mode set, and that defaults to read-only.

> And since I like list comprehensions a lot, I'd probably do it like this
> instead:
>
> total_usage = sum([int(line.split(' ')[9]) for line in open('/path/to/log.file')])
>
> Which incidentally is even shorter, but may be less readable if you don't
> use list comprehensions often.
>
> Also, the list comprehension version is likely to be more efficient, both
> because of the use of sum rather than repeated addition (sum is implemented
> in C) and because list comprehensions in general are a tad faster than
> explicit iteration, if i recall correctly (don't hold me to that though, I
> may be wrong.)

I have read up on list comprehensions and I seem to understand the basics. I will play around with the different solutions on hand.

>> Of course this has no error checking and or niceties, but I will leave
>> that up to you.
>
> The same applies to my modifications.
>
> Good luck, and let us know if you need anything else!
>
> -Luke

Thank you as always :-)

--
Best Regards,
bibimidi
Re: [Tutor] how to use lxml and win32com?
"elca" wrote

>>> i want to use IE.navigate function with beautifulsoup or lxml..
>>> if anyone know about this or sample.
>>
>> Why do you want to use navigate()? What are you trying to do?
>> There is likely to be another way to do it from Python.
>
> so why i try to use PAMIE or IE
> http://elca.pastebin.com/m52e7d8e0
> i was attached current scraper script source.

OK, there are several problems in there.

First, are you sure you want to define the function getit() inside a while loop? (I'm pretty sure you don't.) And are you sure you want the function to recurse infinitely? See the last line. (I'm pretty sure you don't.)

Next, do you really want page_check() to sleep for 13 seconds? (Once per letter in the url.)

> especially i want to change 'thepage = urllib.urlopen(theurl).read()' to
> PAMIE method.

And you still don't explain why you don't want to use urlopen. What advantage does using PAMIE offer? I'd expect it to be slower and more memory hungry (since it uses IE under the covers).

HTH,

--
Alan Gauld
Author of the Learn to Program web site
http://www.alan-g.me.uk/
[Tutor] Function design
Hello,

I need your help in designing a function. I have created this script to check whether a certain column in some csv files has the value "0":

    import csv

    def zerofound(csvfile, outputfile, lastcolumn ):
        """Finds columns with zero prices. Column have 0 index"""
        final = csv.writer(open(outputfile, 'wb'), dialect='excel')

        reader = csv.reader(open(csvfile, 'rb'), dialect='excel')
        for row in reader:
            if '0' in row[:lastcolumn]:
                final.writerow(row)


    if __name__ == '__main__':
        zerofound('pricesfrombv.csv', 'noprices_in_BV.csv', 4)
        zerofound('excelbv.csv', 'noprices_in_ExcelBV.csv', 6)

My question is: is it OK to create functions with no "returns"? Basically what I did resembles a subroutine, right? How could I redesign this to use "return"?

Thanks for your input,

Eduardo
Re: [Tutor] Function design
On Mon, Oct 26, 2009 at 2:08 PM, Eduardo Vieira wrote:
> Hello, I need your help in designing a function. I have created this
> script to check out if certain column in some csv files has value "0":
>
> import csv
>
> def zerofound(csvfile, outputfile, lastcolumn ):
>     """Finds columns with zero prices. Column have 0 index"""
>     final = csv.writer(open(outputfile, 'wb'), dialect='excel')
>
>     reader = csv.reader(open(csvfile, 'rb'), dialect='excel')
>     for row in reader:
>         if '0' in row[:lastcolumn]:
>             final.writerow(row)
>
> if __name__ == '__main__':
>     zerofound('pricesfrombv.csv', 'noprices_in_BV.csv', 4)
>     zerofound('excelbv.csv', 'noprices_in_ExcelBV.csv', 6)
>
> My question is. Is it OK to create functions with no "returns"?
> Basically what I did resembles a subroutine, right? How could I
> redesign this to use "return"?

Yes, the general "rule of thumb" in Computer Science is that you try to have functions that either have side-effects or return values, but not both. In practice people sometimes have both (e.g. try to do some side-effects, if they work return a 0 and if they don't return an error code) but try to avoid it if possible.

For example, string.strip() has a return value but no side-effects, because it does not modify the original string; it returns a new copy of the string that is stripped. Whereas list.sort() has no return value (or it returns None, depending on how you look at it) because it modifies the _ORIGINAL LIST_, which is a side-effect.

As for your function, try to define the side-effects and change them into a return value. An example of a side-effect is printing to the screen or writing to a file, not just modifying global scoped or input variables.

*SPOILER*
Your side-effect is that you're writing rows to "final". So you should get rid of all your "final" code and just return the list of rows. I would also suggest renaming the function to find_zeros(), and obviously you would pass just the input filename. Then you would have another function write_rows(outfile, rows) and it would output the rows to outfile. I feel that would be a much cleaner design. write_rows probably won't have a return value since you're writing to a file (side-effect!)
*

Hope that helps!
-Luke

> Thanks for your input,
>
> Eduardo
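A rough sketch of the split described above, using the names find_zeros and write_rows from the message. It is written for Python 3 (so files are opened in text mode with newline='' rather than 'rb'/'wb' as in the Python 2 original), keeps the lastcolumn parameter, and runs against made-up data in a temporary directory; the file names and rows are invented for the demo.

```python
import csv
import os
import tempfile


def find_zeros(csvfile, lastcolumn):
    """Return the rows that have the value '0' in one of the first lastcolumn columns."""
    with open(csvfile, newline='') as infile:
        return [row for row in csv.reader(infile, dialect='excel')
                if '0' in row[:lastcolumn]]


def write_rows(outfile, rows):
    """Write rows to outfile as CSV. Side effect only, so it returns None."""
    with open(outfile, 'w', newline='') as out:
        csv.writer(out, dialect='excel').writerows(rows)


# Demo with made-up data in a temporary directory.
with tempfile.TemporaryDirectory() as tmpdir:
    source = os.path.join(tmpdir, 'prices.csv')
    target = os.path.join(tmpdir, 'noprices.csv')
    write_rows(source, [['A1', '10', '0'], ['B2', '5', '7'], ['C3', '0', '3']])

    zero_rows = find_zeros(source, 3)   # pure lookup, returns a list
    write_rows(target, zero_rows)       # the only place that writes output

    with open(target, newline='') as check:
        written = list(csv.reader(check))

print(zero_rows)
```

One practical benefit of this shape is that find_zeros can be checked without touching the disk for output, and the caller decides what to do with the rows.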
[Tutor] HttpResponse error
SyntaxError at /

("'return' outside function", ('c:\\Users\\Vincent\\Documents\\django_bookmarks\\..\\django_bookmarks\\bookmarks\\views.py', 15, None, 'return HttpResponse(output)\n'))

As you can tell I am very new to this. I am realizing that indentation and syntax are very important, but I don't understand why I am getting this error.

This is the original script in views.py:

    from django.http import HttpResponse

    def main_page(request) :
        output = ''' %s %s%s ''' % (
            'Django Bookmarks',
            'Welcome to Django Bookmarks',
            'Where you can store and share bookmarks!'
        )
    return HttpResponse(output)

And this (below) is the error:

Environment:

Request Method: GET
Request URL: http://127.0.0.1:8000/
Django Version: 1.1.1
Python Version: 2.6.1
Installed Applications:
['django.contrib.auth',
 'django.contrib.contenttypes',
 'django.contrib.sessions',
 'django.contrib.sites']
Installed Middleware:
('django.middleware.common.CommonMiddleware',
 'django.contrib.sessions.middleware.SessionMiddleware',
 'django.contrib.auth.middleware.AuthenticationMiddleware')

Traceback:
File "C:\Python26\Lib\site-packages\django\core\handlers\base.py" in get_response
  83. request.path_info)
File "C:\Python26\Lib\site-packages\django\core\urlresolvers.py" in resolve
  216. for pattern in self.url_patterns:
File "C:\Python26\Lib\site-packages\django\core\urlresolvers.py" in _get_url_patterns
  245. patterns = getattr(self.urlconf_module, "urlpatterns", self.urlconf_module)
File "C:\Python26\Lib\site-packages\django\core\urlresolvers.py" in _get_urlconf_module
  240. self._urlconf_module = import_module(self.urlconf_name)
File "C:\Python26\Lib\site-packages\django\utils\importlib.py" in import_module
  35. __import__(name)
File "C:\Users\Vincent\Documents\django_bookmarks\..\django_bookmarks\urls.py" in
  2. from bookmarks.views import *

Exception Type: SyntaxError at /
Exception Value: ("'return' outside function", ('c:\\Users\\Vincent\\Documents\\django_bookmarks\\..\\django_bookmarks\\bookmarks\\views.py', 15, None, 'return HttpResponse(output)\n'))
Re: [Tutor] HttpResponse error
> Exception Type: SyntaxError at /
>
> Exception Value: ("'return' outside function",
> ('c:\\Users\\Vincent\\Documents\\django_bookmarks\\..\\django_bookmarks\\bookmarks\\views.py',
> 15, None, 'return HttpResponse(output)\n'))

The error message indicates that you've diagnosed the problem correctly: it appears that the "return" statement is flush left rather than being indented. Try indenting it and see if that works.

The reason it does not work is explained here:
http://docs.python.org/reference/simple_stmts.html#the-return-statement

"return may only occur syntactically nested in a function definition, not within a nested class definition."

Also, while this is indeed a generic Python syntax issue, the django forums are typically the best place for help with...well...Django. You can check those out at:
http://groups-beta.google.com/group/django-users?pli=1

Good luck!

Serdar
Re: [Tutor] how to use lxml and win32com?
Alan Gauld wrote:
>> so why i try to use PAMIE or IE
>> http://elca.pastebin.com/m52e7d8e0
>> i was attached current scraper script source.
>
> OK, there are several problems in there.
> First, are you sure you want to define the function getit() inside a
> while loop? (I'm pretty sure you don't) And are you sure you want
> the function to recurse infinitely - see the last line (I'm pretty sure
> you don't)
> Next, do you really want page_check() to sleep for 13 seconds?
> (Once per letter in the url)

--> Everything you say is correct. I don't need getit() and the other functions.

>> especially i want to change 'thepage = urllib.urlopen(theurl).read()' to
>> PAMIE method.
>
> And you still don't explain why you don't want to use urlopen?
> What advantage does using PAMIE offer? I'd expect it to be slower
> and more memory hungry (since it uses IE under the covers).

--> After I make the scraper I will add some other functions, and at that point I need to handle JavaScript. But as you already know, the urlopen method has no option for handling JavaScript, which is why I want to use PAMIE. In addition I have tried other methods, such as Selenium and WebDriver, but they were not so good for me. Thanks for your help.
Re: [Tutor] HttpResponse error
Vincent Jones wrote:
> SyntaxError at /
>
> ("'return' outside function",
> ('c:\\Users\\Vincent\\Documents\\django_bookmarks\\..\\django_bookmarks\\bookmarks\\views.py',
> 15, None, 'return HttpResponse(output)\n'))
>
> from django.http import HttpResponse
>
> def main_page(request) :
>     output = ''' %s %s%s ''' % (
>         'Django Bookmarks',
>         'Welcome to Django Bookmarks',
>         'Where you can store and share bookmarks!'
>     )
> return HttpResponse(output)

The return line needs to be indented the same as the other line(s) in the function definition, which is to say it has to line up with the output= line. There really are only two lines in the body of the function, and they need to start at the same column.

(It'd be easier on all of us if the code in your message weren't doublespaced, as well. That might be an email setting. Try using plain text, and see if it works better.)

DaveA
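To see the rule in isolation, without Django, here is a small sketch that compiles two versions of a cut-down function held as strings. The function body and the file name views_example.py are made up for the demo; only the indentation of the return line differs between the two versions.

```python
# Two hypothetical versions of a tiny function, kept as strings so that
# both can be compiled here without stopping the script.
broken_source = (
    "def main_page(request):\n"
    "    output = 'Welcome'\n"
    "return output\n"          # flush left: no longer inside the function
)
fixed_source = (
    "def main_page(request):\n"
    "    output = 'Welcome'\n"
    "    return output\n"      # lines up with the output = line
)

results = {}
for label, source in (("broken", broken_source), ("fixed", fixed_source)):
    try:
        compile(source, "views_example.py", "exec")
        results[label] = "compiles"
    except SyntaxError as err:
        results[label] = "fails: " + err.msg

for label, outcome in results.items():
    print(label, "->", outcome)
```

The broken version is rejected before any of it runs, which matches what happened in the thread: Django hit the SyntaxError while importing views.py, not while serving the page.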
Re: [Tutor] Function design
Luke Paireepinart wrote:
> Your side-effect is that you're writing rows to "final". So you should get
> rid of all your "final" code and just return the list of rows.
> [snip]

I agree with Luke's comments. But I'd like to point out an apparent bug (I haven't tried the code, this is just by inspection).

You use the test

    if '0' in row[.]

Since row[...] is a list of strings, that checks whether some field is exactly the string '0'. It will not match a zero written any other way, such as 0.00 or 00. (Applied to a single string rather than a list, the same test would look for a zero digit character somewhere in the value, so 504 would also be recognized, as well as 540.)

Normally, if it's a numeric field, you want to convert it to a number and check its value, rather than do string manipulations directly, since a value 05 might really be intended to be the same as 5.

DaveA
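It is worth checking exactly what the '0' in row[...] test matches, because the in operator behaves differently on a list than on a single string. The sketch below is written for Python 3; the sample row and the helper name has_zero_value are invented for illustration, and the helper assumes prices can be parsed with float().

```python
row = ['item', '504', '0.00', '0']

# Membership on a list compares whole elements, so only the exact string '0' matches.
print('0' in row[:2])   # False: neither 'item' nor '504' equals '0'
print('0' in row)       # True: the last field is exactly '0'

# Membership on a single string is a substring test.
print('0' in '504')     # True: the character '0' occurs in '504'


def has_zero_value(fields):
    """Return True if any field parses as a number equal to zero."""
    for field in fields:
        try:
            if float(field) == 0:
                return True
        except ValueError:
            # Non-numeric field such as a product name: ignore it.
            pass
    return False


print(has_zero_value(['item', '504']))    # False
print(has_zero_value(['item', '0.00']))   # True
```

So the original test does not give false hits on 504 when the row is a list, but it does miss zeros spelled 0.00 or 00, which is where converting to a number helps.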
Re: [Tutor] Function design
"Eduardo Vieira" wrote

> def zerofound(csvfile, outputfile, lastcolumn ):
>     """Finds columns with zero prices. Column have 0 index"""
>     final = csv.writer(open(outputfile, 'wb'), dialect='excel')
>
>     reader = csv.reader(open(csvfile, 'rb'), dialect='excel')
>     for row in reader:
>         if '0' in row[:lastcolumn]:
>             final.writerow(row)
>
> My question is. Is it OK to create functions with no "returns"?

Yes, it's what some other languages call a procedure. But in this case it's probably not the best route.

> redesign this to use "return"?

You could return the number of rows found; that way the user could check for non-zero to see if it's worthwhile even looking in the output file.

Also I think the name of the function could be changed, since it actually creates a file of rows with zero. So why not call it write_zero_rows() or some such; then the row count as a return value would be even more natural.

HTH,

--
Alan Gauld
Author of the Learn to Program web site
http://www.alan-g.me.uk/