[Tutor] problem with creating paths
I am a novice (at programming).
I use MAC OS 10.13.6 Anaconda. Python 3.5.4 Spyder 3.5.6

I have just re-written a moderately complex program (a Class) on the advice of Alan and Steven. The rewriting proved to be very useful. The working program uses instances of the Class with User chosen parameters. The output data seems correct to me.

So I then began redoing all the tests. The first Methods tested gave OK. But I have just started testing a new Method and I get a Universal error in my tests. It says that the output file is already present. These files (paths) are correctly deleted by the 'teardown' Method, when only the earlier portion of the program is tested. But are not deleted with the last method tested.

After searching I have found this unexpected output illustrated in the copy-paste below.

test

The type of the paths is:

The values of the paths are :

([
'/Users/sydney/AnacondaProjects/reproduction/Current_Version/Results/',
'/Users/sydney/AnacondaProjects/reproduction/Current_Version/Results/20181017D',
'/Users/sydney/AnacondaProjects/reproduction/Current_Version/Results/20181017D/A_POCI_Input_Data',
. . . .
'/Users/sydney/AnacondaProjects/reproduction/Current_Version/Results/20181017D/B_Cycle_Zero/Text_Files',
. . .
'/Users/sydney/AnacondaProjects/reproduction/Current_Version/Results/20181017D/C_Final_Results/Plots/Population_Data/Ratios'],
'/Users/sydney/.Trash/20181017D/B_Cycle_Zero/Text_Files')

There are two items that are 'wrong' in this output.

1. The property 'paths' is defined in the program as a list and the items are added using paths.append(), yet the test says that when tested it is a tuple.
2. The tuple arises by the addition of the last entry in the file, AFTER the closing bracket of the list which is the first item in the tuple.

When I test the length of 'paths' I get a value of 2!

I apologise for the lengthy explanation, but I am at a loss.

I have looked for an error that might have added an item as a + and I find nothing.
The character of the final item is also puzzling to me.

I would much appreciate any guidance as to how I should search for the fault or error.

Sydney

_______________________________________________
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] problem with creating paths
On Wed, Oct 17, 2018 at 03:18:00PM +0100, Shall, Sydney via Tutor wrote:

[...]
> After searching I have found this unexpected output illustrated in the
> copy-paste below.
>
> test
>
> The type of the paths is:

Believe Python when it tells you something is a tuple. Trust me, it knows!

> The values of the paths are :
>
> ([
> '/Users/sydney/AnacondaProjects/reproduction/Current_Version/Results/',
> '/Users/sydney/AnacondaProjects/reproduction/Current_Version/Results/20181017D',
> '/Users/sydney/AnacondaProjects/reproduction/Current_Version/Results/20181017D/A_POCI_Input_Data',
[...]
> '/Users/sydney/AnacondaProjects/reproduction/Current_Version/Results/20181017D/B_Cycle_Zero/Text_Files',
[...]
> '/Users/sydney/AnacondaProjects/reproduction/Current_Version/Results/20181017D/C_Final_Results/Plots/Population_Data/Ratios'],
> '/Users/sydney/.Trash/20181017D/B_Cycle_Zero/Text_Files')

Here is a good trick for debugging. Start by simplifying your data, to make it easier to see the needle in the haystack. Long paths like the above present you with a wall of text, so (temporarily) replace each path with a single character to cut down on the visual complexity. That makes it easier to see what is going on.

Replacing each path with a single letter gives me:

    (['a', 'b', 'c', [...] 'x', 'y'], 'z')

Notice that your value is a tuple of two items:

    Item One is a list, ['a', 'b', 'c', [...] 'x', 'y']
    Item Two is a string, 'z'

> There are two items that are 'wrong' in this output.
>
> 1. The property 'paths' is defined in the program as a list and the
> items are added using paths.append(), yet the test says that when tested
> it is a tuple.
> 2. The tuple arises by the addition of the last entry in the file, AFTER
> the closing bracket of the list which is the first item in the tuple.
>
> When I test the length of 'paths' I get a value of 2!

That's because it is a tuple of two items.

> I apologise for the lengthy explanation, but I am at a loss.
>
> I have looked for an error that might have added an item as a + and I
> find nothing.

Without seeing your code, there's no way of telling how you constructed this value. You intended a list, and built a list (somehow!), but then you did *something* to replace it with a tuple. Perhaps you did:

    paths = []
    for some_path in something_or_rather:
        paths.append(some_path)

then later on:

    paths = (paths, another_path)

but there's a million ways you could have got the same result. And of course you could have used any variable name... I'm assuming it is called "paths", but you should substitute whatever name (or names!) you actually used.

> I would much appreciate any guidance as to how I should search for the
> fault or error.

Start by looking for any line of code that starts with:

    paths =

and see if and where you replaced the list with a tuple.

If that gets you nowhere, start looking for *every* reference to "paths" and see what they do.

If *that* gets you nowhere, start adding debugging code to your program. Put assertions like this:

    assert isinstance(paths, list)

in various parts of the code, then run the program and see where it fails. That tells you that *at that point* paths is no longer a list. E.g. something like this:

    # build the paths...
    paths = []
    for blah blah blah blah:
        paths.append(whatever)
    assert isinstance(paths, list)  # line 50 (say)
    do_this()
    do_that()
    assert isinstance(paths, list)  # line 53
    do_something_else()
    and_another_thing()
    assert isinstance(paths, list)  # line 56

If the first two assertions on line 50 and 53 pass, but the third at line 56 fails, you know that the bug is introduced somewhere between line 53 and 56.

--
Steve
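The assertion technique Steve describes can be tried out on a toy case. The following is only a sketch: the helper names `build_paths` and `add_trash_entry` and the `/tmp/demo` paths are hypothetical stand-ins, not taken from Sydney's program, and the "bug" is planted deliberately so that one assertion fails.

```python
def build_paths():
    """Build the list with append(), as the original post describes."""
    paths = []
    for name in ("Results", "Results/A", "Results/B"):
        paths.append("/tmp/demo/" + name)
    return paths


def add_trash_entry(paths):
    """Hypothetical buggy step: wraps the list in a tuple instead of appending."""
    return (paths, "/tmp/demo/.Trash/B")


def run():
    paths = build_paths()
    assert isinstance(paths, list), "after build_paths"
    paths = add_trash_entry(paths)
    assert isinstance(paths, list), "after add_trash_entry"
    return paths


try:
    run()
    message = "paths stayed a list"
except AssertionError as err:
    # The assertion message names the step that replaced the list.
    message = "paths stopped being a list " + str(err)

print(message)
```

The message attached to each assertion plays the role of the line numbers in Steve's example: the first assertion that fails brackets the step where the name was rebound. Note that assertions are skipped when Python runs with the `-O` option.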
[Tutor] problem with creating paths
I can now add to my previous email the following observation.

If I do not delete the output file and redo the test I get the following as the 'extra' entry in paths:

'/Users/sydney/AnacondaProjects/capital_reproduction/Current_Version/Results/20181017D/B_Cycle_Zero/Text_Files')

If however, I delete the output file and then redo the test I get the following as the 'extra' entry in paths:

'/Users/sydney/.Trash/20181017D/B_Cycle_Zero/Text_Files')

This seems to be consistent.

The upper incorrect entry is item path19 in the list part of paths. I have studied every example of 'path19' in the program and I cannot find an explanation.

help!

Sydney

--
Professor Sydney Shall
Department of Haematology/Oncology
Phone: +(0)2078489200
E-Mail: sydney.shall
[Correspondents outside the College should add @kcl.ac.uk]
Re: [Tutor] problem with creating paths
Shall, Sydney via Tutor wrote:

> There are two items that are 'wrong' in this output.
>
> 1. The property 'paths' is defined in the program as a list and the
> items are added using paths.append(), yet the test says that when tested
> it is a tuple.

>>> paths = ["foo", "bar"],
>>> paths += "baz",  # ("baz",) leads to the same result
>>> paths
(['foo', 'bar'], 'baz')

This might happen if you have accidental trailing commas in two places, but that seems rather unlikely. Maybe you have a function with a star

def f(*args):
    ...  # args is a tuple, even if you pass one argument or no arguments
         # to f().

?

> 2. The tuple arises by the addition of the last entry in the file, AFTER
> the closing bracket of the list which is the first item in the tuple.

Sorry, I don't understand that sentence.

> When I test the length of 'paths' I get a value of 2!

That's because you have a tuple with two entries, one list and one string.

> I apologise for the lengthy explanation, but I am at a loss.
>
> I have looked for an error that might have added an item as a + and I
> find nothing.
> The character of the final item is also puzzling to me.
>
> I would much appreciate any guidance as to how I should search for the
> fault or error.

Look at the assignments. If you find the pattern I mentioned above remove the commas. If you don't see anything suspicious make a test script containing only the manipulation of the paths variable. Make sure it replicates the behaviour of the bigger script; then show it to us.
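Both of Peter's suggestions can be checked in a short script. This is a sketch only; the function name `collect` is made up for illustration and nothing here is taken from Sydney's code.

```python
# A trailing comma after a list literal creates a one-element tuple.
paths = ["foo", "bar"],
print(type(paths).__name__, len(paths))   # tuple 1

# Augmented assignment with another one-element tuple concatenates the two.
paths += "baz",
print(paths)                              # (['foo', 'bar'], 'baz')


def collect(*args):
    """Star-arguments always arrive as a tuple, whatever was passed."""
    return args


result = collect(["foo", "bar"], "baz")
print(type(result).__name__, len(result))  # tuple 2
print(collect())                           # ()
```

Either route ends with a two-item tuple whose first item is the original list, which matches the shape of the value in the original post.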
Re: [Tutor] problem with creating paths
Firstly, I would like to thank Steven for reminding me of the assert statement. I should have remembered this. It allowed me to isolate the problem, which predictably (for me) was very elementary. I am too embarrassed to say how simple the error was.

However, my original problem was not solved by correcting this error.

So, I will now try and narrow down the location of the problem and then if I cannot solve it, I shall return for more good advice.

Many thanks to Steven and to Peter.

Sydney
Re: [Tutor] problem with creating paths
On 10/17/2018 10:07 AM, Shall, Sydney via Tutor wrote:
> Firstly, I would like to thank Steven for reminding me of the assert
> statement. I should have remembered this. It allowed me to isolate the
> problem, which predictably (for me) was very elementary. I am too
> embarrassed to say how simple the error was.
>
> However, my original problem was not solved by correcting this error.
>
> So, I will now try and narrow down the location of the problem and then
> if I cannot solve it, I shall return for more good advice.
>
> Many thanks to Steven and to Peter.

I'll weigh in with a mini- (and unasked-for-) lecture here:

this is often the point at which someone says "boy, I wish Python were strongly typed, so things didn't change types in flight".

But in fact, the list didn't change types, it's still a list. In fact we even know where that list is: it's the first element of that tuple you ended up with. The _name_ you gave to that list carries no typing meaning, however (although you can give it type hints that an external tool could use to warn you that you are changing something). So you have somewhere given that name to a completely different object, a tuple which contains your list and another element.

So clearly what you're looking for is the place that happens.

So here's a sketch of how you might use type hinting to find this, to bring it back to something practical:

=== types.py:
from typing import List, Tuple

a: List[str] = [
    '/a/path',
    '/b/path',
]
print(type(a))
print(a)

a = (a, '/c/path')
print(type(a))
print(a)

=== this works just fine:
$ python3 types.py
<class 'list'>
['/a/path', '/b/path']
<class 'tuple'>
(['/a/path', '/b/path'], '/c/path')

=== but a hinting tool can see a possible issue:
$ mypy types.py
types.py:10: error: Incompatible types in assignment (expression has type "Tuple[List[str], str]", variable has type "List[str]")
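Mats's point that the list never changed type, and that only the name was rebound, can be seen by keeping a second name bound to the same list. A minimal sketch, with variable names chosen purely for illustration:

```python
original = ["/a/path", "/b/path"]
paths = original               # two names, one list object
paths = (paths, "/c/path")     # rebinds the name 'paths' to a new tuple

print(type(paths).__name__)      # tuple
print(paths[0] is original)      # True: the list is still there, unchanged
print(type(original).__name__)   # list
```

The `is` check confirms that the first element of the tuple is the very same list object, not a copy of it.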
Re: [Tutor] problem with creating paths
On 17/10/2018 18:18, Mats Wichmann wrote:
> [...]

Thanks for this. Most helpful.

Sydney
Re: [Tutor] problem with creating paths
On 17/10/18 18:18, Mats Wichmann wrote:
> [...]
> I'll weigh in with a mini- (and unasked-for-) lecture here:
>
> this is often the point at which someone says "boy, I wish Python were
> strongly typed, so things didn't change types in flight".

This is where I say Python *IS* strongly, although dynamically, typed. Why do people have such a problem with this?

--
My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language.

Mark Lawrence
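The distinction Mark draws can be illustrated in a few lines. This is a sketch with a made-up helper, `try_add`: "strong" here means that Python refuses to mix incompatible types silently, and "dynamic" means that a name is free to refer to objects of different types over time.

```python
def try_add(a, b):
    """Return a + b, or the name of the exception Python raises."""
    try:
        return a + b
    except TypeError as err:
        return type(err).__name__


# Strong typing: str and int are not silently coerced into each other.
print(try_add("1", 1))        # TypeError
print(try_add(1, 1))          # 2

# Dynamic typing: the same name can be bound to a list, then to a tuple.
value = [1, 2]
value = (value, 3)
print(type(value).__name__)   # tuple
```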
[Tutor] Performance Issue
I have written a screen scraping program that watches a clock (on the app's server) and at 7:00:00 AM dashes to make a reservation online. It works fine. However, I have spent time trying to improve its performance. I am using selenium, with chrome driver.

Here is what I have learned. I have tried various methods to find (by link_text, by_xpath, etc.) and click on the element in question (shown below). When I find the element with no click, the find process takes about 0.02 seconds. When I find it with a click (I need to select the element and move to the next iframe) it takes over a second. I get these same results no matter which find_element_by variation I use, and I get the same times in headless or normal mode.

Here is my theory - finding the element is relatively simple in the html already loaded into my machine - hence 0.02 seconds. However, when I click on the element, processing goes out to the server which does some stuff and I get a new iframe displayed, all of which takes time. So I have sort of concluded that perhaps I can't take a big chunk of that time out (literally the same statement without the click option takes 2% of the time), but am hoping perhaps someone has another thought. I had thought maybe I could jump to the second iframe locally, but can't see a way to do this. I also have considered something other than selenium, but since I think the problem lies on the server side, not sure it is worth the time.

Thanks in advance for any ideas.

The program is quite large, but here is the relevant section:

    # Back from NAP - REALLY CLOSE to 7:00 AM select the date desired and go to the next page (really iframe) - using prepared xpath
    try:
        br.find_element_by_link_text(str(day_to_book)).click()
        #sleep(refresh_factor)
    except NoSuchElementException:
        self.queue.put("- (" + thread + ") Attempted date selection too early? "
                       + str(datetime.datetime.now()
                             + datetime.timedelta(seconds = second_difference))[-11:-4])
        return

Here is the relevant html (in this case I have copied the html for the 31st of the month, but all dates look the same, which is why find_element_by_link_text [with day_to_book = 31] is easy to use).

Again, my code works fine - I am trying to see if there is a way to improve performance with some trick I can't come up with.
Re: [Tutor] Performance Issue
On 17/10/18 22:25, Stephen Smith wrote:
> I have written a screen scraping program that watches a clock (on the app's
> server) and at 7:00:00 AM dashes to make a reservation on line. It works
> fine. However, i have spent time trying to improve its performance. I am
> using selenium, with chrome driver.

When doing performance tuning the first thing to answer is what does improved performance mean. For example in a Word Processor improving the speed that an input character appears on screen by 10% is unlikely to be a worthwhile exercise. But improving the time taken to do a global search/replace by 10% might well be worthwhile.

So what do you want to improve about an app that spends most of its time waiting for a change on a remote server (presumably by polling?) Is it the speed/frequency of polling? The speed of reading the response? The speed of processing the response?

And knowing what you want to improve, have you measured it to see where the time is being spent? Is it in the client request? The transmission to the server? The server processing? The transmission from the server? The reading of that response? Or the processing of that response?

You need to time each of those phases accurately to find out which bits are worth improving.

> Here is what i have learned. I have tried various methods to find (by
> link_text, by_xpath, etc.) and click on the element in question (shown
> below). When i find the element with no click, the find process takes about
> .02 seconds. When i find it with a click (i need to select the element and
> move to the next iframe) it takes over a second. I get these same results no
> matter which find_element_by variation i use and i get the same times in
> headless or normal mode.
>
> Here is my theory - finding the element is relatively simple in the html
> already loaded into my machine - hence .02 seconds. However, when i click on
> the element, processing goes out to the server which does some stuff and i
> get a new iframe displayed, all of which takes time.

Absolutely. Network access is likely to be measured in 10ths of a second rather than hundredths. And processing the request may well entail a server database call (which may itself be on a separate machine from the web server with a corresponding LAN message delay), then there's the creation and transmission of the HTML (unless your server provides an API with JSON responses - but then you don't need clicks etc!) And iframes make that worse since every iframe effectively gets treated as a separate html document.

Then when your client receives the data it has to reparse the html into a document structure before performing the search.

> concluded that perhaps I can't take a big chunk of that time out

You probably can, but only if you have access to the server code and the network infrastructure and deep enough pockets for a server upgrade or a new proxy server. Assuming that's not the case then no, you need to look at other options. But your first step has to be to measure the various stages of the request.

If the problem lies in the transmission time across the network there is probably not much you can do. If it's in the database access (trickier to measure if you don't have the server code - you need to create some simultaneous equations using multiple test scenarios) then you might be able to construct better queries (e.g. look at a different page or only query the target iframe).

> considered something other than selenium, but since i think the problem lies
> on the server side, not sure it is worth the time.

It depends on the nature of the page. The best solution, by far, is not to do web scraping. It's always the worst case solution and to be avoided if at all possible. Try to find an API with JSON or XML responses.

Also, are you sure you need to use the clock on the page? Isn't the server clock adequate? In which case the response time should be in every message header so there's no need for web scraping at all...

Finally, I think there is an active Selenium discussion forum so you could try there for more ideas.

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos
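One way to take the per-phase timings Alan asks for is to wrap each step in a small timing helper. The following is a sketch under stated assumptions: `timed`, `fake_find` and `fake_click` are hypothetical names, the two `fake_` functions merely stand in for the real Selenium find and click calls, and the 50 ms sleep is an invented stand-in for the server round trip.

```python
import time


def timed(label, func, *args):
    """Call func(*args) and return (label, elapsed_seconds, result)."""
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    return label, elapsed, result


def fake_find(day):
    """Stand-in for locating the element in the already loaded page."""
    return "element-%s" % day


def fake_click(element):
    """Stand-in for the click; the sleep imitates a server round trip."""
    time.sleep(0.05)
    return "clicked " + element


_, find_time, element = timed("find", fake_find, 31)
_, click_time, outcome = timed("click", fake_click, element)

print("find:  %.4f s" % find_time)
print("click: %.4f s" % click_time)
```

Timing the real find and the real click separately in this way would show how much of the total is local work and how much is waiting on the server, which is the question to settle before changing tools.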