[Tutor] problem with creating paths

2018-10-17 Thread Shall, Sydney via Tutor

I am a novice (at programming).
I use MAC OS 10.13.6
Anaconda.
Python 3.5.4
Spyder 3.5.6

I have just re-written a moderately complex program (a Class) on the 
advice of Alan and Steven. The rewriting proved to be very useful.


The working program uses instances of the Class with User chosen 
parameters. The output data seems correct to me.


So I then began redoing all the tests. The first Methods tested gave OK.
But I have just started testing a new Method and I get a Universal error 
in my tests. It says that the output file is already present. These 
files (paths) are correctly deleted by the 'teardown' Method when only 
the earlier portion of the program is tested, but they are not deleted 
when the last method is tested.


After searching I have found this unexpected output illustrated in the 
copy-paste below.



test


The type of the paths is: <class 'tuple'>

The values of the paths are :


([
'/Users/sydney/AnacondaProjects/reproduction/Current_Version/Results/',
'/Users/sydney/AnacondaProjects/reproduction/Current_Version/Results/20181017D',
'/Users/sydney/AnacondaProjects/reproduction/Current_Version/Results/20181017D/A_POCI_Input_Data',
.
.
.
'/Users/sydney/AnacondaProjects/reproduction/Current_Version/Results/20181017D/B_Cycle_Zero/Text_Files',
.
.
.
'/Users/sydney/AnacondaProjects/reproduction/Current_Version/Results/20181017D/C_Final_Results/Plots/Population_Data/Ratios'],
'/Users/sydney/.Trash/20181017D/B_Cycle_Zero/Text_Files')



There are two items that are 'wrong' in this output.

1. The property 'paths' is defined in the program as a list and the 
items are added using paths.append(), yet the test says that when tested 
it is a tuple.
2. The tuple arises by the addition of the last entry in the file, AFTER 
the closing bracket of the list which is the first item in the tuple.


When I test the length of 'paths' I get a value of 2!

I apologise for the lengthy explanation, but I am at a loss.

I have looked for an error that might have added an item as a + and I 
find nothing.

The character of the final item is also puzzling to me.

I would much appreciate any guidance as to how I should search for the 
fault or error.


Sydney






___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] problem with creating paths

2018-10-17 Thread Steven D'Aprano
On Wed, Oct 17, 2018 at 03:18:00PM +0100, Shall, Sydney via Tutor wrote:

[...]
> After searching I have found this unexpected output illustrated in the 
> copy-paste below.
> 
> 
> test
> 
> 
> The type of the paths is: <class 'tuple'>

Believe Python when it tells you something is a tuple. Trust me, it 
knows!

> The values of the paths are :
> 
> 
> ([
> '/Users/sydney/AnacondaProjects/reproduction/Current_Version/Results/',
> '/Users/sydney/AnacondaProjects/reproduction/Current_Version/Results/20181017D',
> '/Users/sydney/AnacondaProjects/reproduction/Current_Version/Results/20181017D/A_POCI_Input_Data',
[...]
>  
> '/Users/sydney/AnacondaProjects/reproduction/Current_Version/Results/20181017D/B_Cycle_Zero/Text_Files',
[...]
> '/Users/sydney/AnacondaProjects/reproduction/Current_Version/Results/20181017D/C_Final_Results/Plots/Population_Data/Ratios'],
> '/Users/sydney/.Trash/20181017D/B_Cycle_Zero/Text_Files')

Here is a good trick for debugging. Start by simplifying your data, to 
make it easier to see the needle in the haystack. Long paths like the 
above present you with a wall of text, so (temporarily) replace each 
path with a single character to cut down on the visual complexity. That 
makes it easier to see what is going on.

Replacing each path with a single letter gives me:

(['a', 'b', 'c', [...] 'x', 'y'], 'z')

Notice that your value is a tuple of two items:

Item One is a list, ['a', 'b', 'c', [...] 'x', 'y']
Item Two is a string, 'z'
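
The same inspection can be done in code rather than by eye. This is only a minimal sketch: the single-letter values stand in for the real paths, and the elided middle of the list is simply left out.

```python
# Stand-in data with the same shape as the reported value
# (single letters replace the real paths; this is illustrative only).
paths = (['a', 'b', 'c', 'x', 'y'], 'z')

# Outer container: its type and how many items it holds.
print(type(paths).__name__, len(paths))

# Each item of the outer container, with its own type.
for index, item in enumerate(paths):
    print(index, type(item).__name__, item)
```

Printing the type of each item separately makes it obvious that the list is still there, sitting at index 0 of the tuple.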


> There are two items that are 'wrong' in this output.
> 
> 1. The property 'paths' is defined in the program as a list and the 
> items are added using paths.append(), yet the test says that when tested 
> it is a tuple.
> 2. The tuple arises by the addition of the last entry in the file, AFTER 
> the closing bracket of the list which is the first item in the tuple.
> 
> When I test the length of 'paths' I get a value of 2!

That's because it is a tuple of two items.


> I apologise for the lengthy explanation, but I am at a loss.
> 
> I have looked for an error that might have added an item as a + and I 
> find nothing.

Without seeing your code, there's no way of telling how you constructed 
this value. You intended a list, and built a list (somehow!), but then 
you did *something* to replace it with a tuple.

Perhaps you did:

paths = []
for some_path in something_or_rather:
    paths.append(some_path)

then later on:

paths = (paths, another_path)


but there's a million ways you could have got the same result. And of 
course you could have used any variable name... I'm assuming it is 
called "paths", but you should substitute whatever name (or names!) you 
actually used.


> I would much appreciate any guidance as to how I should search for the 
> fault or error.

Start by looking for any line of code that starts with:

paths =

and see if and where you replaced the list with a tuple. If that gets 
you nowhere, start looking for *every* reference to "paths" and see what 
they do.

If *that* gets you nowhere, start adding debugging code to your program. 
Put assertions like this:

assert isinstance(paths, list)


in various parts of the code, then run the program and see where it 
fails. That tells you that *at that point* paths is no longer a list. 
E.g. something like this:


# build the paths...
paths = []
for blah blah blah blah:
    paths.append(whatever)

assert isinstance(paths, list)  # line 50 (say)
do_this()
do_that()
assert isinstance(paths, list)  # line 53
do_something_else()
and_another_thing()
assert isinstance(paths, list)  # line 56


If the first two assertions on line 50 and 53 pass, but the third at 
line 56 fails, you know that the bug is introduced somewhere between 
line 53 and 56.
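
The checkpoint idea above can be tried out in a small self-contained sketch. Everything named here (check_is_list, build_paths, buggy_step) is hypothetical and only stands in for the real program, whose code we have not seen; the "bug" is one guess among many at how a tuple could appear.

```python
def check_is_list(value, label):
    """Fail, naming the checkpoint, as soon as value is no longer a list."""
    assert isinstance(value, list), (
        "at %s: expected list, got %s" % (label, type(value).__name__))


def build_paths():
    # Hypothetical stand-in for the code that builds the list of paths.
    paths = []
    for name in ("Results", "Results/A", "Results/B"):
        paths.append(name)
    return paths


def buggy_step(paths):
    # Hypothetical bug: returns a tuple that wraps the list.
    return paths, "extra"


paths = build_paths()
check_is_list(paths, "after build")           # passes

paths = buggy_step(paths)
try:
    check_is_list(paths, "after buggy_step")  # fails: paths is now a tuple
except AssertionError as err:
    message = str(err)
    print(message)
```

Giving each assertion a label saves counting line numbers: the message names the first checkpoint at which the value was no longer a list.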


-- 
Steve


[Tutor] problem with creating paths

2018-10-17 Thread Shall, Sydney via Tutor

I can now add to my previous email the following observation.

If I do not delete the output file and redo the test I get the following 
as the 'extra' entry in paths:



'/Users/sydney/AnacondaProjects/capital_reproduction/Current_Version/Results/20181017D/B_Cycle_Zero/Text_Files')

If however, I delete the output file and then redo the test I get the 
following as the 'extra' entry in paths:


 '/Users/sydney/.Trash/20181017D/B_Cycle_Zero/Text_Files')

This seems to be consistent.
The upper incorrect entry is item path19 in the list part of paths. I 
have studied every example of 'path19' in the program and I cannot find 
an explanation.


help!

Sydney



Professor Sydney Shall
Department of Haematology/Oncology
Phone: +(0)2078489200
E-Mail: sydney.shall
[Correspondents outside the College should add @kcl.ac.uk]


Re: [Tutor] problem with creating paths

2018-10-17 Thread Peter Otten
Shall, Sydney via Tutor wrote:

> There are two items that are 'wrong' in this output.
> 
> 1. The property 'paths' is defined in the program as a list and the
> items are added using paths.append(), yet the test says that when tested
> it is a tuple.

>>> paths = ["foo", "bar"],
>>> paths += "baz",  # ("baz",) leads to the same result
>>> paths
(['foo', 'bar'], 'baz')

This might happen if you have accidental trailing commas in two places, but 
that seems rather unlikely. Maybe you have a function with a star

def f(*args):
    ...  # args is a tuple, even if you pass one argument or no arguments
         # to f().

?
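
A minimal sketch of the star-argument case, assuming a made-up function named collect (not taken from the original program). It shows how a list passed to such a function ends up wrapped in a tuple.

```python
def collect(*args):
    # args is always a tuple holding the positional arguments.
    return args


paths = ['a', 'b']

print(collect(paths))        # (['a', 'b'],)      the list wrapped in a 1-tuple
print(collect(paths, 'z'))   # (['a', 'b'], 'z')  same shape as the reported value
print(collect(*paths))       # ('a', 'b')         unpacked: the items themselves
```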

> 2. The tuple arises by the addition of the last entry in the file, AFTER
> the closing bracket of the list which is the first item in the tuple.

Sorry, I don't understand that sentence.

> When I test the length of 'paths' I get a value of 2!

That's because you have a tuple with two entries, one list and one string.

> I apologise for the lengthy explanation, but I am at a loss.
> 
> I have looked for an error that might have added an item as a + and I
> find nothing.
> The character of the final item is also puzzling to me.
> 
> I would much appreciate any guidance as to how I should search for the
> fault or error.

Look at the assignments. If you find the pattern I mentioned above remove 
the commas. 

If you don't see anything suspicious make a test script containing only the 
manipulation of the paths variable. Make sure it replicates the behaviour of 
the bigger script; then show it to us.



Re: [Tutor] problem with creating paths

2018-10-17 Thread Shall, Sydney via Tutor
Firstly, I would like to thank Steven for reminding me of the assert 
statement. I should have remembered this. It allowed me to isolate the 
problem, which predictably (for me) was very elementary. I am too 
embarrassed to say how simple the error was.


However, my original problem was not solved by correcting this error.

So, I will now try and narrow down the location of the problem and then 
if I cannot solve it, I shall return for more good advice.


Many thanks to Steven and to Peter.

Sydney


Re: [Tutor] problem with creating paths

2018-10-17 Thread Mats Wichmann
On 10/17/2018 10:07 AM, Shall, Sydney via Tutor wrote:
> Firstly, I would like to thank Steven for reminding me of the assert
> statement. I should have remembered this. It allowed me to isolate the
> problem, which predictably (for me) was very elementary. I am too
> embarrassed to say how simple the error was.
> 
> However, my original problem was not solved by correcting this error.
> 
> So, I will now try and narrow down the location of the problem and then
> if I cannot solve it, I shall return for more good advice.
> 
> Many thanks to Steven and to Peter.

I'll weigh in with a mini- (and unasked-for-) lecture here:

this is often the point at which someone says "boy, I wish Python were
strongly typed, so things didn't change types in flight".

But in fact, the list didn't change types, it's still a list. In fact we
even know where that list is: it's the first element of that tuple you
ended up with. The _name_ you gave to that list carries no typing
meaning, however (although you can give it type hints that an external
tool could use to warn you that you are changing something).  So you
have somewhere given that name to a completely different object, a tuple
which contains your list and another element. So clearly what you're
looking for is the place that happens.
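
The difference between a name and the object it refers to can be checked directly. This is a small illustrative sketch; the name "original" is introduced here only to keep a second reference to the list.

```python
paths = ['a', 'b']
original = paths              # a second name for the same list object

paths = (paths, 'z')          # rebinding: 'paths' now names a new tuple

print(type(original).__name__)   # list   (the list itself never changed type)
print(paths[0] is original)      # True   (the list is item 0 of the tuple)

original.append('c')             # still an ordinary, mutable list
print(paths)                     # (['a', 'b', 'c'], 'z')
```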

So here's a sketch of how you might use type hinting to find this, to
bring it back to something practical:


=== types.py:
from typing import List, Tuple

a: List[str] = [
    '/a/path',
    '/b/path',
]
print(type(a))
print(a)

a = (a, '/c/path')
print(type(a))
print(a)

=== this works just fine:
$ python3 types.py
<class 'list'>
['/a/path', '/b/path']
<class 'tuple'>
(['/a/path', '/b/path'], '/c/path')

=== but a hinting tool can see a possible issue:
$ mypy types.py
types.py:10: error: Incompatible types in assignment (expression has
type "Tuple[List[str], str]", variable has type "List[str]")




Re: [Tutor] problem with creating paths

2018-10-17 Thread Shall, Sydney via Tutor

Thanks for this. Most helpful.
Sydney





Re: [Tutor] problem with creating paths

2018-10-17 Thread Mark Lawrence

On 17/10/18 18:18, Mats Wichmann wrote:
> On 10/17/2018 10:07 AM, Shall, Sydney via Tutor wrote:
> > Firstly, I would like to thank Steven for reminding me of the assert
> > statement. I should have remembered this. It allowed me to isolate the
> > problem, which predictably (for me) was very elementary. I am too
> > embarrassed to say how simple the error was.
> >
> > However, my original problem was not solved by correcting this error.
> >
> > So, I will now try and narrow down the location of the problem and then
> > if I cannot solve it, I shall return for more good advice.
> >
> > Many thanks to Steven and to Peter.
>
> I'll weigh in with a mini- (and unasked-for-) lecture here:
>
> this is often the point at which someone says "boy, I wish Python were
> strongly typed, so things didn't change types in flight".


This is where I say Python *IS* strongly, although dynamically, typed. 
Why do people have such a problem with this?
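
A short sketch of the distinction, using made-up variable names: "dynamic" means a name may be rebound to objects of different types, while "strong" means the objects themselves are not silently coerced.

```python
value = 1        # the name refers to an int
value = "one"    # dynamic typing: the same name may later refer to a str

try:
    result = "1" + 1      # strong typing: no implicit str/int coercion
except TypeError:
    result = "TypeError"

print(type(value).__name__)   # str
print(result)                 # TypeError
```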


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



[Tutor] Performance Issue

2018-10-17 Thread Stephen Smith
I have written a screen scraping program that watches a clock (on the app's
server) and at 7:00:00 AM dashes to make a reservation online. It works
fine. However, I have spent time trying to improve its performance. I am
using selenium, with chrome driver.

Here is what I have learned. I have tried various methods to find (by
link_text, by_xpath, etc.) and click on the element in question (shown
below). When I find the element with no click, the find process takes about
0.02 seconds. When I find it with a click (I need to select the element and
move to the next iframe) it takes over a second. I get these same results no
matter which find_element_by variation I use, and I get the same times in
headless or normal mode.

Here is my theory: finding the element is relatively simple in the html
already loaded into my machine, hence 0.02 seconds. However, when I click on
the element, processing goes out to the server which does some stuff and I
get a new iframe displayed, all of which takes time. So I have sort of
concluded that perhaps I can't take a big chunk of that time out (literally
the same statement without the click option takes 2% of the time), but am
hoping perhaps someone has another thought. I had thought maybe I could jump
to the second iframe locally, but can't see a way to do this. I also have
considered something other than selenium, but since I think the problem lies
on the server side, I am not sure it is worth the time.

Thanks in advance for any ideas.

The program is quite large, but here is the relevant section:

# Back from NAP - REALLY CLOSE to 7:00 AM: select the date desired and
# go to the next page (really iframe) - using prepared xpath
try:
    br.find_element_by_link_text(str(day_to_book)).click()
    # sleep(refresh_factor)
except NoSuchElementException:
    self.queue.put("- (" + thread + ") Attempted date selection too early? "
                   + str(datetime.datetime.now()
                         + datetime.timedelta(seconds=second_difference))[-11:-4])
    return

 

Here is the relevant html (in this case I have copied the html for the 31st
of the month, but all dates look the same, which is why
find_element_by_link_text (with day_to_book = 31) is easy to use). Again, my
code works fine - I am trying to see if there is a way to improve
performance with some trick I can't come up with.



Re: [Tutor] Performance Issue

2018-10-17 Thread Alan Gauld via Tutor
On 17/10/18 22:25, Stephen Smith wrote:
> I have written a screen scraping program that watches a clock (on the app's
> server) and at 7:00:00 AM dashes to make a reservation on line. It works
> fine. However, i have spent time trying to improve its performance. I am
> using selenium, with chrome driver. 

When doing performance tuning the first thing to answer
is what does improved performance mean. For example in a Word Processor
improving the speed that an input character appears on screen by 10% is
unlikely to be a worthwhile exercise. But improving the time taken to do
a global search/replace by 10% might well be worthwhile.

So what do you want to improve about an app that spends most
of its time waiting for a change on a remote server (presumably
by polling?) Is it the speed/frequency of polling? The speed of reading
the response? The speed of processing the response?

And knowing what you want to improve, have you measured it to
see where the time is being spent? Is it in the client request? The
transmission to the server? The server processing? The transmission
from the server? The reading of that response? Or the processing
of that response? You need to time each of those phases accurately
to find out which bits are worth improving.
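
One way to time separate phases is a small context manager around time.perf_counter(). The sketch below is illustrative only: find_element and click_element are made-up stand-ins that just sleep, not Selenium calls, and the sleep durations are arbitrary.

```python
import time
from contextlib import contextmanager

timings = {}   # label -> elapsed seconds


@contextmanager
def timed(label):
    """Record how long the enclosed block takes under the given label."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = time.perf_counter() - start


def find_element():
    # Stand-in for a lookup in the already-loaded page.
    time.sleep(0.01)


def click_element():
    # Stand-in for a click that waits on a server round trip.
    time.sleep(0.05)


with timed("find"):
    find_element()
with timed("click"):
    click_element()

for label, seconds in timings.items():
    print("%-6s %.3f s" % (label, seconds))
```

In the real program each "with timed(...)" block would wrap one phase (the find, the click, the wait for the new iframe), so the slowest phase shows up in the numbers rather than being guessed at.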

> Here is what i have learned. I have tried various methods to find (by
> link_text, by_xpath, etc.) and click on the element in question (shown
> below). When i find the element with no click, the find process takes about
> .02 seconds. When i find it with a click (i need to select the element and
> move to the next iframe) it takes over a second. I get these same results no
> matter which find_element_by variation i use and i get the same times in
> headless or normal mode.
> 
> Here is my theory - finding the element is relatively simple in the html
> already loaded into my machine - hence .02 seconds. However, when i click on
> the element, processing goes out to the server which does some stuff and i
> get a new iframe displayed, all of which takes time. 

Absolutely. Network access is likely to be measured in 10ths of a
second rather than hundredths. And processing the request may
well entail a server database call (which may itself be on a separate
machine from the web server with a corresponding LAN message delay),
then there's the creation and transmission of the HTML (unless your
server provides an API with JSON responses - but then you don't
need clicks etc!) And iframes make that worse since every iframe
effectively gets treated as a separate html document.

Then when your client receives the data it has to reparse
the html into a document structure before performing the search.


> concluded that perhaps I can't take a big chunk of that time out

You probably can, but only if you have access to the server
code and the network infrastructure and deep enough pockets
for a server upgrade or a new proxy server. Assuming that's
not the case then no, you need to look at other options.

But your first step has to be to measure the various stages
of the request. If the problem lies in the transmission
time across the network there is probably not much you
can do. If it's in the database access (trickier to measure
if you don't have the server code - you need to create
some simultaneous equations using multiple test scenarios)
then you might be able to construct better queries (eg look
at a different page or only query the target iframe).

> considered something other than selenium, but since i think the problem lies
> on the server side, not sure it is worth the time.

It depends on the nature of the page. The best solution,
by far, is not to do web scraping. It's always the worst
case solution and to be avoided if at all possible. Try
to find an API with JSON or XML responses.

Also, are you sure you need to use the clock on the page?
Isn't the server clock adequate? In which case the
response time should be in every message header so there's
no need for web scraping at all...
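
As a rough sketch of the header idea, HTTP responses normally carry a Date header that the standard library can parse. The header value and local time below are made-up samples, and server_clock_offset is a hypothetical helper; whether this particular server's Date header tracks the clock shown on the page is an assumption that would need checking.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def server_clock_offset(date_header, local_now):
    """Return server time minus local time as a timedelta."""
    server_now = parsedate_to_datetime(date_header)
    return server_now - local_now


# Made-up sample values for illustration.
header = "Wed, 17 Oct 2018 06:59:58 GMT"
local_now = datetime(2018, 10, 17, 6, 59, 55, tzinfo=timezone.utc)

offset = server_clock_offset(header, local_now)
print(offset.total_seconds())   # 3.0
```

The Date header has one-second resolution, so it can only estimate the offset to within about a second.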

Finally, I think there is an active Selenium discussion
forum so you could try there for more ideas.

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos

