Re: [Tutor] Performance Issue

2018-10-18 Thread Alan Gauld via Tutor
   Cc'ing list. Please use reply all on responses to tutor.
   If you have no control over the server, e.g. access to logs etc., then the
   best you can do is record the time just before sending the request and
   immediately after you get the reply. That part is outside your control. If
   the remaining time is worth optimising then look to your code.
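
A minimal sketch of that client-side measurement, assuming the request can be wrapped in an ordinary Python callable. The names timed_call and fake_request are hypothetical, and fake_request only sleeps to stand in for the real network call:

```python
import time


def timed_call(func, *args, **kwargs):
    """Run func and return (result, elapsed_seconds) as seen by the client."""
    start = time.perf_counter()             # just before sending the request
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start   # immediately after the reply
    return result, elapsed


def fake_request():
    # Hypothetical stand-in for the real request; sleeps to mimic latency.
    time.sleep(0.05)
    return "reply"


reply, round_trip = timed_call(fake_request)
print(f"round trip: {round_trip:.3f} s")
```

The elapsed figure covers the whole round trip (network plus server), so it can only be compared against the time spent in your own code, not split further from the client side alone.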
   As to the server time, it should be in the http headers so you don't need
   to parse the html, just read the headers. Much faster.
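
As an illustration of reading the header instead of the page, the sketch below parses a standard HTTP Date header with the standard library. The function names and the sample header value are made up for the example, and it assumes the server actually sends a Date header. Note that this header only has one-second resolution:

```python
import urllib.request
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def server_time_from_headers(headers):
    """Return the server's clock as a datetime, taken from the Date header."""
    return parsedate_to_datetime(headers["Date"])


def fetch_server_time(url):
    """HEAD request: headers only, no HTML body to download or parse."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=5) as response:
        return server_time_from_headers(response.headers)


# Illustrative header value, not captured from the real server.
sample_headers = {"Date": "Thu, 18 Oct 2018 11:00:00 GMT"}
server_now = server_time_from_headers(sample_headers)
print(server_now.isoformat())
```

fetch_server_time is defined but not called here because it needs a real URL.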
   HTH,
   Alan g.
   On 18 Oct 2018 10:40 am, User2002 wrote:

 Thank you for your thoughtful reply. There are some good ideas in there for
 me.

 I have asked and there either is no API or they do not want outsiders to
 have access to it, so I think that is a dead end.

 For my purposes, improved performance focuses on significantly reducing the
 time required to successfully execute the
 br.find_element_by_link_text(str(day_to_book)).click() command. If my
 overall time to successfully book a reservation is 1.8 seconds and roughly 2
 seconds is spent on this single instruction, then a .5 second improvement
 represents a 25% reduction. I have a second command (same type, just on the
 next iframe) that is similarly slow. So fixing both could represent a 50%
 reduction. Given the demand for the reservations (there are literally
 hundreds of people out there pounding their keyboard/clicking on fields)
 every second counts. (With an automated ability to book a reservation, I am
 probably faster than anyone's ability to click on a field, wait for a reply,
 reposition the cursor, click again, etc., but I am at the point where this
 has become something I am fully invested in and would like to take as far as
 I can.)

 Most of your ideas center around the notion of knowing more about where the
 time delay occurs in the processing steps that occur outside of my world -
 communication back and forth to the server, etc. I must confess, I have no
 idea of how to do this. How can I measure what goes on outside my machine
 and measure the component parts? If you have an idea in this area or could
 refer me to where I could go to read and learn, I'd be very grateful.
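
One possible starting point, offered as a sketch rather than a tested recipe: browsers record per-phase timestamps for each page load through the Navigation Timing API (window.performance.timing). With Selenium these could probably be fetched with something like br.execute_script("return window.performance.timing.toJSON()"), though that exact call is an assumption and has not been tried against this site. The helper below only does the arithmetic on such a dictionary; phase_durations and the sample numbers are invented for illustration:

```python
def phase_durations(timing):
    """Split one page load into phases (seconds) from Navigation Timing
    marks, which are millisecond timestamps recorded by the browser."""
    phases = {
        "connect": ("connectStart", "connectEnd"),
        # Request sent until the first byte of the reply arrives:
        # network transit plus server processing, not separable here.
        "wait_for_first_byte": ("requestStart", "responseStart"),
        "download": ("responseStart", "responseEnd"),
        "dom_processing": ("responseEnd", "domComplete"),
    }
    return {
        name: (timing[end] - timing[start]) / 1000.0
        for name, (start, end) in phases.items()
    }


# Made-up timestamps for illustration only.
sample_timing = {
    "connectStart": 1000, "connectEnd": 1040,
    "requestStart": 1045, "responseStart": 1745,
    "responseEnd": 1795, "domComplete": 2095,
}

breakdown = phase_durations(sample_timing)
for name, seconds in breakdown.items():
    print(f"{name:>20}: {seconds:.3f} s")
```

If most of the time falls in wait_for_first_byte, the delay is on the network or the server and is outside the client's control; if it falls in dom_processing, the browser side may be worth a closer look.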

 Finally, regarding your notion of web scraping, server clock, etc. Literally
 the only thing I 'scrape' is the server time to ensure I click on the date
 field at exactly 7:00:00. Once I get to that point, I click on a date field,
 then I click on a time field and I am done - no scraping occurs once it
 reaches 7:00:00. So I am not sure there are improvements to be made in that
 area.

 -Original Message-
 From: Tutor On Behalf Of Alan Gauld via Tutor
 Sent: Wednesday, October 17, 2018 8:12 PM
 To: tutor@python.org
 Subject: Re: [Tutor] Performance Issue

 On 17/10/18 22:25, Stephen Smith wrote:
 > I have written a screen scraping program that watches a clock (on the
 > app's server) and at 7:00:00 AM dashes to make a reservation online. It
 > works fine. However, I have spent time trying to improve its
 > performance. I am using Selenium, with ChromeDriver.

 When doing performance tuning the first thing to answer is what improved
 performance means. For example in a Word Processor improving the speed that
 an input character appears on screen by 10% is unlikely to be a worthwhile
 exercise. But improving the time taken to do a global search/replace by 10%
 might well be worthwhile.

 So what do you want to improve about an app that spends most of its time
 waiting for a change on a remote server (presumably by polling)? Is it the
 speed/frequency of polling? The speed of reading the response? The speed of
 processing the response?

 And knowing what you want to improve, have you measured it to see where the
 time is being spent? Is it in the client request? The transmission to the
 server? The server processing? The transmission from the server? The reading
 of that response? Or the processing of that response? You need to time each
 of those phases accurately to find out which bits are worth improving.
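
A rough sketch of timing labelled phases separately, here applied to the find and the click as two distinct steps. The phase helper, the timings dictionary, and the fake_find/fake_click stand-ins are all hypothetical; in the real script they would wrap the actual Selenium calls:

```python
import time
from contextlib import contextmanager

timings = {}


@contextmanager
def phase(name):
    """Record how long the enclosed block takes under the given label."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name] = time.perf_counter() - start


def fake_find():
    # Stand-in for locating the element in the already loaded page.
    time.sleep(0.01)
    return "element"


def fake_click(element):
    # Stand-in for the click, which may trigger a server round trip.
    time.sleep(0.03)


with phase("find"):
    element = fake_find()
with phase("click"):
    fake_click(element)

for name, seconds in timings.items():
    print(f"{name:>6}: {seconds:.3f} s")
```

Keeping find and click in separate phases shows which of the two actually dominates, rather than timing the chained call as a single unit.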

 > Here is what I have learned. I have tried various methods to find (by
 > link_text, by_xpath, etc.) and click on the element in question (shown
 > below). When I find the element with no click, the find process takes
 > about .02 seconds. When I find it with a click (I need to select the
 > element and move to the next iframe) it takes over a second. I get these
 > same results no matter which find_element_by variation I use and I get
 > the same times in headless or normal mode.
 >
 > Here is my theory - finding the element is relatively simple in the
 > html alread

Re: [Tutor] Performance Issue

2018-10-18 Thread User2002
Thanks to all for your indulgence and help…

From: Alan Gauld
Sent: Thursday, October 18, 2018 6:53 AM
To: User2002
Cc: tutor@python.org
Subject: RE: [Tutor] Performance Issue

Cc'ing list. Please use reply all on responses to tutor.

If you have no control over the server, e.g. access to logs etc., then the best
you can do is record the time just before sending the request and immediately
after you get the reply. That part is outside your control. If the remaining
time is worth optimising then look to your code.

As to the server time, it should be in the http headers so you don't need to
parse the html, just read the headers. Much faster.

HTH,

Alan g.

On 18 Oct 2018 10:40 am, User2002 <user2...@comcast.net> wrote:

Thank you for your thoughtful reply. There are some good ideas in there for 
me. 

I have asked and there either is no API or they do not want outsiders to 
have access to it, so I think that is a dead end. 

For my purposes, improved performance focuses on significantly reducing the 
time required to successfully execute the 
br.find_element_by_link_text(str(day_to_book)).click() command. If my 
overall time to successfully book a reservation is 1.8 seconds and roughly 2 
seconds is spent on this single instruction, then a .5 second improvement 
represents a 25% reduction. I have a second command (same type, just on the 
next iframe) that is similarly slow. So fixing both could represent a 50% 
reduction. Given the demand for the reservations (there are literally 
hundreds of people out there pounding their keyboard/clicking on fields) 
every second counts. (With an automated ability to book a reservation, I am 
probably faster than anyone's ability to click on a field, wait for a reply, 
reposition the cursor, click again, etc., but I am at the point where this has 
become something I am fully invested in and would like to take as far as I 
can.) 

Most of your ideas center around the notion of knowing more about where the 
time delay occurs in the processing steps that occur outside of my world - 
communication back and forth to the server, etc. I must confess, I have no 
idea of how to do this. How can I measure what goes on outside my machine 
and measure the component parts? If you have an idea in this area or could 
refer me to where I could go to read and learn, I'd be very grateful. 

Finally, regarding your notion of web scraping, server clock, etc. Literally 
the only thing I 'scrape' is the server time to ensure I click on the date 
field at exactly 7:00:00. Once I get to that point, I click on a date field, 
then I click on a time field and I am done - no scraping occurs once it 
reaches 7:00:00. So I am not sure there are improvements to be made in that 
area. 

-Original Message- 
From: Tutor <tutor-bounces+user2002=comcast@python.org> On Behalf Of 
Alan Gauld via Tutor 
Sent: Wednesday, October 17, 2018 8:12 PM 
To: tutor@python.org   
Subject: Re: [Tutor] Performance Issue 

On 17/10/18 22:25, Stephen Smith wrote: 
> I have written a screen scraping program that watches a clock (on the 
> app's server) and at 7:00:00 AM dashes to make a reservation online. It 
> works fine. However, I have spent time trying to improve its 
> performance. I am using Selenium, with ChromeDriver. 

When doing performance tuning the first thing to answer is what 
improved performance means. For example in a Word Processor improving the 
speed that an input character appears on screen by 10% is unlikely to be a 
worthwhile exercise. But improving the time taken to do a global 
search/replace by 10% might well be worthwhile. 

So what do you want to improve about an app that spends most of its time 
waiting for a change on a remote server (presumably by polling)? Is it the 
speed/frequency of polling? The speed of reading the response? The speed of 
processing the response? 

And knowing what you want to improve have you measured it to see where the 
time is being spent? Is it in the client request? The transmission to the 
server? The server processing? The transmission from the server? The reading 
of that response? Or the processing of that response? You need to time each 
of those phases accurately to find out which bits are worth improving. 

> Here is what I have learned. I have tried various methods to find (by 
> link_text, by_xpath, etc.) and click on the element in question (shown 
> below). When I find the element with no click, the find process takes 
> about .02 seconds. When I find it with a click (I need to select the 
> element and move to the next iframe) it takes over a second. I get these 
> same results no matter which find_element_by variation I use and I get 
> the same times in headless or normal mode. 
> 
> Here is my theory - finding the element is relatively simple in the 
> html already loaded into my machine - hence .02 seconds. However, when 
> I click on the element, processing goes out