Re: [R] Reducing the size of a large script top speed onset of execution

Dennis Fisher Sat, 09 Jan 2010 09:37:35 -0800

Professor Ripley,

Thanks for your suggestions.  I will look into the package approach.

As far as the "source" speed issue, you suggested that the problem may relate to guessing encodings so I added:

        options(encoding="UTF-8")

at the beginning of the code (was this the correct approach to the problem?). That did not make any obvious difference to the duration to source the script. Do you have any specific suggestions that might speed the process?
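
For anyone comparing timings, here is a minimal sketch of how the effect of such settings could be measured. The two-line script and the names script, env and timing are placeholders, not the actual code. keep.source = FALSE is an extra setting to try, prompted by the remark that source() keeps references to the original sources; it may or may not make a difference here.

```r
# Placeholder script standing in for the real 18000-line file
script <- tempfile(fileext = ".R")
writeLines(c("f <- function(x) x + 1",
             "g <- function(x) f(x) * 2"), script)

env <- new.env()
# Time source() with the encoding stated explicitly and source references off
timing <- system.time(
  source(script, local = env, encoding = "UTF-8", keep.source = FALSE)
)
print(timing)
```

Running the same call with and without each argument on the real file would show which setting, if any, matters.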


Dennis

Dennis Fisher MD
P < (The "P Less Than" Company)
Phone: 1-866-PLessThan (1-866-753-7784)
Fax: 1-866-PLessThan (1-866-753-7784)
www.PLessThan.com

On Jan 9, 2010, at 8:53 AM, Prof Brian Ripley wrote:

Please just make a package; then all the effort of parsing the code is done at install time, you can use lazy-loading .... Or if you are for some reason averse to that, source the code into an environment, save that and simply attach() its save file next time.
Packages of that size load in a few milliseconds (as you see each time you start R: stats is 27000 lines).
source() does more work: guessing encodings, keeping references to the original sources, backing out code if the whole script does not parse ....
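
A minimal sketch of the second suggestion above (source into an environment, save it, attach() the save file later). The file names, the search-path name "cached_script" and the two toy functions are placeholders for the real script.

```r
# Placeholder script standing in for the real code
script <- tempfile(fileext = ".R")
writeLines(c("f <- function(x) x + 1",
             "g <- function(x) f(x) * 2"), script)

# One-off step: evaluate the script into its own environment, then save it
env <- new.env()
sys.source(script, envir = env)
cache <- tempfile(fileext = ".RData")
save(list = ls(env, all.names = TRUE), envir = env, file = cache)

# Later sessions: skip source() and attach the save file directly
attach(cache, name = "cached_script")
result <- g(1)   # g is now found on the search path
```

The save file has to be regenerated whenever the script changes, which is the bookkeeping a package install would otherwise do.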
On Sat, 9 Jan 2010, Dennis Fisher wrote:
Colleagues,

(R 2.10 on all platforms)
I have a lengthy script (18000 lines) that runs within a graphical interface. The script consists of hundreds of functions followed by a single command that calls these functions (execution depends on a number of environment variables passed to the script). As a result, nothing is executed until the final line of code is read. It takes 15-20 seconds to load the code - I would like to speed that process. Two questions:
1. The code contains numerous large blocks that are executed under only one set of conditions (which are known when the code is called). For example, there might be code such as:
        if (CONDITION)
                {
                ... (hundreds of lines of code, including embedded curly brackets)
                } else invisible()
        if (!CONDITION)
                {
                ... (hundreds of lines of code, including embedded curly brackets)
                }
I assume that I could speed loading appreciably if I set up two scripts, each of which excluded "irrelevant" code depending on the CONDITION. For example, if I knew that CONDITION was false, I would exclude the first block of code above; conversely, if I knew that CONDITION was true, I would exclude the second block.
I would like to write code in R (or in sed [UNIX stream editor]) to create these two new scripts. However, the regular expressions that would be needed are beyond me and I would appreciate help from this forum. Specifically, I would like to search for:
        if (CONDITION or
        if (!CONDITION
as the start of the block and
        } - the matching curly bracket
at the end of the block, then remove those lines from the code. These text entries are always on a line by themselves. Finding the "if (CONDITION" line should be relatively easy. The difficulty for me is identifying the matching curly bracket - there are often paired brackets within the block of code:
        if (CONDITION)
                {
                ...
                if (SOMETHINGELSE)      {       }
                if (YETANOTHER)
                        {
                        }
                }                               <-  this is the bracket that I need to match
There are also instances in which the entire block occurs on one line:
        if (CONDITION)  { ...} else invisible()
or
        if (CONDITION) ... else invisible()
Of note, I can remove the "else invisible()" statements if they are problematic to a solution.
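
Matching a nested bracket is awkward with a regular expression alone; counting brackets line by line is one alternative. The sketch below is only an illustration under stated assumptions: strip_blocks is a hypothetical helper, each block starts on its own line, curly brackets never appear inside strings or comments, and the only else branch is "else invisible()" on the closing line. Note that the pattern is a prefix match, so "CONDITION" would also match a longer name such as CONDITION2.

```r
# Hypothetical helper: drop every block that begins with "if (<condition>"
strip_blocks <- function(lines, condition) {
  # \\Q...\\E makes the condition text literal (so "!" needs no escaping)
  start_pat <- paste0("^\\s*if \\(\\Q", condition, "\\E")
  count_char <- function(ch, x) sum(strsplit(x, "", fixed = TRUE)[[1]] == ch)
  n <- length(lines)
  keep <- rep(TRUE, n)
  i <- 1
  while (i <= n) {
    if (grepl(start_pat, lines[i], perl = TRUE)) {
      j <- i
      depth <- 0
      opened <- FALSE
      repeat {
        opens <- count_char("{", lines[j])
        depth <- depth + opens - count_char("}", lines[j])
        if (opens > 0) opened <- TRUE
        if (opened && depth <= 0) break   # matching bracket found
        if (j == n) break                 # unbalanced input: stop at end of file
        # No bracket on the "if" line and none opening the next line:
        # treat it as a one-line statement
        if (!opened && !grepl("^\\s*\\{", lines[j + 1])) break
        j <- j + 1
      }
      keep[i:j] <- FALSE
      i <- j
    }
    i <- i + 1
  }
  lines[keep]
}

# Toy input mirroring the layout described above
src <- c(
  "a <- 1",
  "if (CONDITION)",
  "        {",
  "        if (SOMETHINGELSE)      {       }",
  "        if (YETANOTHER)",
  "                {",
  "                }",
  "        } else invisible()",
  "if (!CONDITION)",
  "        {",
  "        b <- 2",
  "        }",
  "c <- 3"
)
when_false <- strip_blocks(src, "CONDITION")    # script for CONDITION false
when_true  <- strip_blocks(src, "!CONDITION")   # script for CONDITION true
```

For a real file, the lines would come from readLines() and the result would go to writeLines(); the file names are left to the reader.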
2. A related issue regards loading in the graphical interface vs. loading at the command line (OS X). The graphical interface loads in 15-20 seconds - the graphical interface is sending code as rapidly as it can. In contrast, at the command line, the code is source()'d and it takes 30-40 seconds. I would have expected the latter approach to be as fast or faster because R would accept code as fast as it could.
Does anyone have an explanation for this behavior? Also, any ideas as to how to speed the process at the command line would be appreciated. Thanks for any suggestions.
Dennis





______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

