[Rd] How to test impact of candidate changes to package?
I use a package to contain simple functions that can be handled by unit tests for correctness, and more complex functions that combine the simple functions with business logic.

Where there are proposals to change either the simple functions or the business logic, a sample needs to be run before the change and again after it to understand the impact of the change.

I currently do this by:

1. Using Rmarkdown documents
2. Loading the package as-is
3. Getting my sample
4. Running my sample through the package as-is and outputting a table of results
5. Sourcing new copies of functions
6. Running my sample again and outputting a table of results
7. Reloading the package and sourcing different copies of functions as required

I really don't think this is a good way to do this, as it risks missing downstream dependencies of the functions I'm trying to load into the global namespace to test.

Has anyone else had to do this sort of testing before on their packages? How did you do it? Am I missing an obvious package / framework that can do this?

Cheers,
Steph

--
Stephanie Locke
BI & Credit Risk Analyst

The contents of this e-mail and of any attachments transmitted with it are confidential and intended solely for the use of the individual(s) to whom they are addressed. If you are not an intended recipient or the person responsible for delivering this e-mail to the intended recipient(s), you have received this e-mail in error, and access to it by you is not authorised. You may not use, copy, distribute, disclose or rely on it or any attachment, or any part of it or any attachment in any way. If you have received this e-mail in error please notify Optimum Credit Ltd by email on webmas...@optimumcredit.co.uk or phone on 03330 143125 and then delete it. All reasonable precautions have been taken to ensure that no viruses are present in E-mail.
As we cannot accept responsibility for loss or damage arising from the use of E-mail or any attachment we recommend you subject these to your usual virus checking procedures prior to use. Any opinions expressed in this e-mail are those of the person sending it and not necessarily those of Optimum Credit Ltd. Optimum Credit Ltd, Haywood House South, Dumfries Place, Cardiff, CF10 3GA Authorised and regulated by the Financial Conduct Authority Optimum Credit Ltd. Registered office: Haywood House South, Dumfries Place, Cardiff, CF10 3GA. Registered in England and Wales No. 08698121 Calls may be monitored or recorded for training, compliance and evidential purposes. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
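The before/after comparison described in the numbered steps can be sketched in a few lines of base R. This is only an illustration under stated assumptions: `score_current()`, `score_candidate()`, the thresholds, and the sample data are all hypothetical stand-ins, not code from the poster's package.

```r
# Hypothetical "current" and "candidate" versions of one business rule.
# Neither function comes from the thread; they only stand in for package code.
score_current   <- function(income, debt) ifelse(debt / income < 0.4, "accept", "refer")
score_candidate <- function(income, debt) ifelse(debt / income < 0.3, "accept", "refer")

# A small made-up sample; in practice this would be drawn from real cases.
sample_cases <- data.frame(
  id     = 1:5,
  income = c(30000, 45000, 52000, 28000, 61000),
  debt   = c( 6000, 16000, 26000,  9000, 12000)
)

# Run the same sample through both versions and keep the results side by side.
impact <- data.frame(
  id     = sample_cases$id,
  before = score_current(sample_cases$income, sample_cases$debt),
  after  = score_candidate(sample_cases$income, sample_cases$debt),
  stringsAsFactors = FALSE
)
impact$changed <- impact$before != impact$after

# Cross-tabulate outcomes before and after the change.
impact_table <- table(before = impact$before, after = impact$after)
print(impact_table)
```

Keeping both result columns in one table makes the impact (which cases flip, and in which direction) visible without re-running anything.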
Re: [Rd] How to test impact of candidate changes to package?
Dear Stephanie,

Have a look at the testthat package and the related article in the R Journal.

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25 1070 Anderlecht Belgium
+ 32 2 525 02 51
+ 32 54 43 61 85
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey

The views expressed in this message and any annex are purely those of the writer and may not be regarded as stating an official position of INBO, as long as the message is not confirmed by a duly signed document.

-Original message-
From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] On behalf of Stephanie Locke
Sent: Wednesday 10 September 2014 9:55
To: r-devel@r-project.org
Subject: [Rd] How to test impact of candidate changes to package?
Re: [Rd] How to test impact of candidate changes to package?
I have unit tests using testthat, but these are typically of these types:

1) Check for correct calculation for a single set of valid inputs
2) Check for correct calculation for a larger set of valid inputs
3) Check for errors when providing incorrect inputs
4) Check for known frailties / past issues

This is more for where changes are needed to functions that apply various bits of business logic that can change over time, so there is no "one answer". A unit test (at least as I understand it) can be worked through to make sure that, for given inputs, the output is computationally correct. What I'd like to do is assess the overall impact of a potential change by running a sample through version 1 of a function in a package, then running the sample through version 2 of the function and comparing the results.

The difficulty I've encountered so far is that I'm reluctant to make this change invasively by manually overwriting the relevant files in the R directory and then, say, using devtools to load it and testthat to test it, as I risk producing incorrect states of my package and potentially releasing the wrong thing. My preference would be a non-invasive method.

Currently, where I'm trying to do this non-invasively, I source a new version of the function stored in a separate directory, but some of the functions that depend on it continue to reference the package version. This means that when I'm doing test #2 I have to load lots more functions and hope I've caught them all (or do some sort of dependency hunting programmatically).

I may be missing something about testthat, but what I'm doing now seems to be nowhere near optimal and I'd love to have a better solution.

Cheers
Stephanie Locke
BI & Credit Risk Analyst

-Original Message-
From: ONKELINX, Thierry [mailto:thierry.onkel...@inbo.be]
Sent: 10 September 2014 09:30
To: Stephanie Locke; r-devel@r-project.org
Subject: RE: How to test impact of candidate changes to package?
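The dependency problem described in this message comes from how R looks up function names: a function defined in a package finds its helpers in the package namespace before it ever looks in the workspace. The sketch below imitates that with a plain environment standing in for the namespace; `pkg`, `rate()`, and `price()` are hypothetical names, and copying an environment is only one possible non-invasive approach, not something proposed in the thread.

```r
# Stand-in for a package namespace: an environment holding two functions,
# where price() depends on rate(). All names here are hypothetical.
pkg <- new.env()
evalq({
  rate  <- function() 0.10
  price <- function(amount) amount * (1 + rate())
}, pkg)

# Naive approach: define a new rate() in the workspace.
# pkg$price() still finds the original rate() in its own environment.
rate <- function() 0.20
naive_result <- pkg$price(100)

# Non-invasive alternative: copy every object into a fresh environment,
# re-home the functions there, then override only the function under test.
candidate <- new.env(parent = parent.env(pkg))
for (nm in ls(pkg)) {
  f <- get(nm, envir = pkg)
  if (is.function(f)) environment(f) <- candidate
  assign(nm, f, envir = candidate)
}
assign("rate", function() 0.20, envir = candidate)

# The original environment is untouched; the copy reflects the change
# in every function that depends on rate().
before <- pkg$price(100)
after  <- candidate$price(100)
```

Because the dependent functions are re-homed in the copy, there is no need to hunt down and re-source each one by hand.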
Re: [Rd] How to test impact of candidate changes to package?
If you don't intend to keep the old business logic in the long run, perhaps a version control system such as Git can help you. If you use it in single-user mode, you can think of it as a backup system where you manually create each snapshot and give it a name, but it actually can do much more.

For your use case, you can open a new *branch* where you implement your changes, and implement your testing logic simultaneously in both branches (using *merge* operations). The system handles switching between branches, so you can really perform invasive changes, and revert if you find that a particular change breaks something.

RStudio has Git support, but you probably need to use the shell to create a branch. On Windows or OS X the GitHub client helps you to get started.

Cheers
Kirill
[Rd] CRAN form submission confirmation link
There is a small problem in the CRAN submission form, which is not super urgent but probably good to be aware of.

I noticed that after I submitted a package, the submission was confirmed without me actually clicking the link in the confirmation email (which could be a potential security risk). I suspect that this happens because many modern browsers use pre-rendering, which retrieves hyperlinks on a page before the user actually clicks on them. This is perfectly legal because the HTTP GET method [1] is defined to be "safe" and "idempotent", and therefore a GET request should never change server state. This is where the current implementation of the confirmation page might violate HTTP.

I think the proper way to implement this would be for the link in the confirmation email to lead to a page where the user has to click a button, which results in a POST request that confirms the submission.

[1] http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html
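The GET/POST distinction proposed here can be illustrated with a toy request handler. This is a deliberately simplified model written for this note: `handle()` and `new_state()` are hypothetical, no real web server is involved, and nothing here reflects how the CRAN form is actually implemented.

```r
# Toy model of a confirmation endpoint; not CRAN's actual implementation.
new_state <- function() list(confirmed = FALSE)

handle <- function(method, state) {
  if (method == "GET") {
    # Safe: only render a page with a confirm button; state is untouched.
    list(state = state, body = "Press the button to confirm your submission")
  } else if (method == "POST") {
    # Only an explicit POST changes server state.
    state$confirmed <- TRUE
    list(state = state, body = "Submission confirmed")
  } else {
    stop("unsupported method: ", method)
  }
}

state <- new_state()

# A pre-rendering browser may issue GET any number of times without effect.
for (i in 1:3) state <- handle("GET", state)$state
prefetch_confirmed <- state$confirmed

# Pressing the button sends the POST, which performs the confirmation.
state <- handle("POST", state)$state
```

With this split, repeated or speculative GET requests are harmless, which is the property the HTTP specification expects of that method.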
Re: [Rd] How to test impact of candidate changes to package?
On 09/10/2014 06:12 AM, Kirill Müller wrote:
> If you don't intend to keep the old business logic in the long run, perhaps a version control system such as Git can help you. If you use it in single-user mode, you can think of it as a backup system where you manually create each snapshot and give it a name, but it actually can do much more. For your use case, you can open a new *branch* where you implement your changes, and implement your testing logic simultaneously in both branches (using *merge* operations). The system handles switching between branches, so you can really perform invasive changes, and revert if you find that a particular change breaks something.
> ...

Yes, I would strongly recommend some version control system for this, probably either Git or svn (Subversion). If this is all code and test data that you can release publicly then you might choose some public repository like GitHub or R-forge. (You will get lots of opinions about the relative merits of different repositories if you ask, but the main point is that any one of them will be better than nothing.) If part of your code and data cannot be released then you might check if something is already supported in your place of business. Chances are that it is, but only programmers in IT have been told about it.

On 09/10/2014 11:14 AM, Stephanie Locke wrote:
>> ... Has anyone else had to do this sort of testing before on their packages? How did you do it? Am I missing an obvious package / framework that can do this?

Most package maintainers would face some version of this problem, some simpler and some much more complicated. If you set up the tests as scripts in the package tests/ directory that issue stop() in the case of a problem, then R-forge pretty much does the checking for you on multiple platforms, at least when it is working properly.

It is probably more trouble than it is worth for a single package, but if you have several packages with inter-dependencies then you might want to look at the develMake framework at http://automater.r-forge.r-project.org/

Regards,
Paul
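A script of the kind mentioned above, placed in a package's tests/ directory and stopping on a problem, might look like the following sketch. `monthly_payment()` and the reference value are hypothetical examples made up for this note; in a real package the function would come from the package itself rather than being defined in the test script.

```r
# Sketch of a script that could live in a package's tests/ directory.
# R CMD check runs such scripts and reports a failure if one stops with an error.
# monthly_payment() is a hypothetical function standing in for package code.
monthly_payment <- function(principal, annual_rate, months) {
  r <- annual_rate / 12
  principal * r / (1 - (1 + r)^(-months))
}

result <- monthly_payment(10000, 0.06, 12)

# Fail loudly if the value drifts from a previously accepted reference.
reference <- 860.66
if (abs(result - reference) > 0.01) {
  stop("monthly_payment() changed: got ", round(result, 2),
       ", expected ", reference)
}
```

The script produces no output when the result matches the reference, and an error (and therefore a failed check) when it does not.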
Re: [Rd] Problem with order() and I()
Early on I had been wondering if deprecating I() and the AsIs class would be a way to get the problem to go away. I imagine (based on no data at all!) that they are rarely used. If I were writing the same code today, I would use options(stringsAsFactors=FALSE) instead of sprinkling I() here and there throughout my scripts.

But I see from the discussions that there’s something deeper going on.

Thanks for continuing to cc me; I find it interesting.

-Don

--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062

On 9/9/14, 9:35 AM, "Martin Maechler" wrote:

>> peter dalgaard
>> on Tue, 9 Sep 2014 16:36:19 +0200 writes:
>
>> It's actually a little more complicated. I wrote a note, but it seems to be stuck in the outbox on my home machine (I probably forgot to click Send...).
>> One important aspect is that
>
>> > "x" < "\265g"
>> [1] NA
>
>> which makes me wonder if the bug really is in the case that "works". It seems that it is possible to rank() character vectors that contain incomparable elements.
>
>> -pd
>
> yes you are right that it is even more complicated.
> In both cases, our Scollate() is involved,
> (Scollate: the one where we had a discussion about making it part of the C level R API, which would help package authors ..)
>
> After
>
>   ch <- c('x','\265g')
>   foo <- I(ch)
>
> Of the four expressions,
>
>   order(ch)
>   order(foo)
>   ch [1] < ch [2]
>   foo[1] < foo[2]
>
> only the first one "works", the others give NA or an error because of NA,
> and the first one is the only one of the 4 that does not use do_relop_dflt()
>
> It's not even clear what we'd want (as I think pd also alluded to):
> Ideally all of these should work consistently, which because of "<(.,.)" returning NA in both cases,
> would mean that order(ch) also should give an error as order(foo) does
> {an error whose message we should improve in any case!}.
> Big Q: Can we afford order(ch) giving an error in such cases?
> Pretty high chance that this will "break" much user (and probably even package) code out there.
>
> Still, the other solution, namely order(foo) behaving as order(ch) now does, would correspond to the ">" giving FALSE instead of NA, so this solution is not ok in my view.
>
> Martin
>
>
>> On 09 Sep 2014, at 16:19 , Martin Maechler wrote:
>
>>>> MacQueen, Don
>>>> on Mon, 8 Sep 2014 16:06:21 + writes:
>>>
>>>> I have found that order() fails in a rather arcane circumstance, as in this example:
>>>
>>>> > foo <- I( c('x','\265g') )
>>>> > order(foo)
>>>> Error in if (xi > xj) 1L else -1L : missing value where TRUE/FALSE needed
>>>
>>>> > foo <-c('x','\265g')
>>>> > order(foo)
>>>> [1] 1 2
>>>
>>> yes, this is not desirable.
>>> order() in such cases calls xtfrm() {as documented}
>>> and that ends up calling rank() and then the internal .gt()
>>> where the bug happens because
>>>
>>> > I("x") > I("\xb5g")
>>> [1] NA
>>>
>>> but really I think the change should happen in xtfrm.AsIs(.)
>>> which I think should drop the class also in this case.
>>>
>>> More on this, once we have fixed it.
>>>
>>> Thank you, Don, very much!
>>>
>>> Martin Maechler,
>>> ETH Zurich
>>>
>>>> > sessionInfo()
>>>> R version 3.1.1 (2014-07-10)
>>>> Platform: x86_64-apple-darwin13.1.0 (64-bit)
>>>
>>>> locale:
>>>> [1] C
>>>
>>>> attached base packages:
>>>> [1] stats graphics grDevices utils datasets methods base
>>>
>>>> Thanks
>>>> -Don
>>>
>>>> p.s.
>>>> Just a little background, irrelevant unless one wonders why I'm using I() and \265:
>>>
>>>> If I were writing new code I wouldn't be using I(), since there are better ways now to achieve the same end (preventing the creation of factors in data frames), but the scripts that use it are quite old, originally developed in 2001.
>>>
>>>> In at least some but perhaps limited contexts, '\265' produces the greek letter mu, and that's why I'm using it. And if I remember correctly, 2001 is prior to the current R support for locales and extended character sets. Using \265 is what I could find at that time to get a mu into my output.
>>>
>>>> I came across this while checking some things; it's not actually breaking my scripts, so I doubt it's due to any recent change.
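The two ways of keeping strings out of factors that Don mentions can be compared directly. The sketch below is an illustration added for this note; it uses the plain ASCII string "ug" instead of '\265g' so that the result does not depend on the locale, and the column name `unit` is made up.

```r
# Wrapping a column in I() keeps it as strings but marks it with class "AsIs",
# the class involved in the order() problem reported in this thread.
df_asis <- data.frame(unit = I(c("x", "ug")))
asis_class <- class(df_asis$unit)

# Asking data.frame() not to create factors leaves a plain character column.
df_plain <- data.frame(unit = c("x", "ug"), stringsAsFactors = FALSE)
plain_class <- class(df_plain$unit)

# Both columns hold the same strings; only the class attribute differs.
same_values <- identical(as.character(df_asis$unit), df_plain$unit)
```

Passing `stringsAsFactors = FALSE` per call has the same effect on that call as setting the option globally, without changing behaviour elsewhere in a script.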
[Rd] install.packages misleads about package availability?
In the context of installing a Bioconductor package using our biocLite() function, install.packages() warns

> install.packages("RUVSeq", repos="http://bioconductor.org/packages/2.14/bioc")
Installing package into '/home/mtmorgan/R/x86_64-unknown-linux-gnu-library/3.1-2.14'
(as 'lib' is unspecified)
Warning message:
package 'RUVSeq' is not available (for R version 3.1.1 Patched)

but really the problem is that the package is not available at the specified repository (it is available, for the same version of R, in the Bioc devel repository http://bioconductor.org/packages/3.0/bioc).

I can see the value of identifying the R version, and see that mentioning something about 'specified repositories' would not necessarily be helpful. Also, since the message is translated and our user base is international, it is difficult to catch and process by the biocLite() script.

Is there a revised wording that could be employed to more accurately convey the reason for the failure, or is this an opportunity to use the condition system?

Index: src/library/utils/R/packages2.R
===================================================================
--- src/library/utils/R/packages2.R (revision 66562)
+++ src/library/utils/R/packages2.R (working copy)
@@ -46,12 +46,12 @@
     p0 <- unique(pkgs)
     miss <- !p0 %in% row.names(available)
     if(sum(miss)) {
-        warning(sprintf(ngettext(sum(miss),
-                                 "package %s is not available (for %s)",
-                                 "packages %s are not available (for %s)"),
-                        paste(sQuote(p0[miss]), collapse=", "),
-                        sub(" *\\(.*","", R.version.string)),
-                domain = NA, call. = FALSE)
+        txt <- ngettext(sum(miss), "package %s is not available (for %s)",
+                        "packages %s are not available (for %s)")
+        msg <- simpleWarning(sprintf(txt, paste(sQuote(p0[miss]), collapse=", "),
+                                     sub(" *\\(.*","", R.version.string)))
+        class(msg) <- c("packageNotAvailable", class(msg))
+        warning(msg)
         if (sum(miss) == 1L &&
             !is.na(w <- match(tolower(p0[miss]),
                               tolower(row.names(available))))) {

--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M1 B861
Phone: (206) 667-2793
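How a caller could make use of such a classed warning can be sketched as follows. This is an illustration only: `warn_not_available()` is a hypothetical helper that mimics the patched code path, and the `packageNotAvailable` class exists only in the proposed patch above, so this should not be read as the behaviour of any released version of install.packages().

```r
# Sketch of the idea in the patch: signal a warning that carries an extra
# class, so callers can react to it without parsing translated text.
# warn_not_available() is a hypothetical helper, not part of utils.
warn_not_available <- function(pkgs) {
  msg <- sprintf("package %s is not available",
                 paste(sQuote(pkgs), collapse = ", "))
  cond <- simpleWarning(msg)
  class(cond) <- c("packageNotAvailable", class(cond))
  warning(cond)
}

# A caller such as biocLite() could then catch the condition by class.
caught <- tryCatch(
  {
    warn_not_available("RUVSeq")
    "no warning"
  },
  packageNotAvailable = function(w) {
    paste("handled:", conditionMessage(w))
  }
)
```

Matching on the class rather than on the message text keeps the handler working regardless of the language the warning is translated into.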