Re: [Rd] A bug in princomp(), perhaps?
FWIW this seems to be a FAQ:

https://stat.ethz.ch/pipermail/r-devel/2003-July/027018.html
http://thr3ads.net/r-devel/2013/01/ 2171832-Re-na.omit-option-in-prcomp-formula-interface-only
http://r.789695.n4.nabble.com/ na-omit-option-in-prcomp-formula-interface-only-td4373533.html

And two StackOverflow questions (the latter's a bit tangential):

http://stackoverflow.com/questions/12078291/ r-function-prcomp-fails-with-nas-values-even-though-nas-are-allowed
http://stackoverflow.com/questions/23421438/what-was-wrong-with-running-princomp-in-r/23446938#23446938

(Sorry for broken URLs and random assortment of mailing list aggregators.)

I appreciate Gavin's points that implementing this stuff for princomp.default is difficult/problematic, but I second Ravi's request for a little more clarification in the help text; it's quite easy to miss the fact that 'na.action' is defined for princomp.formula but not for princomp.default. Perhaps just "Note that setting na.action works for princomp.formula, but not for princomp.default" (under the "na.action" argument description).

Gavin Simpson writes:
>
> On 30 May 2014 06:33, Ravi Varadhan wrote:
> >
> > Thank you, Peter. Now I see that.
> >
> > I still think the documentation of `na.action' can be made more explicit to state that this option is only used for princomp.formula.
> >
> > Best regards,
> > Ravi
> >
> > -----Original Message-----
> > From: peter dalgaard [mailto:pdalgd gmail.com]
> > Sent: Friday, May 30, 2014 5:15 AM
> > To: Ravi Varadhan
> > Cc: r-devel r-project.org
> > Subject: Re: [Rd] A bug in princomp(), perhaps?
> >
> > It's only documented to work for princomp.formula; other methods do not know about na.action.
> >
> > -pd

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
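For readers who want to see the difference between the two methods, here is a minimal sketch. It assumes the built-in USArrests data with one value set to NA purely for illustration; the data frame `d` and the object names are not from the thread.

```r
# Illustrative data: USArrests with one missing value introduced
d <- USArrests
d[1, "Murder"] <- NA

# Formula method: na.action is honoured, so the incomplete row is dropped
fit_formula <- princomp(~ ., data = d, na.action = na.omit)
fit_formula$n.obs

# Default method: na.action is not acted upon, the NA reaches the
# covariance computation and an error is signalled
# (recent R versions may also warn about the extra argument)
fit_default <- tryCatch(princomp(d, na.action = na.omit),
                        error = function(e) e)
inherits(fit_default, "error")

# Workaround for the default method: drop incomplete rows beforehand
fit_clean <- princomp(na.omit(d))
fit_clean$n.obs
```

Calling na.omit() on the data before princomp() is one simple way to get the same rows as the formula interface when the default method is used.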
Re: [Rd] R CMD check for the R code from vignettes
On Fri, May 30, 2014 at 9:22 PM, Yihui Xie wrote:

> Hi Kevin,
>
> I tend to adopt Henrik's idea, i.e., to provide vignette engines that just ignore tangle. At the moment, it seems R CMD check is comfortable with vignettes that do not have corresponding R scripts, and I hope these R scripts will not become mandatory in the future.

I'm not sure this is the right approach. This would essentially make the test optional based on decisions by the package author. I'm not arguing in favor of this particular test, but if package authors are able to turn a test off then the test loses quite a bit of its value.

I think that R CMD check has done a great deal for the R community by presenting a uniform, minimum "barrier to entry" for R packages. Allowing package developers to alter the tests it does (other than the obvious case of their own unit tests) would remove that.

That having been said, it seems to me that tangle-like utilities should have the option of extracting inline code, and that during R CMD check that option should *always* be turned on. That would solve the problem in question while retaining the test, would it not?

~G

> Thanks everyone for your comments!
>
> Regards,
> Yihui
> --
> Yihui Xie
> Web: http://yihui.name
>
> On Fri, May 30, 2014 at 8:21 AM, Kevin Coombes wrote:
> > Hi,
> >
> > Unless someone is planning to change Stangle to include inline expressions (which I am *not* advocating), I think that relying on side-effects within an \Sexpr construction is a bad idea. So, my own coding style is to restrict my use of \Sexpr to calls of the form \Sexpr{show.the.value.of.this.variable}. As a result, I more-or-less believe that having R CMD check use Stangle and report an error is probably a good thing.
> >
> > There is a completely separate question about the relationship between Sweave/Stangle or knit/purl and literate programming that is linked to your question about whether to use Stangle on vignettes. The underlying model(s) in R have drifted away from Knuth's original conception, for some good reasons.
> >
> > The original goal of literate programming was to be able to explain the algorithms and data structures in the code to humans. For that purpose, it was important to have named code chunks that you could move around, which would allow you to describe the algorithm starting from a high level overview and then drilling down into the details. From this perspective, "tangle" was critical to being able to reconstruct a program that would compile and run correctly.
> >
> > The vast majority of applications of Sweave/Stangle or knit/purl in modern R have a completely different goal: to produce some sort of document that describes the results of an analysis to a non-programmer or non-statistician. For this goal, "weave" is much more important than "tangle", because the most important aspect is the ability to integrate the results (figures, tables, etc) of running the code into the document that gets passed off to the person for whom the analysis was prepared. As a result, the number of times in my daily work that I need to explicitly invoke Stangle (or purl) is many orders of magnitude smaller than the number of times that I invoke Sweave (or knitr).
> >
> > -- Kevin
> >
> > On 5/30/2014 1:04 AM, Yihui Xie wrote:
> >>
> >> Hi,
> >>
> >> Recently I saw a couple of cases in which the package vignettes were somewhat complicated so that Stangle() (or knitr::purl() or other tangling functions) can fail to produce the exact R code that is executed by the weaving function Sweave() (or knitr::knit(), ...).
> >> For example, this is a valid document that can pass the weaving process but cannot generate a valid R script to be source()d:
> >>
> >> \documentclass{article}
> >> \begin{document}
> >> Assign 1 to x: \Sexpr{x <- 1}
> >> <<>>=
> >> x + 1
> >> @
> >> \end{document}
> >>
> >> That is because the inline R code is not written to the R script during the tangling process. When an R package vignette contains inline R code expressions that have significant side effects, R CMD check can fail because the tangled output is not correct. What I showed here is only a trivial example, and I have seen two packages that have more complicated scenarios than this. Anyway, the key thing that I want to discuss here is, since the R code in the vignette has been executed once during the weaving process, does it make much sense to execute the code generated from the tangle function? In other words, if the weaving process has succeeded, is it necessary to source() the R script again?
> >>
> >> The two options here are:
> >>
> >> 1. Do not check the R code from vignettes;
> >> 2. Or fix the tangle function so that it produces exactly what was executed in the weaving process. If this
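The example vignette quoted above can be tried directly. The following is a minimal sketch, assuming a stock R installation with the default Sweave/Stangle drivers; the temporary file names and the variables `has_chunk`, `has_inline` and `res` are introduced here for illustration only.

```r
# Write the example vignette from the thread to a temporary .Rnw file
rnw <- tempfile(fileext = ".Rnw")
writeLines(c(
  "\\documentclass{article}",
  "\\begin{document}",
  "Assign 1 to x: \\Sexpr{x <- 1}",
  "<<>>=",
  "x + 1",
  "@",
  "\\end{document}"
), rnw)

# Tangle it: only the code chunk is extracted, the \Sexpr{} code is not
r_script <- tempfile(fileext = ".R")
utils::Stangle(rnw, output = r_script, quiet = TRUE)
tangled <- readLines(r_script)

# The chunk body is present, the inline assignment is absent
has_chunk  <- any(grepl("x + 1", tangled, fixed = TRUE))
has_inline <- any(grepl("x <- 1", tangled, fixed = TRUE))

# Sourcing the tangled script in a clean environment fails because x was
# never assigned (baseenv() as parent avoids picking up a global x)
res <- tryCatch(
  source(r_script, local = new.env(parent = baseenv())),
  error = function(e) e
)
inherits(res, "error")
```

The error raised by source() here is the same kind of failure that R CMD check reports when it runs the tangled vignette code.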
Re: [Rd] R CMD check for the R code from vignettes
Note the test has been done once in weave, since R CMD check will try to rebuild vignettes. The problem is whether the related tools in R should change their tangle utilities so we can **repeat** the test, and it seems the answer is "no" in my eyes.

Regards,
Yihui
--
Yihui Xie
Web: http://yihui.name

On Sat, May 31, 2014 at 4:54 PM, Gabriel Becker wrote:
> I'm not sure this is the right approach. This would essentially make the test optional based on decisions by the package author. I'm not arguing in favor of this particular test, but if package authors are able to turn a test off then the test loses quite a bit of its value.
>
> I think that R CMD check has done a great deal for the R community by presenting a uniform, minimum "barrier to entry" for R packages. Allowing package developers to alter the tests it does (other than the obvious case of their own unit tests) would remove that.
>
> That having been said, it seems to me that tangle-like utilities should have the option of extracting inline code, and that during R CMD check that option should *always* be turned on. That would solve the problem in question while retaining the test, would it not?
>
> ~G
Re: [Rd] R CMD check for the R code from vignettes
Vignettes can fail to build for reasons unrelated to code. In that case it seems useful to the developer to know whether the code is failing (indicating a likely problem in the package itself) or just the TeX in the vignette.

Also, I could be wrong about this, but I thought the "run the vignette code" test happened *before* vignette building.

~G

On Sat, May 31, 2014 at 3:52 PM, Yihui Xie wrote:
> Note the test has been done once in weave, since R CMD check will try to rebuild vignettes. The problem is whether the related tools in R should change their tangle utilities so we can **repeat** the test, and it seems the answer is "no" in my eyes.
>
> Regards,
> Yihui

--
Gabriel Becker
Graduate Student
Statistics Department
University of California, Davis
Re: [Rd] R CMD check for the R code from vignettes
On 05/31/2014 03:52 PM, Yihui Xie wrote:
> Note the test has been done once in weave, since R CMD check will try to rebuild vignettes. The problem is whether the related tools in R should change their tangle utilities so we can **repeat** the test, and it seems the answer is "no" in my eyes.
>
> Regards,
> Yihui

It is very useful, pedagogically and when reproducing analyses, to be able to source() the tangled .R code into an R session, analogous to running example code with example(). The documentation for ?Stangle does read

  (Code inside '\Sexpr{}' statements is ignored by 'Stangle'.)

So my 'vote' (recognizing that I don't have one of those) is to incorporate \Sexpr{} expressions into the tangled code, or to continue to flag use of Sexpr with side effects as errors (indirectly, by source()ing the tangled code), rather than writing engines that ignore tangle.

It is very valuable to all parties to write a vignette with code that is fully evaluated; otherwise, it is too easy for bit rot to seep in, or to 'fake' it in a way that seems innocent but is misleading.

Martin Morgan
--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M1 B861
Phone: (206) 667-2793
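To make the first option in the vote above concrete, here is a rough sketch of what pulling \Sexpr{} code into the tangled output could look like. The helper `extract_sexpr()` is hypothetical and is not part of utils, Stangle or knitr; it also assumes that the inline code contains no closing brace, which a real implementation would have to handle.

```r
# Hypothetical helper (not part of utils or knitr): pull the code out of
# \Sexpr{...} calls. Assumes the inline code contains no closing brace.
extract_sexpr <- function(lines) {
  hits <- regmatches(lines, gregexpr("\\\\Sexpr\\{[^}]*\\}", lines))
  code <- unlist(hits)
  # Strip the leading "\Sexpr{" and the trailing "}"
  sub("\\}$", "", sub("^\\\\Sexpr\\{", "", code))
}

# Body of the example vignette from earlier in the thread
doc <- c("Assign 1 to x: \\Sexpr{x <- 1}",
         "<<>>=",
         "x + 1",
         "@")
inline_code <- extract_sexpr(doc)

# With the inline code placed before the chunk code, evaluation succeeds
env <- new.env()
for (expr in c(inline_code, "x + 1")) {
  result <- eval(parse(text = expr), env)
}
```

This only illustrates the idea for the trivial example; it says nothing about how document order, chunk options or nested braces should be treated.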
Re: [Rd] R CMD check for the R code from vignettes
I mentioned in my original post that Sweave()/knit()/... can be considered as the "new" source(). They can do the same thing as source() does. I agree that fully evaluating the code is valuable, but it is not a problem since the weave functions do fully evaluate the code. If there is a reason why source()ing an R script is preferred, I guess it is users' familiarity with .R instead of .Rnw/.Rmd/...; however, I guess it would be painful to read the pure R script tangled from the source document without the original narratives.

So what do we really lose if we turn off tangle? We lose an R script as a derivative from the source document, but we do not lose the code evaluation.

Regards,
Yihui
--
Yihui Xie
Web: http://yihui.name

On Sat, May 31, 2014 at 6:20 PM, Martin Morgan wrote:
> It is very useful, pedagogically and when reproducing analyses, to be able to source() the tangled .R code into an R session, analogous to running example code with example(). The documentation for ?Stangle does read
>
>   (Code inside '\Sexpr{}' statements is ignored by 'Stangle'.)
>
> So my 'vote' (recognizing that I don't have one of those) is to incorporate \Sexpr{} expressions into the tangled code, or to continue to flag use of Sexpr with side effects as errors (indirectly, by source()ing the tangled code), rather than writing engines that ignore tangle.
>
> It is very valuable to all parties to write a vignette with code that is fully evaluated; otherwise, it is too easy for bit rot to seep in, or to 'fake' it in a way that seems innocent but is misleading.
>
> Martin Morgan
Re: [Rd] R CMD check for the R code from vignettes
The Bioconductor project has a substantial amount of teaching material in the form of Sweave files. For teaching, it can be extremely convenient to give people an R script which they can copy and paste from (or do something else with). This is especially true for inexperienced R users.

Best,
Kasper

On Sat, May 31, 2014 at 9:54 PM, Yihui Xie wrote:
> I mentioned in my original post that Sweave()/knit()/... can be considered as the "new" source(). They can do the same thing as source() does. I agree that fully evaluating the code is valuable, but it is not a problem since the weave functions do fully evaluate the code. If there is a reason for why source() an R script is preferred, I guess it is users' familiarity with .R instead of .Rnw/.Rmd/..., however, I guess it would be painful to read the pure R script tangled from the source document without the original narratives.
>
> So what do we really lose if we turn off tangle? We lose an R script as a derivative from the source document, but we do not lose the code evaluation.
>
> Regards,
> Yihui
Re: [Rd] R CMD check for the R code from vignettes
On Sat, May 31, 2014 at 6:54 PM, Yihui Xie wrote:

> I agree that fully evaluating the code is valuable, but it is not a problem since the weave functions do fully evaluate the code. If there is a reason for why source() an R script is preferred, I guess it is users' familiarity with .R instead of .Rnw/.Rmd/...,

It's because .Rnw and Rmd require more from the user than .R. Also, this started with vignettes but you seem to be talking more generally. If so, I would point out that not all R code is intended to generate reports, and writing pure R code that isn't going to generate a report in an .Rnw/.Rmd file would be very strange to say the least.

> however, I guess it would be painful to read the pure R script tangled from the source document without the original narratives.

That depends a lot on what you want. Reading a woven article/report that includes code and reading code are different and equally valid activities. Sometimes I really just want to know what the author actually told the computer to do.

> So what do we really lose if we turn off tangle? We lose an R script as a derivative from the source document, but we do not lose the code evaluation.

We lose *isolated* code evaluation. Sweave/knit have a lot more moving pieces than source/eval do, many of which are for the purpose of displaying output rather than running code.

--
Gabriel Becker
Graduate Student
Statistics Department
University of California, Davis
Re: [Rd] R CMD check for the R code from vignettes
Yes, that is a matter of familiarity as I mentioned, isn't it? I understand this justification. I can argue that it is also convenient to give people an Rnw/Rmd document and they can easily run the R code chunks as well (e.g. in RStudio, chunk navigation and evaluation are pretty simple) _within_ the context of your teaching materials. However, I think this is drifting away from the original topic, so I'll stop my comments on the direction of teaching.

The original question was, what do we lose if we disable tangle for R package vignettes? Please also note I mean this is _optional_, i.e. package authors can _choose_ whether they want to disable tangle.

Regards,
Yihui
--
Yihui Xie
Web: http://yihui.name

On Sat, May 31, 2014 at 9:11 PM, Kasper Daniel Hansen wrote:
> The Bioconductor project has a substantial amount of teaching material in the form of Sweave files. For teaching, it can be extremely convenient to give people an R script which they can copy and paste from (or do something else with). This is especially true for inexperienced R users.
>
> Best,
> Kasper
Re: [Rd] R CMD check for the R code from vignettes
1. The starting point of this discussion is package vignettes, instead of R scripts. I'm not saying we should abandon R scripts, or all people should write R code to generate reports. Starting from a package vignette, you can evaluate it using a weave function, or evaluate its derivative, namely an R script. I was saying the former might not be a bad idea, although the latter sounds more familiar to most R users. For a package vignette, within the context of R CMD check, is it necessary to do tangle + evaluate _besides_ weave?

2. If you are comfortable with reading pure code without narratives, I'm totally fine with that. I guess there is nothing to argue on this point, since it is pretty much personal taste.

3. Yes, you are absolutely correct -- Sweave()/knit() does more than source(), but let me repeat the issue to be discussed: what harm does it bring if we disable tangle for R package vignettes?

Sorry if I did not make it clear enough; my priority in this discussion is the necessity of tangle for package vignettes. After we finish this issue, I'll be happy to extend the discussion towards tangle in general.

Regards,
Yihui
--
Yihui Xie
Web: http://yihui.name

On Sat, May 31, 2014 at 9:20 PM, Gabriel Becker wrote:
> On Sat, May 31, 2014 at 6:54 PM, Yihui Xie wrote:
>
>> I agree that fully evaluating the code is valuable, but it is not a problem since the weave functions do fully evaluate the code. If there is a reason for why source() an R script is preferred, I guess it is users' familiarity with .R instead of .Rnw/.Rmd/...,
>
> It's because .Rnw and Rmd require more from the user than .R. Also, this started with vignettes but you seem to be talking more generally. If so, I would point out that not all R code is intended to generate reports, and writing pure R code that isn't going to generate a report in an .Rnw/.Rmd file would be very strange to say the least.
>
>> however, I guess it would be painful to read the pure R script tangled from the source document without the original narratives.
>
> That depends a lot on what you want. Reading an woven article/report that includes code and reading code are different and equally valid activities. Sometimes I really just want to know what the author actually told the computer to do.
>
>> So what do we really lose if we turn off tangle? We lose an R script as a derivative from the source document, but we do not lose the code evaluation.
>
> We lose *isolated* code evaluation. Sweave/knit have a lot more moving pieces than source/eval do. Many of which are for the purpose of displaying output, rather than running code.