Re: [Rd] R CMD check for the R code from vignettes

2014-06-01 Thread Carl Boettiger
Yihui, list,

Focusing on the behavior of R CMD check, the only reason I have seen put
forward in the discussion for having check tangle and then source as well
as knit/weave the very same vignette is to assist the package maintainer in
debugging R errors vs pdflatex errors.  As tangle (and many other tools)
are already available to an author needing extra help debugging, and as the
error messages are usually clear on whether errors come from the R code or
whatever format compiling (pdflatex, markdown html, etc), this seems like a
poor reason for R CMD check to be wasting time doing two versions of almost
(but not literally) the same check.

As has already been discussed, it is possible to write vignettes that can
be Sweave'd but not source'd, due to the different treatments of inline
chunks.  While I see the advantages of this property, I don't see why R CMD
check should be enforcing it through the arbitrary mechanism of running
both Sweave and tangle+source. If that is the desired behavior for all
Sweave documents, it should be part of the Sweave specification not to be
able to write/change values in inline expressions, or part of the tangle
definition to include inline chunks.  In any event I don't see any reason
for R CMD check doing both.  Perhaps someone can fill in whatever I've
overlooked?
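
To make the inline-chunk point concrete, here is a minimal sketch in R. It
assumes base R's utils::Stangle() with the default Rtangle driver (not
knitr's purl(), which may behave differently). The toy vignette, the
variable n, and the temp files are made up for illustration. The inline
\Sexpr{} defines n, a later chunk uses it, and the script checks whether
the tangled file still carries that definition before trying to source it.

```r
# Hypothetical toy vignette: an inline \Sexpr{} assigns `n`,
# and the code chunk that follows depends on it.
rnw <- tempfile(fileext = ".Rnw")
writeLines(c(
  "\\documentclass{article}",
  "\\begin{document}",
  "We use \\Sexpr{n <- 10} samples.",
  "<<draw>>=",
  "x <- seq_len(n)",
  "@",
  "\\end{document}"
), rnw)

# Tangle: extract the code chunks into a plain R script.
script <- tempfile(fileext = ".R")
utils::Stangle(rnw, output = script, quiet = TRUE)

# Does the tangled script still contain the inline assignment?
tangled <- readLines(script)
has_inline <- any(grepl("n <- 10", tangled, fixed = TRUE))

# Source the script in an environment that cannot see the workspace,
# so a stray `n` defined elsewhere does not mask the problem.
res <- tryCatch(
  {
    source(script, local = new.env(parent = baseenv()))
    "sourced without error"
  },
  error = function(e) paste("source() failed:", conditionMessage(e))
)

print(has_inline)
print(res)
```

If the inline code is not carried into the script, source() has no
definition of n to work with, which is the weave-but-not-source situation
described above.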

Carl


On Sat, May 31, 2014 at 8:17 PM, Yihui Xie  wrote:

> 1. The starting point of this discussion is package vignettes, instead
> of R scripts. I'm not saying we should abandon R scripts, or all
> people should write R code to generate reports. Starting from a
> package vignette, you can evaluate it using a weave function, or
> evaluate its derivative, namely an R script. I was saying the former
> might not be a bad idea, although the latter sounds more familiar to
> most R users. For a package vignette, within the context of R CMD
> check, is it necessary to do tangle + evaluate _besides_ weave?
>
> 2. If you are comfortable with reading pure code without narratives,
> I'm totally fine with that. I guess there is nothing to argue on this
> point, since it is pretty much personal taste.
>
> 3. Yes, you are absolutely correct -- Sweave()/knit() does more than
> source(), but let me repeat the issue to be discussed: what harm does
> it bring if we disable tangle for R package vignettes?
>
> Sorry if I did not make it clear enough, my priority of this
> discussion is the necessity of tangle for package vignettes. After we
> finish this issue, I'll be happy to extend the discussion towards
> tangle in general.
>
> Regards,
> Yihui
> --
> Yihui Xie 
> Web: http://yihui.name
>
>
> On Sat, May 31, 2014 at 9:20 PM, Gabriel Becker 
> wrote:
> >
> >
> >
> > On Sat, May 31, 2014 at 6:54 PM, Yihui Xie  wrote:
> >
> >> I agree that fully evaluating the code is valuable, but
> >> it is not a problem since the weave functions do fully evaluate the
> >> code. If there is a reason for why source() an R script is preferred,
> >>
> >> I guess it is users' familiarity with .R instead of .Rnw/.Rmd/...,
> >
> >
> > It's because .Rnw and .Rmd require more from the user than .R. Also, this
> > started with vignettes but you seem to be talking more generally. If so, I
> > would point out that not all R code is intended to generate reports, and
> > writing pure R code that isn't going to generate a report in an .Rnw/.Rmd
> > file would be very strange to say the least.
> >
> >
> >>
> >> however, I guess it would be painful to read the pure R script tangled
> >> from the source document without the original narratives.
> >
> >
> > That depends a lot on what you want. Reading a woven article/report that
> > includes code and reading code are different and equally valid activities.
> > Sometimes I really just want to know what the author actually told the
> > computer to do.
> >
> >>
> >>
> >> So what do we really lose if we turn off tangle? We lose an R script
> >> as a derivative from the source document, but we do not lose the code
> >> evaluation.
> >
> >
> > We lose *isolated* code evaluation. Sweave/knit have a lot more moving
> > pieces than source/eval do, many of which are for the purpose of displaying
> > output, rather than running code.
> >
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>



-- 
Carl Boettiger
UC Santa Cruz
http://carlboettiger.info/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R CMD check for the R code from vignettes

2014-06-01 Thread Gabriel Becker
Carl,

I don't really have a horse in this race other than a strong feeling that
whatever check does should be mandatory.

That having been said, I think it can be argued that the fact that check
does this means that it IS in the R package vignette specification that all
vignettes must be such that their tangled code will run without errors.

~G


On Sun, Jun 1, 2014 at 8:43 PM, Carl Boettiger  wrote:

> Yihui, list,
>
> Focusing on the behavior of R CMD check, the only reason I have seen put
> forward in the discussion for having check tangle and then source as well
> as knit/weave the very same vignette is to assist the package maintainer in
> debugging R errors vs pdflatex errors.  As tangle (and many other tools)
> are already available to an author needing extra help debugging, and as the
> error messages are usually clear on whether errors come from the R code or
> whatever format compiling (pdflatex, markdown html, etc), this seems like a
> poor reason for R CMD check to be wasting time doing two versions of almost
> (but not literally) the same check.
>
> As has already been discussed, it is possible to write vignettes that can
> be Sweave'd but not source'd, due to the different treatments of inline
> chunks.  While I see the advantages of this property, I don't see why R CMD
> check should be enforcing it through the arbitrary mechanism of running
> both Sweave and tangle+source. If that is the desired behavior for all
> Sweave documents, it should be part of the Sweave specification not to be
> able to write/change values in inline expressions, or part of the tangle
> definition to include inline chunks.  In any event I don't see any reason
> for R CMD check doing both.  Perhaps someone can fill in whatever I've
> overlooked?
>
> Carl
>
>
> On Sat, May 31, 2014 at 8:17 PM, Yihui Xie  wrote:
>
>> 1. The starting point of this discussion is package vignettes, instead
>> of R scripts. I'm not saying we should abandon R scripts, or all
>> people should write R code to generate reports. Starting from a
>> package vignette, you can evaluate it using a weave function, or
>> evaluate its derivative, namely an R script. I was saying the former
>> might not be a bad idea, although the latter sounds more familiar to
>> most R users. For a package vignette, within the context of R CMD
>> check, is it necessary to do tangle + evaluate _besides_ weave?
>>
>> 2. If you are comfortable with reading pure code without narratives,
>> I'm totally fine with that. I guess there is nothing to argue on this
>> point, since it is pretty much personal taste.
>>
>> 3. Yes, you are absolutely correct -- Sweave()/knit() does more than
>> source(), but let me repeat the issue to be discussed: what harm does
>> it bring if we disable tangle for R package vignettes?
>>
>> Sorry if I did not make it clear enough, my priority of this
>> discussion is the necessity of tangle for package vignettes. After we
>> finish this issue, I'll be happy to extend the discussion towards
>> tangle in general.
>>
>> Regards,
>> Yihui
>> --
>> Yihui Xie 
>> Web: http://yihui.name
>>
>>
>> On Sat, May 31, 2014 at 9:20 PM, Gabriel Becker 
>> wrote:
>> >
>> >
>> >
>> > On Sat, May 31, 2014 at 6:54 PM, Yihui Xie  wrote:
>> >
>> >> I agree that fully evaluating the code is valuable, but
>> >> it is not a problem since the weave functions do fully evaluate the
>> >> code. If there is a reason for why source() an R script is preferred,
>> >>
>> >> I guess it is users' familiarity with .R instead of .Rnw/.Rmd/...,
>> >
>> >
>> > It's because .Rnw and .Rmd require more from the user than .R. Also, this
>> > started with vignettes but you seem to be talking more generally. If so, I
>> > would point out that not all R code is intended to generate reports, and
>> > writing pure R code that isn't going to generate a report in an .Rnw/.Rmd
>> > file would be very strange to say the least.
>> >
>> >
>> >>
>> >> however, I guess it would be painful to read the pure R script tangled
>> >> from the source document without the original narratives.
>> >
>> >
>> > That depends a lot on what you want. Reading a woven article/report that
>> > includes code and reading code are different and equally valid activities.
>> > Sometimes I really just want to know what the author actually told the
>> > computer to do.
>> >
>> >>
>> >>
>> >> So what do we really lose if we turn off tangle? We lose an R script
>> >> as a derivative from the source document, but we do not lose the code
>> >> evaluation.
>> >
>> >
>> > We lose *isolated* code evaluation. Sweave/knit have a lot more moving
>> > pieces than source/eval do, many of which are for the purpose of displaying
>> > output, rather than running code.
>> >
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>
>
>
> --
> Carl Boettiger
> UC Santa Cruz
> http://carlboettiger.info/
>



-- 
Gabriel Becker
Graduate Student
S

Re: [Rd] R CMD check for the R code from vignettes

2014-06-01 Thread Carl Boettiger
Thanks both for the replies.

Duncan, I'm sorry if I wasn't clear.  I am indeed writing a vignette using
Sweave (knitr actually), and I want it to be a vignette. I'm well aware
that I can dodge these tests as you suggest, or through other ways, but I'm
not trying to dodge them.  R CMD check is running both knit and
tangle+source on it, and I do not understand why the latter is necessary
when the code is already run by the former.  Is there a good reason for
checking an R vignette in this seemingly redundant fashion?

Gabe, I see your point but surely you can agree that is a rather obtuse way
to enforce that behavior. I don't recall seeing anything in the R
extensions manual documenting that Sweave files must meet this constraint
in order to be considered valid vignettes.  I also believe there are valid
use cases for side-effects of inline chunk options (my example being
dynamic references).  While it is easy to hack a vignette to meet this
constraint (e.g. replicating inline calls with a non-displayed chunk), that
seems poor form.
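
For concreteness, the workaround mentioned above might look like the
following sketch in R. It assumes base R's utils::Stangle() and standard
Sweave chunk options (echo=FALSE, results=hide). The toy vignette, the
chunk labels, and the variable n are hypothetical. The assignment is moved
out of the inline expression into a chunk that is not displayed, so the
tangled script carries it and can be sourced on its own.

```r
# Hypothetical toy vignette: the assignment lives in a hidden chunk,
# and the inline \Sexpr{} only reads the value.
rnw <- tempfile(fileext = ".Rnw")
writeLines(c(
  "\\documentclass{article}",
  "\\begin{document}",
  "<<setup, echo=FALSE, results=hide>>=",
  "n <- 10",
  "@",
  "We use \\Sexpr{n} samples.",
  "<<draw>>=",
  "x <- seq_len(n)",
  "@",
  "\\end{document}"
), rnw)

# Tangle the vignette into a plain R script.
script <- tempfile(fileext = ".R")
utils::Stangle(rnw, output = script, quiet = TRUE)

# Source it in an environment isolated from the workspace.
env <- new.env(parent = baseenv())
source(script, local = env)

print(length(env$x))
```

The cost is that the document now states the value in one place and
displays it in another, which is the duplication I am calling poor form.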

I think Yihui has made a good case that there is no reason for R CMD check
to be running both weave/knit and source, and I haven't seen any replies
explaining why this is a reasonable thing for the automated check to be
doing.

Cheers,

Carl


On Sun, Jun 1, 2014 at 9:16 PM, Gabriel Becker  wrote:

> Carl,
>
> I don't really have a horse in this race other than a strong feeling that
> whatever check does should be mandatory.
>
> That having been said, I think it can be argued that the fact that check
> does this means that it IS in the R package vignette specification that all
> vignettes must be such that their tangled code will run without errors.
>
> ~G
>
>
> On Sun, Jun 1, 2014 at 8:43 PM, Carl Boettiger  wrote:
>
>> Yihui, list,
>>
>> Focusing on the behavior of R CMD check, the only reason I have seen put
>> forward in the discussion for having check tangle and then source as well
>> as knit/weave the very same vignette is to assist the package maintainer in
>> debugging R errors vs pdflatex errors.  As tangle (and many other tools)
>> are already available to an author needing extra help debugging, and as the
>> error messages are usually clear on whether errors come from the R code or
>> whatever format compiling (pdflatex, markdown html, etc), this seems like a
>> poor reason for R CMD check to be wasting time doing two versions of almost
>> (but not literally) the same check.
>>
>> As has already been discussed, it is possible to write vignettes that can
>> be Sweave'd but not source'd, due to the different treatments of inline
>> chunks.  While I see the advantages of this property, I don't see why R CMD
>> check should be enforcing it through the arbitrary mechanism of running
>> both Sweave and tangle+source. If that is the desired behavior for all
>> Sweave documents, it should be part of the Sweave specification not to be
>> able to write/change values in inline expressions, or part of the tangle
>> definition to include inline chunks.  In any event I don't see any reason
>> for R CMD check doing both.  Perhaps someone can fill in whatever I've
>> overlooked?
>>
>> Carl
>>
>>
>> On Sat, May 31, 2014 at 8:17 PM, Yihui Xie  wrote:
>>
>>> 1. The starting point of this discussion is package vignettes, instead
>>> of R scripts. I'm not saying we should abandon R scripts, or all
>>> people should write R code to generate reports. Starting from a
>>> package vignette, you can evaluate it using a weave function, or
>>> evaluate its derivative, namely an R script. I was saying the former
>>> might not be a bad idea, although the latter sounds more familiar to
>>> most R users. For a package vignette, within the context of R CMD
>>> check, is it necessary to do tangle + evaluate _besides_ weave?
>>>
>>> 2. If you are comfortable with reading pure code without narratives,
>>> I'm totally fine with that. I guess there is nothing to argue on this
>>> point, since it is pretty much personal taste.
>>>
>>> 3. Yes, you are absolutely correct -- Sweave()/knit() does more than
>>> source(), but let me repeat the issue to be discussed: what harm does
>>> it bring if we disable tangle for R package vignettes?
>>>
>>> Sorry if I did not make it clear enough, my priority of this
>>> discussion is the necessity of tangle for package vignettes. After we
>>> finish this issue, I'll be happy to extend the discussion towards
>>> tangle in general.
>>>
>>> Regards,
>>> Yihui
>>> --
>>> Yihui Xie 
>>> Web: http://yihui.name
>>>
>>>
>>> On Sat, May 31, 2014 at 9:20 PM, Gabriel Becker 
>>> wrote:
>>> >
>>> >
>>> >
>>> > On Sat, May 31, 2014 at 6:54 PM, Yihui Xie  wrote:
>>> >
>>> >> I agree that fully evaluating the code is valuable, but
>>> >> it is not a problem since the weave functions do fully evaluate the
>>> >> code. If there is a reason for why source() an R script is preferred,
>>> >>
>>> >> I guess it is users' familiarity with .R instead of .Rnw/.Rmd/...,
>>> >
>>> >
>>

Re: [Rd] R CMD check for the R code from vignettes

2014-06-01 Thread Duncan Murdoch

On 02/06/2014, 1:41 PM, Carl Boettiger wrote:

> Thanks both for the replies.
>
> Duncan, I'm sorry if I wasn't clear.  I am indeed writing a vignette using
> Sweave (knitr actually), and I want it to be a vignette. I'm well aware
> that I can dodge these tests as you suggest, or through other ways, but I'm
> not trying to dodge them.  R CMD check is running both knit and
> tangle+source on it, and I do not understand why the latter is necessary
> when the code is already run by the former.  Is there a good reason for
> checking an R vignette in this seemingly redundant fashion?
>
> Gabe, I see your point but surely you can agree that is a rather obtuse way
> to enforce that behavior. I don't recall seeing anything in the R
> extensions manual documenting that Sweave files must meet this constraint
> in order to be considered valid vignettes.  I also believe there are valid
> use cases for side-effects of inline chunk options (my example being
> dynamic references).  While it is easy to hack a vignette to meet this
> constraint (e.g. replicating inline calls with a non-displayed chunk), that
> seems poor form.
>
> I think Yihui has made a good case that there is no reason for R CMD check
> to be running weave/knit and source, and I haven't seen any replies trying
> to explain to the contrary why this is a reasonable thing for the automated
> check to be doing.

You haven't been reading very carefully.  I saw several:  mine, Martin
Morgan's, Kasper Daniel Hansen's.

Duncan Murdoch

> Cheers,
>
> Carl


On Sun, Jun 1, 2014 at 9:16 PM, Gabriel Becker  wrote:


Carl,

I don't really have a horse in this race other than a strong feeling that
whatever check does should be mandatory.

That having been said, I think it can be argued that the fact that check
does this means that it IS in the R package vignette specification that all
vignettes must be such that their tangled code will run without errors.

~G


On Sun, Jun 1, 2014 at 8:43 PM, Carl Boettiger  wrote:


Yihui, list,

Focusing on the behavior of R CMD check, the only reason I have seen put
forward in the discussion for having check tangle and then source as well
as knit/weave the very same vignette is to assist the package maintainer in
debugging R errors vs pdflatex errors.  As tangle (and many other tools)
are already available to an author needing extra help debugging, and as the
error messages are usually clear on whether errors come from the R code or
whatever format compiling (pdflatex, markdown html, etc), this seems like a
poor reason for R CMD check to be wasting time doing two versions of almost
(but not literally) the same check.

As has already been discussed, it is possible to write vignettes that can
be Sweave'd but not source'd, due to the different treatments of inline
chunks.  While I see the advantages of this property, I don't see why R CMD
check should be enforcing it through the arbitrary mechanism of running
both Sweave and tangle+source. If that is the desired behavior for all
Sweave documents, it should be part of the Sweave specification not to be
able to write/change values in inline expressions, or part of the tangle
definition to include inline chunks.  In any event I don't see any reason
for R CMD check doing both.  Perhaps someone can fill in whatever I've
overlooked?

Carl


On Sat, May 31, 2014 at 8:17 PM, Yihui Xie  wrote:


1. The starting point of this discussion is package vignettes, instead
of R scripts. I'm not saying we should abandon R scripts, or all
people should write R code to generate reports. Starting from a
package vignette, you can evaluate it using a weave function, or
evaluate its derivative, namely an R script. I was saying the former
might not be a bad idea, although the latter sounds more familiar to
most R users. For a package vignette, within the context of R CMD
check, is it necessary to do tangle + evaluate _besides_ weave?

2. If you are comfortable with reading pure code without narratives,
I'm totally fine with that. I guess there is nothing to argue on this
point, since it is pretty much personal taste.

3. Yes, you are absolutely correct -- Sweave()/knit() does more than
source(), but let me repeat the issue to be discussed: what harm does
it bring if we disable tangle for R package vignettes?

Sorry if I did not make it clear enough, my priority of this
discussion is the necessity of tangle for package vignettes. After we
finish this issue, I'll be happy to extend the discussion towards
tangle in general.

Regards,
Yihui
--
Yihui Xie 
Web: http://yihui.name


On Sat, May 31, 2014 at 9:20 PM, Gabriel Becker 
wrote:




On Sat, May 31, 2014 at 6:54 PM, Yihui Xie  wrote:


I agree that fully evaluating the code is valuable, but
it is not a problem since the weave functions do fully evaluate the
code. If there is a reason for why source() an R script is preferred,

I guess it is users' familiarity with .R instead of .Rnw/.Rmd/...,



It's because .Rnw and Rmd require more from the user than .R. Also,

this

started with vignettes but you seem