Re: [Rd] Where does L come from?

2018-08-25 Thread Carl Boettiger
I always thought it meant "Long": I'm assuming R's integers are long
integers in the C sense (IIRC one can declare 'long x'), and it is common to
refer to such integers as "longs" in the same way we use "doubles" to mean
double-precision floating point numbers.  But this is pure speculation on my
part, so I'm curious!
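
A quick illustration of what the suffix does (orthogonal to the etymology
question):

  typeof(10)          # "double" -- numeric literals default to double
  typeof(10L)         # "integer"
  identical(10, 10L)  # FALSE: equal values, but different storage modes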

On Sat, Aug 25, 2018 at 6:50 AM Henrik Bengtsson wrote:

> Not that it brings closure, but there's also
> https://stat.ethz.ch/pipermail/r-devel/2017-June/074462.html
>
> Henrik
>
> On Sat, Aug 25, 2018, 06:40 Marc Schwartz via R-devel
> <r-devel@r-project.org> wrote:
>
> > On Aug 25, 2018, at 9:26 AM, Hadley Wickham  wrote:
> > >
> > > Hi all,
> > >
> > > Would someone mind pointing to me to the inspiration for the use of
> > > the L suffix to mean "integer"?  This is obviously hard to google for,
> > > and the R language definition
> > > (
> https://cran.r-project.org/doc/manuals/r-release/R-lang.html#Constants)
> > > is silent.
> > >
> > > Hadley
> >
> >
> > The link you have above does reference the use of 'L', but not its
> > derivation.
> >
> > There is a thread on R-Help from 2012 ("Difference between 10 and 10L"),
> > where Prof. Ripley addresses the issue in response to Bill Dunlap and the
> > OP:
> >
> >   https://stat.ethz.ch/pipermail/r-help/2012-May/311771.html
> >
> > In searching, I also found the following thread on SO:
> >
> >
> >
> https://stackoverflow.com/questions/22191324/clarification-of-l-in-r/22192378
> >
> > which had a link to the R-Help thread above and others.
> >
> > Regards,
> >
> > Marc Schwartz
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
>
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
-- 

http://carlboettiger.info


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Bias in R's random integers?

2018-09-19 Thread Carl Boettiger
Dear list,

It looks to me like R samples random integers using an intuitive but biased
algorithm: it maps a random floating-point number on [0,1) from the PRNG
onto a random integer, e.g.
https://github.com/wch/r-source/blob/tags/R-3-5-1/src/main/RNG.c#L808

Many other languages use rejection-sampling approaches, which provide an
unbiased method for sampling; these are used in Go, Python, and others, and
are described here: https://arxiv.org/abs/1805.10941 (I believe the biased
algorithm currently used in R is also described there).  I'm not an expert
in this area, but does it make sense for R to adopt one of the unbiased
random sampling algorithms outlined there and used in other languages?
Would a patch providing such an algorithm be welcome?  What concerns would
need to be addressed first?

I believe this issue was also raised by Kellie & Philip in
http://r.789695.n4.nabble.com/Bug-in-sample-td4729483.html, and more
recently in
https://www.stat.berkeley.edu/~stark/Preprints/r-random-issues.pdf,
pointing to the python implementation for comparison:
https://github.com/statlab/cryptorandom/blob/master/cryptorandom/cryptorandom.py#L265
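
For concreteness, here is a minimal sketch of the rejection idea (not the
exact algorithm of any implementation above; rand32() stands in for a
hypothetical source of uniform 32-bit integers):

  sample_unbiased <- function(n) {
    # largest multiple of n reachable within the 2^32 draws; draws at or
    # above this limit would over-represent the smallest residues, so redraw
    limit <- 2^32 - (2^32 %% n)
    repeat {
      r <- rand32()              # uniform on 0 .. 2^32 - 1 (hypothetical)
      if (r < limit) return(r %% n + 1)
    }
  }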

Thanks!

Carl
-- 

http://carlboettiger.info


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Bias in R's random integers?

2018-09-19 Thread Carl Boettiger
> >> Call the values from unif_rand() "the unif_rand() outcomes".  Call the
> >> values from sample() the sample outcomes.
> >>
> >> It would be easiest to see the error if half of the sample()
> outcomes
> >> used two unif_rand() outcomes, and half used just one.  That would
> >> mean
> >> m should be (2/3) * 2^32, but that's too big and would trigger the
> >> other
> >> version.
> >>
> >> So how about half use 2 unif_rands(), and half use 3?  That means m
> =
> >> (2/5) * 2^32 = 1717986918.  A good guess is that sample() outcomes
> >> would
> >> alternate between the two possibilities, so our event could be even
> >> versus odd outcomes.
> >>
> >> Let's try it:
> >>
> >>   > m <- (2/5)*2^32
> >>   > m > 2^31
> >> [1] FALSE
> >>   > x <- sample(m, 100, replace = TRUE)
> >>   > table(x %% 2)
> >>
> >>0  1
> >> 399850 600150
> >>
> >> Since m is an even number, the true proportions of evens and odds
> >> should
> >> be exactly 0.5.  That's some pretty strong evidence of the bug in
> the
> >> generator.  (Note that the ratio of the observed probabilities is
> >> about
> >> 1.5, so I may not be the first person to have done this.)
> >>
> >> I'm still not convinced that there has ever been a simulation run
> >> with
> >> detectable bias compared to Monte Carlo error unless it (like this
> >> one)
> >> was designed specifically to show the problem.
> >>
> >> Duncan Murdoch
> >>
> >>  >
> >>  > (RNG.c, lines 793ff)
> >>  >
> >>  > double R_unif_index(double dn)
> >>  > {
> >>  >  double cut = INT_MAX;
> >>  >
> >>  >  switch(RNG_kind) {
> >>  >  case KNUTH_TAOCP:
> >>  >  case USER_UNIF:
> >>  >  case KNUTH_TAOCP2:
> >>  > cut = 33554431.0; /* 2^25 - 1 */
> >>  > break;
> >>  >  default:
> >>  > break;
> >>  > }
> >>  >
> >>  >  double u = dn > cut ? ru() : unif_rand();
> >>  >  return floor(dn * u);
> >>  > }
> >>  >
> >>  > On Wed, Sep 19, 2018 at 9:20 AM Duncan Murdoch
> >>  > <murdoch.dun...@gmail.com> wrote:
> >>  >
> >>  > On 19/09/2018 12:09 PM, Philip B. Stark wrote:
> >>  >  > The 53 bits only encode at most 2^{32} possible values,
> >> because the
> >>  >  > source of the float is the output of a 32-bit PRNG (the
> >> obsolete
> >>  > version
> >>  >  > of MT). 53 bits isn't the relevant number here.
> >>  >
> >>  > No, two calls to unif_rand() are used.  There are two 32 bit
> >> values,
> >>  > but
> >>  > some of the bits are thrown away.
> >>  >
> >>  > Duncan Murdoch
> >>  >
> >>  >  >
> >>  >  > The selection ratios can get close to 2. Computer
> >> scientists
> >>  > don't do it
> >>  >  > the way R does, for a reason.
> >>  >  >
> >>  >  > Regards,
> >>  >  > Philip
> >>  >  >
> >>  >  > On Wed, Sep 19, 2018 at 9:05 AM Duncan Murdoch
> >>  >  > <murdoch.dun...@gmail.com> wrote:
> >>  >  >
> >>  >  > On 19/09/2018 9:09 AM, Iñaki Ucar wrote:
> >>  >  >  > On Wed, Sep 19, 2018 at 14:43, Duncan Murdoch
> >>  >  >  > (<murdoch.dun...@gmail.com>) wrote:
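
One note on the numbers in Duncan's demonstration above: the ~1.5 ratio can
be verified analytically.  With m = (2/5) * 2^32, the outcome i = floor(m*u)
receives the 32-bit grid values k in [2.5*i, 2.5*(i+1)), so alternating
outcomes receive 3 and 2 of the 2^32 equally likely unif_rand() values -- a
quick sketch:

  per_outcome <- function(i) ceiling((i + 1) * 5 / 2) - ceiling(i * 5 / 2)
  per_outcome(0:9)
  # [1] 3 2 3 2 3 2 3 2 3 2
  # sampled values are i + 1, so odd values occur with probability 3/5 and
  # even values with 2/5, matching the observed 600150/399850 ratio of ~1.5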

[Rd] Correct use of tools::R_user_dir() in packages?

2023-06-27 Thread Carl Boettiger
tools::R_user_dir() provides configurable directories where R packages can
write persistent information, consistent with each supported operating
system's standard best practices for storing application data, config, and
cache information.  Those best practices include writing to directories in
the user's home filespace, which is specifically against CRAN policy.

These defaults can be overridden by setting the environment variables
R_USER_DATA_DIR, R_USER_CONFIG_DIR, and R_USER_CACHE_DIR, respectively.

If R developers should be using the locations provided by
tools::R_user_dir() in packages, why does CRAN's check procedure not set
these three environment variables to a CRAN-compliant location by default
(e.g. tempdir())?

To comply with CRAN policy, a package developer can obviously set these
environment variables themselves within every example, every unit test, and
every vignette.  Is this the recommended approach, or is there a better
technique?
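
(For concreteness, the kind of guarded helper I have in mind -- "mypkg" is a
placeholder:)

  cache_dir <- function(package = "mypkg") {
    # R_user_dir() exists only from R 4.0 on; it already honors the
    # R_USER_CACHE_DIR override, so checks can redirect it to tempdir()
    if (getRversion() >= "4.0.0")
      tools::R_user_dir(package, which = "cache")
    else
      file.path(tempdir(), package)
  }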

Thanks for any clarification!

Regards,

Carl

---
Carl Boettiger
http://carlboettiger.info/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Correct use of tools::R_user_dir() in packages?

2023-06-28 Thread Carl Boettiger
Thanks Simon, I was very much hoping that would be the case!  It may be
that I just need to set the version requirement to R >= 4.0 then.  I will
be sure to add this version restriction to my packages (which technically I
should be doing anyway, since this function didn't exist in earlier
versions of `tools`).
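
For anyone finding this later, that declaration is the usual line in the
package DESCRIPTION:

  Depends: R (>= 4.0.0)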

Cheers,

Carl

---
Carl Boettiger
http://carlboettiger.info/

On Wed, Jun 28, 2023 at 12:59 PM Simon Urbanek wrote:
>
> Carl,
>
> I think your statement is false: the whole point of R_user_dir() is for
> packages to have a well-defined location that is allowed -- from the CRAN
> policy:
>
> "For R version 4.0 or later (hence a version dependency is required or only 
> conditional use is possible), packages may store user-specific data, 
> configuration and cache files in their respective user directories obtained 
> from tools::R_user_dir(), provided that by default sizes are kept as small as 
> possible and the contents are actively managed (including removing outdated 
> material)."
>
> Cheers,
> Simon
>
>
> > On 28/06/2023, at 10:36 AM, Carl Boettiger  wrote:
> >
> > tools::R_user_dir() provides configurable directories where R packages
> > can write persistent information, consistent with each supported
> > operating system's standard best practices for storing application
> > data, config, and cache information.  Those best practices include
> > writing to directories in the user's home filespace, which is
> > specifically against CRAN policy.
> >
> > These defaults can be overridden by setting the environment variables
> > R_USER_DATA_DIR, R_USER_CONFIG_DIR, and R_USER_CACHE_DIR, respectively.
> >
> > If R developers should be using the locations provided by
> > tools::R_user_dir() in packages, why does CRAN's check procedure not
> > set these three environment variables to a CRAN-compliant location by
> > default (e.g. tempdir())?
> >
> > To comply with CRAN policy, a package developer can obviously set these
> > environment variables themselves within every example, every unit test,
> > and every vignette.  Is this the recommended approach, or is there a
> > better technique?
> >
> > Thanks for any clarification!
> >
> > Regards,
> >
> > Carl
> >
> > ---
> > Carl Boettiger
> > http://carlboettiger.info/
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Correct use of tools::R_user_dir() in packages?

2023-06-29 Thread Carl Boettiger
Thanks Iñaki, that has been my experience as well.  But that is also not
how I read the current policy that Simon quoted.  First, my understanding
of the current policy is that, in general, 'cleaning up' is not consistent
with the policy of not writing to home -- hence my suggestion of overriding
the default locations by setting the env vars R_USER_DATA_DIR etc. to
tempdir() in testing mode.  Of course this could be done globally on the
CRAN testing machines, but I gather that is not the case.  Second, having
re-read the policy, it does seem to say quite clearly that using
R_user_dir() is an exception to the 'thou shalt never write to $HOME' rule
(provided you require R >= 4.0).  From my experience with both my own
packages and the CRAN packages I see using tools::R_user_dir(), I would
venture to say that this policy, and how it relates to the $HOME rule for
tests/examples/vignettes, is not as clear as it might be.  Unfortunately
this rule does not seem to be covered by explicit code in the `R CMD check`
routine, so we cannot consult the source code for a more definitive answer
about questions like whether clean-up or use of temp files is or is not
required by policy -- though maybe this is mostly my own confusion.  The
examples in this thread have definitely helped me understand how others
handle persistent data/config/cache mechanisms.
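
Concretely, the override I have in mind for tests/examples/vignettes would
be something like the following sketch (e.g. in a testthat setup file):

  # redirect all three R_user_dir() locations into the session tempdir
  Sys.setenv(
    R_USER_DATA_DIR   = file.path(tempdir(), "data"),
    R_USER_CONFIG_DIR = file.path(tempdir(), "config"),
    R_USER_CACHE_DIR  = file.path(tempdir(), "cache")
  )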

Regards,

Carl


---
Carl Boettiger
http://carlboettiger.info/

On Thu, Jun 29, 2023 at 1:17 AM Iñaki Ucar  wrote:
>
> On Thu, 29 Jun 2023 at 01:34, Carl Boettiger  wrote:
> >
> > Thanks Simon, I was very much hoping that would be the case!  It may be
> > that I just need to set the version requirement to R >= 4.0 then.  I
> > will be sure to add this version restriction to my packages (which
> > technically I should be doing anyway, since this function didn't exist
> > in earlier versions of `tools`).
>
> In my experience, you *can* store stuff in those directories, but you
> are required to clean up after yourself in CRAN checks. In other
> words, if something is left behind when the check ends, CRAN won't be
> happy.
>
> Iñaki
>
> >
> > Cheers,
> >
> > Carl
> >
> > ---
> > Carl Boettiger
> > http://carlboettiger.info/
> >
> > On Wed, Jun 28, 2023 at 12:59 PM Simon Urbanek wrote:
> > >
> > > Carl,
> > >
> > > I think your statement is false: the whole point of R_user_dir() is
> > > for packages to have a well-defined location that is allowed -- from
> > > the CRAN policy:
> > >
> > > "For R version 4.0 or later (hence a version dependency is required or 
> > > only conditional use is possible), packages may store user-specific data, 
> > > configuration and cache files in their respective user directories 
> > > obtained from tools::R_user_dir(), provided that by default sizes are 
> > > kept as small as possible and the contents are actively managed 
> > > (including removing outdated material)."
> > >
> > > Cheers,
> > > Simon
> > >
> > >
> > > > On 28/06/2023, at 10:36 AM, Carl Boettiger  wrote:
> > > >
> > > > tools::R_user_dir() provides configurable directories where R
> > > > packages can write persistent information, consistent with each
> > > > supported operating system's standard best practices for storing
> > > > application data, config, and cache information.  Those best
> > > > practices include writing to directories in the user's home
> > > > filespace, which is specifically against CRAN policy.
> > > >
> > > > These defaults can be overridden by setting the environment
> > > > variables R_USER_DATA_DIR, R_USER_CONFIG_DIR, and R_USER_CACHE_DIR,
> > > > respectively.
> > > >
> > > > If R developers should be using the locations provided by
> > > > tools::R_user_dir() in packages, why does CRAN's check procedure not
> > > > set these three environment variables to a CRAN-compliant location
> > > > by default (e.g. tempdir())?
> > > >
> > > > To comply with CRAN policy, a package developer can obviously set
> > > > these environment variables themselves within every example, every
> > > > unit test, and every vignette.  Is this the recommended approach, or
> > > > is there a better technique?
> > > >
> > > > Thanks for any clarification!
> > > >
> > > > Regards,
> > > >
> > > > Carl
> > > >
> > > > ---
> > > > Carl Boettiger
> > > > http://carlboettiger.info/
> > > >
> > > > __
> > > > R-devel@r-project.org mailing list
> > > > https://stat.ethz.ch/mailman/listinfo/r-devel
> > > >
> > >
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
>
>
> --
> Iñaki Úcar

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] [RFC] A case for freezing CRAN

2014-03-19 Thread Carl Boettiger
for each
>> script you want to reproduce.  No ambiguity about which package versions
>> are used by R 3.0.  However, for better or worse, I think this could only
>> be accomplished with a CRAN release cycle (i.e. "universal snapshots")
>> accompanying the already existing R releases.
>>
>>
>>
>>  The only objection I can see to this is that it requires extra work by
>>> the
>>> third party, rather than extra work by the CRAN team. I don't think the
>>> total amount of work required is much different.  I'm very unsympathetic
>>> to
>>> proposals to dump work on others.
>>>
>>
>> I am merely trying to discuss a technical issue in an attempt to improve
>> reliability of our software and reproducibility of papers created with R.
>>
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>



-- 
Carl Boettiger
UC Santa Cruz
http://carlboettiger.info/


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] The case for freezing CRAN

2014-03-20 Thread Carl Boettiger
this and 10,000 was a low estimate of a lower bound
> of one set of simulations) at which point they would admit that I had
> a case and then send me to talk to someone else who would start the
> process over.
>
>
>
> [snip]
> > --
> > Dirk Eddelbuettel | e...@debian.org | http://dirk.eddelbuettel.com
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
>
>
> --
> Gregory (Greg) L. Snow Ph.D.
> 538...@gmail.com
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>



-- 
Carl Boettiger
UC Santa Cruz
http://carlboettiger.info/


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R CMD check for the R code from vignettes

2014-05-30 Thread Carl Boettiger
 weaving process. If this is done, I'm back to my
>> previous question: does it make sense to run the code twice?
>>
>> To push this a little further, personally I do not quite appreciate
>> literate programming in R as two separate steps, namely weave and
>> tangle. In particular, I do not see the value of tangle, considering
>> Sweave() (or knitr::knit()) as the new "source()". Therefore
>> eventually I tend to just drop tangle, but perhaps I missed something
>> here, and I'd like to hear what other people think about it.
>>
>> Regards,
>> Yihui
>> --
>> Yihui Xie 
>> Web: http://yihui.name
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>



-- 
Carl Boettiger
UC Santa Cruz
http://carlboettiger.info/


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R CMD check for the R code from vignettes

2014-06-01 Thread Carl Boettiger
Yihui, list,

Focusing on the behavior of R CMD check: the only reason I have seen put
forward in this discussion for having check tangle and then source, as well
as knit/weave, the very same vignette is to assist the package maintainer
in debugging R errors vs pdflatex errors.  As tangle (and many other tools)
are already available to an author needing extra help debugging, and as the
error messages are usually clear about whether errors come from the R code
or from compiling the output format (pdflatex, markdown/HTML, etc.), this
seems like a poor reason for R CMD check to spend time doing two versions
of almost (but not literally) the same check.

As has already been discussed, it is possible to write vignettes that can
be Sweave'd but not source'd, due to the different treatments of inline
chunks (see the sketch below).  While I see the advantages of this
property, I don't see why R CMD check should be enforcing it through the
arbitrary mechanism of running both Sweave and tangle+source.  If that is
the desired behavior for all Sweave documents, it should either be part of
the Sweave specification that inline expressions cannot write or change
values, or part of the tangle definition to include inline chunks.  In any
event I don't see any reason for R CMD check to do both.  Perhaps someone
can fill in whatever I've overlooked?
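
To make the inline-chunk point concrete, here is a minimal (hypothetical)
Sweave fragment that weaves cleanly but fails when tangled and sourced,
because \Sexpr{} content is not tangled:

  <<setup>>=
  x <- 1
  @
  The incremented value is \Sexpr{(x <- x + 1)}.
  <<later>>=
  stopifnot(x == 2)  # passes under Sweave(); fails in the tangled script,
                     # where the \Sexpr side effect never happens
  @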

Carl


On Sat, May 31, 2014 at 8:17 PM, Yihui Xie  wrote:

> 1. The starting point of this discussion is package vignettes, instead
> of R scripts. I'm not saying we should abandon R scripts, or all
> people should write R code to generate reports. Starting from a
> package vignette, you can evaluate it using a weave function, or
> evaluate its derivative, namely an R script. I was saying the former
> might not be a bad idea, although the latter sounds more familiar to
> most R users. For a package vignette, within the context of R CMD
> check, is it necessary to do tangle + evaluate _besides_ weave?
>
> 2. If you are comfortable with reading pure code without narratives,
> I'm totally fine with that. I guess there is nothing to argue on this
> point, since it is pretty much personal taste.
>
> 3. Yes, you are absolutely correct -- Sweave()/knit() does more than
> source(), but let me repeat the issue to be discussed: what harm does
> it bring if we disable tangle for R package vignettes?
>
> Sorry if I did not make it clear enough, my priority of this
> discussion is the necessity of tangle for package vignettes. After we
> finish this issue, I'll be happy to extend the discussion towards
> tangle in general.
>
> Regards,
> Yihui
> --
> Yihui Xie 
> Web: http://yihui.name
>
>
> On Sat, May 31, 2014 at 9:20 PM, Gabriel Becker wrote:
> >
> >
> >
> > On Sat, May 31, 2014 at 6:54 PM, Yihui Xie  wrote:
> >
> >> I agree that fully evaluating the code is valuable, but
> >> it is not a problem since the weave functions do fully evaluate the
> >> code. If there is a reason for why source() an R script is preferred,
> >>
> >> I guess it is users' familiarity with .R instead of .Rnw/.Rmd/...,
> >
> >
> > It's because .Rnw and Rmd require more from the user than .R. Also, this
> > started with vignettes but you seem to be talking more generally. If so,
> I
> > would point out that not all R code is intended to generate reports, and
> > writing pure R code that isn't going to generate a report in an .Rnw/.Rmd
> > file would be very strange to say the least.
> >
> >
> >>
> >> however, I guess it would be painful to read the pure R script tangled
> >> from the source document without the original narratives.
> >
> >
> > That depends a lot on what you want. Reading an woven article/report that
> > includes code and reading code are different and equally valid
> activities.
> > Sometimes I really just want to know what the author actually told the
> > computer to do.
> >
> >>
> >>
> >> So what do we really lose if we turn off tangle? We lose an R script
> >> as a derivative from the source document, but we do not lose the code
> >> evaluation.
> >
> >
> > We lose *isolated* code evaluation. Sweave/knit have a lot more moving
> > pieces than source/eval do, many of which are for the purpose of
> > displaying output rather than running code.
> >
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>



-- 
Carl Boettiger
UC Santa Cruz
http://carlboettiger.info/


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R CMD check for the R code from vignettes

2014-06-01 Thread Carl Boettiger
Thanks both for the replies.

Duncan, I'm sorry if I wasn't clear.  I am indeed writing a vignette using
Sweave (knitr actually), and I want it to be a vignette. I'm well aware
that I can dodge these tests as you suggest, or through other ways, but I'm
not trying to dodge them.  R CMD check is running both knit and
tangle+source on it, and I do not understand why the latter is necessary
when the code is already run by the former.  Is there a good reason for
checking an R vignette in this seemingly redundant fashion?

Gabe, I see your point, but surely you can agree that is a rather obtuse
way to enforce that behavior.  I don't recall seeing anything in the
Writing R Extensions manual documenting that Sweave files must meet this
constraint in order to be considered valid vignettes.  I also believe there
are valid use cases for side-effects in inline chunks (my example being
dynamic references).  While it is easy to hack a vignette to meet this
constraint (e.g. replicating inline calls with a non-displayed chunk), that
seems poor form.

I think Yihui has made a good case that there is no reason for R CMD check
to be running weave/knit and source, and I haven't seen any replies trying
to explain to the contrary why this is a reasonable thing for the automated
check to be doing.

Cheers,

Carl


On Sun, Jun 1, 2014 at 9:16 PM, Gabriel Becker  wrote:

> Carl,
>
> I don't really have a horse in this race other than a strong feeling that
> whatever check does should be mandatory.
>
> That having been said, I think it can be argued that the fact that check
> does this means that it IS in the R package vignette specification that all
> vignettes must be such that their tangled code will run without errors.
>
> ~G
>
>
> On Sun, Jun 1, 2014 at 8:43 PM, Carl Boettiger  wrote:
>
>> Yihui, list,
>>
>> Focusing on the behavior of R CMD check: the only reason I have seen put
>> forward in this discussion for having check tangle and then source, as
>> well as knit/weave, the very same vignette is to assist the package
>> maintainer in debugging R errors vs pdflatex errors.  As tangle (and many
>> other tools) are already available to an author needing extra help
>> debugging, and as the error messages are usually clear about whether
>> errors come from the R code or from compiling the output format
>> (pdflatex, markdown/HTML, etc.), this seems like a poor reason for R CMD
>> check to spend time doing two versions of almost (but not literally) the
>> same check.
>>
>> As has already been discussed, it is possible to write vignettes that can
>> be Sweave'd but not source'd, due to the different treatments of inline
>> chunks.  While I see the advantages of this property, I don't see why R
>> CMD check should be enforcing it through the arbitrary mechanism of
>> running both Sweave and tangle+source.  If that is the desired behavior
>> for all Sweave documents, it should either be part of the Sweave
>> specification that inline expressions cannot write or change values, or
>> part of the tangle definition to include inline chunks.  In any event I
>> don't see any reason for R CMD check to do both.  Perhaps someone can
>> fill in whatever I've overlooked?
>>
>> Carl
>>
>>
>> On Sat, May 31, 2014 at 8:17 PM, Yihui Xie  wrote:
>>
>>> 1. The starting point of this discussion is package vignettes, instead
>>> of R scripts. I'm not saying we should abandon R scripts, or all
>>> people should write R code to generate reports. Starting from a
>>> package vignette, you can evaluate it using a weave function, or
>>> evaluate its derivative, namely an R script. I was saying the former
>>> might not be a bad idea, although the latter sounds more familiar to
>>> most R users. For a package vignette, within the context of R CMD
>>> check, is it necessary to do tangle + evaluate _besides_ weave?
>>>
>>> 2. If you are comfortable with reading pure code without narratives,
>>> I'm totally fine with that. I guess there is nothing to argue on this
>>> point, since it is pretty much personal taste.
>>>
>>> 3. Yes, you are absolutely correct -- Sweave()/knit() does more than
>>> source(), but let me repeat the issue to be discussed: what harm does
>>> it bring if we disable tangle for R package vignettes?
>>>
>>> Sorry if I did not make it clear enough, my priority of this
>>> discussion is the necessity of tangle for package vignettes. After we
>>> finish this issue, I'll be happy to extend the discussion towards
>>> tangle in general.
>>>
>>> Regards,
>>> Yihui

Re: [Rd] Preferred way to include internal data in package?

2014-08-04 Thread Carl Boettiger
I would think that putting them in an appropriate subdirectory of inst/
would be preferable, e.g. inst/examples, and then reading the data in with
system.file() where necessary.

(I have frequently been instructed that data ought not to be mixed with
code, as mixing them makes both the code and the data harder to read,
maintain, or reuse.  On the other hand, if there's a good reason to avoid
system.file() for this, I'd be happy to be enlightened so that I too could
improve my practices.)
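
For instance (file and package names here are hypothetical):

  # a file shipped at inst/examples/params.csv in the package source tree
  # is installed to examples/params.csv and located via system.file()
  path <- system.file("examples", "params.csv", package = "mypkg")
  params <- read.csv(path)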

Cheers,

Carl

On Mon, Aug 4, 2014 at 9:23 AM, Gábor Csárdi  wrote:
> If you want to keep it as text, then you can just put it in the code:
>
> mydata <-
> '"","mpg","cyl","disp","hp","drat","wt","qsec","vs","am","gear","carb"
> "Mazda RX4",21,6,160,110,3.9,2.62,16.46,0,1,4,4
> "Mazda RX4 Wag",21,6,160,110,3.9,2.875,17.02,0,1,4,4
> "Datsun 710",22.8,4,108,93,3.85,2.32,18.61,1,1,4,1
> '
>
> and then you can read from it via text connections:
>
> read.csv(textConnection(mydata))
>
> Gabor
>
> On Mon, Aug 4, 2014 at 12:11 PM, Keirstead, James E wrote:
>> I saw that, but actually I was wondering if there was a more general method. 
>>  I’d like to use plain text files if I can, instead of Rda files, since 
>> they’re easier to maintain (and it’s a small file).
>>
>> On 4 Aug 2014, at 16:30, Jeroen Ooms  wrote:
>>
>>>> I’m developing a package and would like to include some data sets for 
>>>> internal use only, e.g. configuration parameters for functions.  What is 
>>>> the preferred way of doing this?  If I put them in data/, then R CMD check 
>>>> asks me to document them but I’d prefer it if they floated beneath the 
>>>> surface, without the users awareness.
>>>
>>> Perhaps in sysdata.rda. See "Writing R Extensions".
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel



-- 
Carl Boettiger
UC Santa Cruz
http://carlboettiger.info/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Is using devtools::release no longer allowed?

2014-08-19 Thread Carl Boettiger
Dirk, listeners,

Perhaps you would consider using Omegahat's RHTMLForms instead?

library("RHTMLForms")
forms <- getHTMLFormDescription("http://xmpalantir.wu.ac.at/cransubmit/")
submit_to_cran <- createFunction(forms[[1]])

That should create a function called "submit_to_cran" with arguments
corresponding to the web form's fields, e.g.

submit_to_cran(name = "packagename", email = "youremail", uploaded_file =
"package.tar.gz", comment = "the optional comment")

(clearly you could fill those details in from the submitting package
description).  I haven't tested this.


Cheers,

Carl




On Tue, Aug 19, 2014 at 11:14 AM, Dirk Eddelbuettel  wrote:

>
> On 19 August 2014 at 19:51, Uwe Ligges wrote:
> | So to all listeners: Future submission by means of the webform, please.
>
> I hereby offer one beer to the first person who shows me how to use
> RSelenium
> to drive the webform from a script.
>
> Dirk
> (who used the webform in textmode earlier today over a ssh connection)
>
> --
> http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org
>
> ______
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>



-- 
Carl Boettiger
UC Santa Cruz
http://carlboettiger.info/


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] ggplot2/plyr interaction with latest R-devel?

2014-09-01 Thread Carl Boettiger
Hi Ben,

Just tested this on a fresh Ubuntu 14.04 sandbox by using Dirk's Docker
image for R-devel
(https://github.com/eddelbuettel/docker-ubuntu-r/tree/master/add-r-devel-san).

> install.packages(c("dplyr", "ggplot2"))
> library("dplyr")
> library("ggplot2")

runs fine for me (though it takes a few minutes to compile everything).  So
it seems something must have changed in your local environment, but I'm not
sure what.




On Mon, Sep 1, 2014 at 11:42 AM, Ben Bolker  wrote:

>
>   I apologize in advance for not having done more homework in advance,
> but thought I would send this along to see if anyone else was seeing this.
>
>   I am having some sort of ggplot2/plyr/very-recent-R-devel dependency
> issues.
>
>   Just installed
>
> R Under development (unstable) (2014-09-01 r66509) -- "Unsuffered
> Consequences"
>
>  from source.
>
> > packageVersion("ggplot2")
> [1] ‘1.0.0’
> > packageVersion("plyr")
> [1] ‘1.8.1’
>
> library(plyr) works
>
>   but then I get:
>
> library(ggplot2)
> Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck =
> vI[[i]]) :
>   namespace ‘plyr’ 1.8.1 is being loaded, but >= 1.7.1 is required
> Error: package or namespace load failed for ‘ggplot2’
>
>   I don't remember when I most recently updated my r-devel (it was
> probably a few months old); nothing in recent commits rings a bell.
>
>   Works fine on R version 3.1.1 (except for a "ggplot2 built under
> 3.2.0 warning").
>
>   Does anyone else see this or is it just something weird about my setup?
>
>   Ben Bolker
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>



-- 
Carl Boettiger
UC Santa Cruz
http://carlboettiger.info/


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] binary R packages for GNU/Linux

2025-02-10 Thread Carl Boettiger
Great discussion.

Just to note another example I don't think was mentioned: the r-universe
project also builds binaries for Linux (latest Ubuntu),
https://docs.r-universe.dev/install/binaries.html, as well as for other
targets, including wasm.  It also provides binaries for Bioconductor
packages and for packages on any git-based version control platform (e.g.
GitHub).
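
Installation goes through the normal repos mechanism; for example (account
and package names here are placeholders -- see the binaries documentation
above for platform specifics):

  # install from an r-universe repository, with CRAN as the fallback for
  # any remaining dependencies
  install.packages("somepkg",
                   repos = c("https://someuser.r-universe.dev",
                             "https://cloud.r-project.org"))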

R Universe is open source and a top-level project of the R Consortium.

Cheers,

Carl

---
Carl Boettiger
http://carlboettiger.info/


On Mon, Feb 10, 2025 at 5:30 AM Iñaki Ucar  wrote:

> On Mon, 10 Feb 2025 at 14:09, Dirk Eddelbuettel  wrote:
> >
> >
> > On 10 February 2025 at 11:00, Tobias Verbeke wrote:
> > | Another argument to demonstrate the feasibility is the r2u project
> > | (https://github.com/eddelbuettel/r2u). It offers CRAN as Ubuntu
> > | binaries, but in order to build these Ubuntu binaries it actually
> > | makes use of the binary R packages built by PPM. Quoting from
> > | https://eddelbuettel.github.io/r2u/: "For the CRAN binaries we either
> > | repackage P3M/RSPM/PPM builds (where available) or build natively."
> > | They cover all CRAN packages. The usage of PPM as a source is, of
> > | course, a weakness (in the grand scheme of things), but the point here
> > | is about the feasibility of building the packages in a portable way
> > | per version of a particular distribution, architecture, etc.
> >
> > As you brought this up, allow me to clarify: the re-use (where possible)
> > is simply a shortcut "where possible".  Each day when I cover updated
> > packages, I hit maybe 5 per cent of packages where, for reasons I still
> > cannot decipher, p3m.dev does not have a binary, so I build those 5 per
> > cent from source.  Similarly, for the approx. 450 BioConductor packages,
> > all builds are from source.
> >
> > Rebuilding everything from source "just because we want to" is entirely
> > possible, but as it is my time waiting for binaries, I currently do not
> > force full rebuilds -- though I easily could.  Also note that about 22%
> > of packages contain native code, leaving 78% which do not.  Re-use is
> > even simpler for those 78%, as they contain only (portable) R code.  So
> > if we wanted to compile all native packages for Ubuntu, we could.  It is
> > a resourcing issue that has not yet been a priority for me.  Iñaki does
> > it for Fedora, Detlef does it for OpenSUSE.
>
> And for completeness, [1] is where we painstakingly* maintain a list
> of system dependencies, [2] is where the daily magic happens for
> keeping track of CRAN, and [3] performs the heavy-lifting and
> publishes an RPM repository with the result.
>
> [1] https://github.com/cran4linux/sysreqs
> [2] https://github.com/cran4linux/cran2copr
> [3] https://copr.fedorainfracloud.org/coprs/iucar/cran
>
> *Because, you know, SystemRequirements.
>
> > The more important point of these packages is the full system
> > integration.  You do get _all_ binary dependencies declared, exactly as
> > a distribution-native package (of which Debian/Ubuntu have a bit over
> > 1k) would.  Guaranteed.  Reliably.  Fast.  That is a big step up for
> > large deployments, for testing, and for less experienced users.
> >
> > So thanks for starting a discussion around this, as 'we' as a community
> > are falling a bit short here.
>
> Indeed, thank you, Tobias.
>
> > One open question is whether we could pull something off that works
> > like Python wheels and offers cross-distro builds, ideally without
> > static linking.  Your "CRAN libraries" added to the ld.so path may do
> > this.  I do not know how feasible / involved this would be, so thus far
> > I have concentrated on doing something simpler -- but feasible and
> > reliable, by working exactly as the distribution packages work.
>
> It would be perfectly feasible to maintain sync'ed builds (in terms of
> version) of system dependencies at CRAN-provided (RPM, APT...)
> repositories as compat packages for various distributions, then all
> packages could be built once and shipped everywhere (i.e. cross-distro
> builds). Collaterally, this would increase reproducibility of package
> checks to a certain extent.
>
> I offered my help in these matters in the past, but was kindly
> declined. That hand remains extended.
>
> Best,
> Iñaki
>
> >
> > All that said, thanks for the starting this discussion!
> >
> > Cheers, Dirk
> >
> > --
> > dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
>
>
> --
> Iñaki Úcar
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel