Duncan and others: I was not being careful with my description. This concerned
tests of version 3.2-8, not yet on CRAN, in which I was trying some
size-limiting measures. My apologies for not making this clear.
- I feel mild pressure to make the survival package smaller, per CRAN
guidelines, and shrinking the data appears to be one way to approach that. So a
real point of the query is my attempts to do so. (I am much more resistant to
shrinking the extensive test suite or the vignettes.)
- The survival package has a lot of small data sets, and bundling them up into
a single .rda file does save space, but it causes some issues with data(). The
overall tarball goes from 7480 to 6100 blocks (as reported by ls -s).
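
For anyone following along, here is a minimal sketch of the behavior in
question. It is not the survival package itself: a scratch directory with a
data/ subdirectory stands in for the package, and toy_a, toy_b, and toybundle
are made-up names. It assumes, as the failures described below suggest, that
data() looks a data set up by the base name of the file that holds it.

```r
# Hypothetical toy data sets; the names are made up for this sketch.
toy_a <- data.frame(time = c(5, 8, 12), status = c(1, 0, 1))
toy_b <- data.frame(id = 1:4, age = c(61, 70, 58, 66))

# A scratch directory with a data/ subdirectory stands in for the package.
old_wd <- setwd(tempdir())
dir.create("data", showWarnings = FALSE)
save(toy_a, toy_b, file = file.path("data", "toybundle.rda"), compress = "xz")
rm(toy_a, toy_b)

# Load into a private environment so the workspace is left alone.
env <- new.env()

# Asking for a bundled object by its own name: data() signals a warning,
# which is captured here as a message string.
msg <- tryCatch({
    data(toy_a, envir = env)
    NULL
}, warning = function(w) conditionMessage(w))
print(msg)

# Asking for the file's base name loads every object stored in that file.
data(toybundle, envir = env)
print(ls(env))

setwd(old_wd)
```

In the package itself the analogous call would presumably be
data(cancer, package="survival"), as Duncan suggests below.
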
Terry
On 10/24/20 4:28 AM, Duncan Murdoch wrote:
> On 23/10/2020 9:25 p.m., Therneau, Terry M., Ph.D. via R-devel wrote:
>> I found an issue with the data() command this evening when working on the
>> survival package.
>>
>> 1. I have a lot of data sets in the package, almost all used in at least one
>> vignette, help file, or test. As a space saving measure, I have bundled many
>> of them together, i.e., the file data/cancer.rda contains 19 data sets, many
>> of them small. The resulting file (using xz compression) is quite a bit
>> smaller than the individual ones. (I still get a warning note about size from
>> R CMD check, but I'm no longer 2x the limit.)
>>
>> 2. Consider the lung data set. All of these fail:
>> data(lung)
>> data("lung")
>> data(lung, package="survival")
>>
>> a. The lung.Rd file had \usage{data(lung)}; that error was not caught by
>> R CMD check. (Several other .Rd files as well.)
>>
>> b. In broader examples for teaching, I sometimes load data from other
>> packages, e.g., data(aidssi, package="mstate"). But this does not work for
>> survival. (The larger survival data sets that are in separate .rda files can
>> be found.)
>>
>> c. What does work is survival::lung. Might it be useful to add a comment to
>> data.Rd to this effect?
>
> You don't describe how this dataset is being included in your package. Have
> you moved it from data/lung.rda to data/cancer.rda? Currently (in survival
> 3.2-7) each of these works for me:
>
> library(survival); data(lung)
>
> library(survival); data("lung")
>
> # Without library(survival):
> data(lung, package="survival")
>
> I think if the lung dataset is now being included in cancer.rda, you'd need
>
> data(cancer, package="survival")
>
> or equivalent to load it (and the rest of the datasets there).
>
>>
>>
>> 3. Creating a separate package 'survivaldata' is of course one route, and is
>> suggested in the "Writing R Extensions" guide. But this is not possible since
>> survival is a recommended package: it can't load any non-recommended package
>> for its tests or vignettes. Longer term, perhaps there is a way around this
>> constraint?
>
> Maybe the solution is to put your datasets into the "datasets" package, or
> make "survivaldata" a recommended package, or just leave things as they are
> and ignore the warnings about package size. I think that's a negotiation you
> should have with R Core.
>
> Duncan Murdoch