Re: [Rd] Depending/Importing data only packages

2013-12-07 Thread Paul Gilbert
Would "Suggests" not work in this situation? I don't understand why you 
would need Depends. In what sense do you rely on the data only package?


Paul

On 13-12-06 04:20 PM, Hadley Wickham wrote:
> Hi all,
>
> What should you do when you rely on a data only package. If you just
> "Depend" on it, you get the following from R CMD check:
>
> Package in Depends field not imported from: 'hflights'
>    These packages needs to imported from for the case when
>    this namespace is loaded but not attached.
>
> But there's nothing in the namespace to import, so adding it to
> imports doesn't seem like the right answer.  Is that just a spurious
> note?
>
> Hadley



__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Depending/Importing data only packages

2013-12-07 Thread Gábor Csárdi
I don't know about this particular case, but in general it makes sense
to rely on a data package. E.g. I am creating a package that does
Bayesian inference for a particular problem, potentially relying on
prior knowledge. I think it makes sense to put the data that is used
to calculate the prior into another package, because it will be larger
than the code, and it does not change that often.

Gabor



Re: [Rd] Depending/Importing data only packages

2013-12-07 Thread Hadley Wickham
> Would "Suggests" not work in this situation? I don't understand why you
> would need Depends. In what sense do you rely on the data only package?

Because I want someone who downloads the package to be able to run the
examples without having to take additional action.

Hadley


-- 
http://had.co.nz/



Re: [Rd] Depending/Importing data only packages

2013-12-07 Thread Paul Gilbert




HW> Because I want someone who downloads the package to be able to run
HW> the examples without having to take additional action.
HW>
HW> Hadley

I went through this myself, including thinking it was a nuisance for
users to need to attach other packages to run examples. In the end I
decided it is not so bad to be explicit about what package the example
data comes from, and to show that in the examples themselves. Users may
not always want this data, and other packages that build on yours
probably do not want it.


Even in the Bayesian inference case pointed out by Gábor, I am not 
convinced. It means the prior knowledge base cannot be exchanged for 
another one. The package would be more general if it allowed the 
possibility of attaching a different database of prior information. But 
this is clearly a more important case, since the code probably does not 
work without some database. (There are a few other situations where 
something like "RequireOneOf:" would be useful.)


Paul




Re: [Rd] Depending/Importing data only packages

2013-12-07 Thread Gabor Grothendieck
On Sat, Dec 7, 2013 at 1:35 PM, Paul Gilbert  wrote:
> I went through this myself, including thinking it was a nuisance for users
> to need to attach other packages to run examples. In the end I decided it is
> not so bad to be explicit about what package the example data comes from, so
> illustrate it in the examples. Users may not always want this data, and
> other packages that build on yours probably do not want it.
>
> Even in the Bayesian inference case pointed out by Gábor, I am not
> convinced. It means the prior knowledge base cannot be exchanged for another
> one. The package would be more general if it allowed the possibility of
> attaching a different database of prior information. But this is clearly a
> more important case, since the code probably does not work without some
> database. (There are a few other situations where something like
> "RequireOneOf:" would be useful.)
>

Requiring users to load packages which could be loaded automatically
seems to go against ease of use.  It's just one more thing that they
have to remember to do.

It really should be possible to write a "batteries included" package
while leveraging off of other packages.



Re: [Rd] Depending/Importing data only packages

2013-12-07 Thread Gábor Csárdi
On Sat, Dec 7, 2013 at 1:35 PM, Paul Gilbert  wrote:
> I went through this myself, including thinking it was a nuisance for users
> to need to attach other packages to run examples. In the end I decided it is
> not so bad to be explicit about what package the example data comes from, so
> illustrate it in the examples. Users may not always want this data, and
> other packages that build on yours probably do not want it.
>
> Even in the Bayesian inference case pointed out by Gábor, I am not
> convinced. It means the prior knowledge base cannot be exchanged for another
> one. The package would be more general if it allowed the possibility of
> attaching a different database of prior information. But this is clearly a
> more important case, since the code probably does not work without some
> database. (There are a few other situations where something like
> "RequireOneOf:" would be useful.)

First, as you say, you went through this yourself, which means that
the "right" answer to the problem is not obvious. This is (mainly) a
design decision, and if it is not obvious that depending on a data
package is always bad design, then why not let the package developer
decide?

Second, I very much think that using 'Suggests' is misleading in this
case. The data package is clearly required. I, as a user, would expect
that once I have installed all the requirements, the package will work,
which would no longer be true.

(Let's not go into what 'Suggests' actually means, and how totally
confusing it is already.)

'RequireOneOf' would be indeed useful.

Best,
Gabor



Re: [Rd] Depending/Importing data only packages

2013-12-07 Thread Dirk Eddelbuettel

On 7 December 2013 at 13:58, Gábor Csárdi wrote:
| 'RequireOneOf' would be indeed useful.

The DESCRIPTION file follows Debian Control File formats. Another aspect of
these could be useful here: the '|' operator. E.g., for ess (the Debian package)
we have

  Depends: dpkg (>= 1.15.4) | install-info, emacs23 | emacs22 | emacs21 | emacsen

saying that either a recent enough dpkg [package tool] or the install-info
package are needed [to deal with .info files] and that one of the available
emacs versions will do, with the first one being the default choice and the
last one a virtual package providing a catch-all fallback.

The R package does similar things to pick one of several blas and lapack
packages:

  Depends: zip, unzip, libpaper-utils, xdg-utils, \
           libblas3 | libblas.so.3 | libatlas3-base, [...] \
           liblapack3 | liblapack.so.3 | libatlas3-base, [...]

Dirk
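
R's DESCRIPTION fields do not accept the '|' alternative, so a "require
one of" choice would have to be made at run time instead. A minimal sketch
of that idea, using only base R's requireNamespace(). The helper name
first_available() and the package name notARealPriorPackage are
hypothetical and exist only for illustration:

```r
# Hypothetical helper: return the name of the first installed package
# among `candidates`, or stop with an informative error if none is found.
first_available <- function(candidates) {
  for (pkg in candidates) {
    if (requireNamespace(pkg, quietly = TRUE)) {
      return(pkg)
    }
  }
  stop("None of these packages is installed: ",
       paste(candidates, collapse = ", "))
}

# "stats" ships with R, so it serves as the catch-all fallback here,
# assuming the made-up first candidate is not installed.
chosen <- first_available(c("notARealPriorPackage", "stats"))
```

Unlike a declarative field, this does nothing at install time: the
candidates would still have to be listed under Suggests, and the user
would still have to install at least one of them.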

-- 
Dirk Eddelbuettel | e...@debian.org | http://dirk.eddelbuettel.com



Re: [Rd] Depending/Importing data only packages

2013-12-07 Thread Duncan Murdoch

On 13-12-06 4:20 PM, Hadley Wickham wrote:
> Hi all,
>
> What should you do when you rely on a data only package. If you just
> "Depend" on it, you get the following from R CMD check:
>
> Package in Depends field not imported from: 'hflights'
>    These packages needs to imported from for the case when
>    this namespace is loaded but not attached.
>
> But there's nothing in the namespace to import, so adding it to
> imports doesn't seem like the right answer.  Is that just a spurious
> note?
>
> Hadley



I don't know whether the author of that note would consider it spurious 
or not.   A simple workaround for you (as the author of hflights) is to 
put a function into the namespace.  For example, get_hflights(), that 
gets a copy of the data:


get_hflights <- function() {
  data("hflights", package="hflights", envir=environment())
  hflights
}

I don't know a simple workaround for someone who depends on a data-only 
package that they did not author.
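
A sketch of what the depending package might then declare, assuming
hflights really did export a get_hflights() function as suggested above
(nothing in this thread confirms that it does) and using the hypothetical
package name mypkg. The first two lines would go in mypkg's DESCRIPTION,
the last in its NAMESPACE:

```
Package: mypkg
Depends: hflights

importFrom(hflights, get_hflights)
```

With something to import, the check would presumably no longer find a
package in Depends that is not imported from.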


Duncan Murdoch



Re: [Rd] Depending/Importing data only packages

2013-12-07 Thread Hadley Wickham
> I don't know whether the author of that note would consider it spurious or
> not.   A simple workaround for you (as the author of hflights) is to put a
> function into the namespace.  For example, get_hflights(), that gets a copy
> of the data:
>
> get_hflights <- function() {
>   data("hflights", package="hflights", envir=environment())
>   hflights
> }

Another option is to put it in sysdata.rda and then export it - but
then (I think) you don't get the nice lazy loading.  It would be nice
to have some official guidance on what is preferred.

Hadley

-- 
http://had.co.nz/



Re: [Rd] Depending/Importing data only packages

2013-12-07 Thread Paul Gilbert



On 13-12-07 01:47 PM, Gabor Grothendieck wrote:

> Requiring users to load packages which could be loaded automatically
> seems to go against ease of use.  It's just one more thing that they
> have to remember to do.
>
> It really should be possible to write a "batteries included" package
> while leveraging off of other packages.

Just to be clear, I distinguish the "batteries included" situation from 
the "spare batteries included" situation. I think it should be possible 
to automatically load everything that is really needed, that is why I 
think the Bayesian database is a more important case. But it strikes me 
as bad to attach everything that could ever possibly be wanted by a 
user. After all, it would be possible to automatically attach all 
packages. Some packages seemed to be headed in that direction before the 
new rules started to be enforced.


There is certainly a trade-off here between ease of use (not needing the
user to attach packages) and namespace conflicts, which cost time and are
difficult to debug. For packages that no one ever uses in other
packages, there would be a tendency to lean toward ease of use.
But as soon as anyone starts building on top of a package with another
one, I think that avoiding potential conflicts will dominate.


Paul



Re: [Rd] Depending/Importing data only packages

2013-12-07 Thread Gabriel Becker
The Writing R Extensions manual says that Suggests is for packages which
are required only for examples, which I believe matches Hadley's original
question.

In the Bayesian case, it seems like including one prior's worth of data in
the package but having the infrastructure designed so that other data can
be swapped in would allow all the examples to run without tying the
analysis to a single prior.
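
A minimal sketch of that design, under stated assumptions: the object
default_prior, the function posterior_mean() and the Beta-binomial model
are all hypothetical stand-ins, not taken from any package mentioned in
this thread. The bundled data is only the default value of an argument,
so callers can swap in their own:

```r
# Hypothetical bundled prior data; a real package would ship this as a
# dataset rather than define it inline.
default_prior <- data.frame(successes = 3, failures = 7)

# Posterior mean of a binomial proportion after x successes in n trials,
# under a Beta prior built from `prior_data` (a uniform Beta(1, 1)
# updated with the prior counts).
posterior_mean <- function(x, n, prior_data = default_prior) {
  a <- 1 + sum(prior_data$successes)
  b <- 1 + sum(prior_data$failures)
  (a + x) / (a + b + n)
}

# Uses the bundled prior: (4 + 5) / (4 + 8 + 10) = 9/22
with_default <- posterior_mean(x = 5, n = 10)

# Same analysis with a different prior swapped in: (31 + 5) / (31 + 11 + 10) = 36/52
other_prior <- data.frame(successes = 30, failures = 10)
with_other <- posterior_mean(x = 5, n = 10, prior_data = other_prior)
```

The examples run out of the box with the default, while nothing in the
function ties the analysis to that one prior.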

Just my $0.02
~G




-- 
Gabriel Becker
Graduate Student
Statistics Department
University of California, Davis



Re: [Rd] Depending/Importing data only packages

2013-12-07 Thread Hadley Wickham
> Just to be clear, I distinguish the "batteries included" situation from the
> "spare batteries included" situation. I think it should be possible to
> automatically load everything that is really needed, that is why I think the
> Bayesian database is a more important case. But it strikes me as bad to
> attach everything that could ever possibly be wanted by a user. After all,
> it would be possible to automatically attach all packages. Some packages
> seemed to be headed in that direction before the new rules started to be
> enforced.

I agree, but for data only packages, attaching the namespace has no
impact on other code because it's empty.

Hadley


-- 
http://had.co.nz/



Re: [Rd] Depending/Importing data only packages

2013-12-07 Thread Hadley Wickham
> The Writing R Extensions manual says that Suggests is for packages which
> are required only for examples, which I believe matches Hadley's original
> question.

Yes, but without this package they won't be able to run the majority
of examples, which I think delivers a poor experience to the user. It
also means I have to litter my examples with if(require("x")),
decreasing the signal to noise ratio in the examples.

But we're getting a bit far from my original question about the NOTE:

  Package in Depends field not imported from: 'hflights'
  These packages needs to imported from for the case when
  this namespace is loaded but not attached.

Depending on (or linking to) a package is not just about making the
functions in the package available.
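
For reference, the declaration that produces this NOTE would look roughly
like the following in the DESCRIPTION of the depending package (the name
mypkg is a hypothetical placeholder):

```
Package: mypkg
Depends: hflights
```

The alternative proposed earlier in the thread would replace the second
line with "Suggests: hflights", which leaves installing and attaching the
data package to the user.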

Hadley

-- 
http://had.co.nz/



Re: [Rd] Depending/Importing data only packages

2013-12-07 Thread Paul Gilbert



On 13-12-07 05:21 PM, Hadley Wickham wrote:
>> The Writing R Extensions manual says that Suggests is for packages which
>> are required only for examples, which I believe matches Hadley's original
>> question.
>
> Yes, but without this package they won't be able to run the majority
> of examples, which I think delivers a poor experience to the user. It
> also means I have to litter my examples with if(require("x")),


I think you just need require("x") or library("x"). If it is in Suggests 
then it is available whenever examples are tested, so you don't need the 
if(). In my opinion, this increases the signal by indicating to the 
reader where the data comes from.
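
The two example styles being compared can be sketched as follows. Since
hflights may not be installed where this is run, the sketch uses the
datasets package and its mtcars data as a stand-in; in a real example
file the names would be "hflights" and hflights:

```r
# Unguarded style: attach the data package explicitly, so the reader
# sees where the data comes from. Errors if the package is missing.
library("datasets")            # stand-in for library("hflights")
first_rows <- head(mtcars, 3)  # stand-in for head(hflights, 3)

# Guarded style: the example is skipped when the package is unavailable.
if (require("datasets")) {
  n_rows <- nrow(mtcars)
}
```

Whether the guard can safely be dropped depends on how the examples are
checked, so treat the unguarded form as an assumption that the suggested
package is always installed at check time.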



> decreasing the signal to noise ratio in the examples.
>
> But we're getting a bit far from my original question about the NOTE:
>
>    Package in Depends field not imported from: 'hflights'
>    These packages needs to imported from for the case when
>    this namespace is loaded but not attached.
>
> Depending on (or linking to) a package is not just about making the
> functions in the package available.


Several of us used to think that, but the modern interpretation seems to
be just about making things in the package yours depends on available to
users of your package. "Exports:" might be a better term than
"Depends:", at least if "Depends:" was not trying to mean both "Imports:"
and "Exports:".


Paul


> Hadley


