[R-pkg-devel] txt data is undefined global variable

2019-01-16 Thread Thierry Onkelinx
Dear all,

I'm working on a package "foo" which has a dataframe stored in
"data/foo.txt". The DESCRIPTION has "LazyData: true".  Functions can use
the object "foo". e.g.

bar <- function() {
  summary(foo)
}

However, R CMD check throws the "undefined global variable" error.

What is the proper way to use txt data in a package?

Best regards,

Thierry

ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders
INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND
FOREST
Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
thierry.onkel...@inbo.be
Havenlaan 88 bus 73, 1000 Brussel
www.inbo.be

///
To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey
///




__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] txt data is undefined global variable

2019-01-16 Thread Uwe Ligges




On 16.01.2019 17:00, Thierry Onkelinx wrote:

Dear all,

I'm working on a package "foo" which has a dataframe stored in
"data/foo.txt". The DESCRIPTION has "LazyData: true".  Functions can use
the object "foo". e.g.


foo is not registered in the namespace, so you are currently relying on
search path order.


If you want data objects that your functions can use but that are not
necessarily available via data() calls, use the sysdata mechanism. From
WRE (Writing R Extensions):


"if the R subdirectory contains a file sysdata.rda (a
saved image of one or more R objects: please use suitable compression as 
suggested by tools::resaveRdaFiles, and see also the 
‘SysDataCompression’ DESCRIPTION field.) this will be lazy-loaded into 
the namespace environment – this is intended for system datasets that 
are not intended to be user-accessible via data"
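
A minimal sketch of that mechanism, assuming the text file is kept in a
hypothetical data-raw/ directory and converted once by a maintainer
script. The temporary package directory, the data-raw/ location and the
stand-in data frame are illustrative assumptions, not part of the original
package:

```r
# Hypothetical package layout, created in a temporary directory so the
# sketch can be run anywhere.
pkg <- file.path(tempdir(), "foo")
dir.create(file.path(pkg, "R"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(pkg, "data-raw"), showWarnings = FALSE)

# Stand-in for the real text file (assumed to have a header row).
txt_file <- file.path(pkg, "data-raw", "foo.txt")
write.table(
  data.frame(x = 1:3, y = c("a", "b", "c")),
  file = txt_file, row.names = FALSE
)

# Read the text data and store it as internal data in R/sysdata.rda.
foo <- read.table(txt_file, header = TRUE, stringsAsFactors = FALSE)
save(foo, file = file.path(pkg, "R", "sysdata.rda"), compress = "xz")
```

With R/sysdata.rda in place, foo is lazy-loaded into the namespace
environment, so bar() should resolve it there rather than through the
search path.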


Best,
Uwe Ligges




Re: [R-pkg-devel] shadowing a method from the stats package

2019-01-16 Thread Heather Turner
Hi Oscar,

I don't think this is such a good idea, for a few reasons.

First, `setMethod` defines an S4 method, so both an S3 method (from stats) and 
an S4 method (from your package) will be defined. This seems to work, but could 
give unexpected results and is confusing for anyone trying to understand what 
code is actually called. You could use `R.methodsS3::setMethodS3` to replace 
the summary.glm registered by the stats package, but then you're back to square 
1.

Second, it's not enough to replace `summary.glm`; you also need to replace 
`vcov.glm`, since this calls summary.glm internally and even if you have 
defined your own S3/S4 summary method for glms, the one from stats will be used 
in that case, since it is in the same namespace.

Third, if you can over-ride `summary.glm`, then another person could, 
potentially meaning your unifed family gets used to fit a glm, but then another 
summary method is called when summarising the fit. 

So I think the safest solution would be to write a wrapper around glm, 
`unifed_glm` say, that fits a glm with the unifed family and returns an object 
of class c("unifed", "glm", "lm"). Then you can write summary.unifed, 
vcov.unifed and any other required methods, passing on to the "glm" methods 
when you have set dispersion to 1. This also makes it easier to document the 
methods.
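
A rough sketch of that wrapper approach, under stated assumptions: the
helper as_plain_glm() is a hypothetical name, and the Gamma family is only
a stand-in so that the code runs without the unifed package installed. The
real wrapper would pass the unifed family instead.

```r
# Hypothetical wrapper around glm(). The default family is a stand-in
# (Gamma); the real wrapper would use the unifed family.
unifed_glm <- function(formula, data, family = stats::Gamma(link = "log"), ...) {
  fit <- stats::glm(formula, family = family, data = data, ...)
  class(fit) <- c("unifed", class(fit))  # c("unifed", "glm", "lm")
  fit
}

# Hypothetical helper: drop the extra class so that the next call
# dispatches to the "glm" methods from stats.
as_plain_glm <- function(object) {
  class(object) <- setdiff(class(object), "unifed")
  object
}

# Fix the dispersion at 1 by default, then pass on to the "glm" method.
summary.unifed <- function(object, dispersion = 1, ...) {
  summary(as_plain_glm(object), dispersion = dispersion, ...)
}

# vcov.glm forwards extra arguments to summary.glm, so the dispersion
# can be passed straight through.
vcov.unifed <- function(object, dispersion = 1, ...) {
  vcov(as_plain_glm(object), dispersion = dispersion, ...)
}

fit <- unifed_glm(mpg ~ wt, data = mtcars)
summary(fit)$dispersion
```

Inside a package, the two methods would also need to be registered with
S3method() directives in the NAMESPACE file.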

Alternatively you could try emailing the R-devel mailing list to propose that 
family objects should have an `est.disp` element that is FALSE for binomial, 
poisson and any other families that should have dispersion fixed to 1, and is 
TRUE otherwise. This would require minimal change to the stats package, while 
accommodating your unifed family.
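
To illustrate the proposal only: family objects in the stats package have
no `est.disp` element today, and the helper name below is hypothetical.
The idea is that the summary code would consult the element when present
and otherwise fall back to a check on the family name:

```r
# Hypothetical helper: should the dispersion be estimated for this family?
# `est.disp` is the proposed element; it is NOT part of family objects
# in the stats package.
needs_dispersion_estimate <- function(family) {
  if (!is.null(family$est.disp)) {
    return(isTRUE(family$est.disp))
  }
  # Fallback: poisson and binomial have dispersion fixed to 1.
  !(family$family %in% c("poisson", "binomial"))
}

# A family that declares a fixed dispersion (Gamma used as a stand-in).
fixed_family <- Gamma()
fixed_family$est.disp <- FALSE

needs_dispersion_estimate(poisson())
needs_dispersion_estimate(gaussian())
needs_dispersion_estimate(fixed_family)
```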

Best wishes,

Heather

On Sat, Jan 12, 2019, at 10:02 PM, qxa...@use.startmail.com wrote:
> Hi Joris,
> 
>     thank you for your reply.
> 
>     I re-read the CRAN policy after I saw your reply and you are right,
> I do not need permission for this.
> 
>     About the search PATH. unifed will always be attached after stats,
> so I think the unifed function would always be found before the one in
> stats (please correct me if you think I am wrong).
> 
>     Still, I found a different way to solve the problem in order to
> avoid NOTEs when I run 'R CMD check --as-cran', that can be considered
> more robust:
> 
>     I first changed the name of my wrapper function to something
> different (this is in order to avoid a note in registering S3 methods).
> 
>     then, in the zzz.R file inside of the .onLoad function I used
> setMethod with where=.GlobalEnv to rewrite the summary method of the glm
> class and in there I call the wrapper function.
> 
>    This gives the behavior that I wanted and it does not depend on the
> search path.
> 
> Cheers,
> 
>    Oscar.
> 
> >
> > Then again, I'm not sure this will always work, as you rely heavily on
> > the search path for your function to be found before the one in stats.
> > I would be interested as well in hearing how this can be solved in a
> > more robust way, but I can't really come up with something myself.
> > Interesting problem!
> >
> > Cheers
> > Joris
> >
> > On Fri, Jan 11, 2019 at 10:14 PM wrote:
> >
> > Hello,
> >
> >     I created a package for working with a new probability
> > distribution called unifed. The source code can be found at
> > https://gitlab.com/oquijano/unifed .
> >
> >     This distribution is suitable for GLMs. I have included a
> > function called unifed in the package that returns a family that can
> > be used with the glm function.
> >
> >     For a unifed glm, it is necessary for the dispersion parameter to
> > be equal to one. The summary method of the glm class does this
> > automatically for the poisson and binomial distributions and I would
> > like the same for the unifed. In order to achieve this, I am
> > including a summary.glm function in the package so it shadows
> > stats::summary.glm when the package is attached. This function is
> > actually just a wrapper around stats::summary.glm.  It simply checks
> > if the family is unifed; If this is the case it calls
> > stats::summary.glm with dispersion=1 and otherwise it simply calls
> > stats::summary.glm with the same parameters.  Therefore introducing
> > this in the namespace does not break or change the behavior of any
> > existing code that uses summary.glm
> >
> >      According to the CRAN policies I need permission from the
> > maintainer of the package for doing this. The maintainer of the
> > package
> > is the R core team. To whom should I write to ask for this permission?
> > Otherwise is there a different way in which I could achieve the right
> > default behavior and respect the CRAN policies?
> >
> >
> > Thank you for your time.
> >

[R-pkg-devel] Avoiding differences between Examples output on Windows due to tibble

2019-01-16 Thread Gavin Simpson
Dear List,

The latest version of tibble 2.0.1 is printing output (on Linux at
least) that contains a UTF-8 ellipsis … instead of the previous three
periods ...

This is causing some differences between the reference materials for
my pkg examples that I generate locally on Linux and the output
generated on the Winbuilder system under Windows. In the latter, the
ellipsis is not being printed and three periods are being used
instead. As such I am seeing the following differences in Windows only
in R CMD check:

** checking differences from 'gratia-Ex_i386.Rout' to
'gratia-Ex.Rout.save' ... OK

239c239
< # ... with 790 more rows
---
> # … with 790 more rows
** running examples for arch 'x64' ... [24s] OK
** checking differences from 'gratia-Ex_x64.Rout' to
'gratia-Ex.Rout.save' ... OK

239c239
< # ... with 790 more rows
---
> # … with 790 more rows

I'm assuming this is going to be an issue for CRAN. If so, what can I
do about this? Is there some flag or trick I can use to generate the
same output on Windows as I produce on Linux?
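
One possible direction, sketched under an assumption that is not confirmed
in this message: that the ellipsis comes from the cli package, whose
`cli.unicode` option controls whether Unicode symbols are used. The helper
name with_ascii_output() is hypothetical. Forcing ASCII on every platform
would then give the three-period form everywhere:

```r
# Hypothetical helper: evaluate an expression with cli's Unicode output
# switched off, restoring the previous option value afterwards.
with_ascii_output <- function(expr) {
  op <- options(cli.unicode = FALSE)
  on.exit(options(op), add = TRUE)
  expr  # lazily evaluated here, after the option has been set
}

# The option is FALSE only while the expression is being evaluated.
inside <- with_ascii_output(getOption("cli.unicode"))
```

In Rd examples, the same options() call could be wrapped in \dontshow{}
so that it does not appear in the rendered help page.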

The full sources for package in question are on github:
https://github.com/gavinsimpson/gratia should anyone want to check
something there.

Thanks in advance

Gavin

--
Gavin Simpson, PhD
