Re: [Rd] Function I mean not to export keeps being documented in a manual?

2018-12-17 Thread Thierry Onkelinx via R-devel
Dear Marta,

Add the @noRd tag to the Roxygen documentation of the function.
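
For illustration only, here is a minimal sketch of where the tag would go, reusing the function name from the original post. The shortened roxygen block and the function body are placeholders (assumptions), since the original body was elided. The point is that leaving out @export only keeps the function out of NAMESPACE; roxygen2 still writes an Rd file for a documented object unless @noRd is present.

```r
#' Generates plots for demo of package functions
#'
#' @param func runstats package core function
#' @param plt.title.vec vector of function-specific plot titles
#'
#' @return \code{NULL}
#'
#' @noRd
plot.no.pattern <- function(func, plt.title.vec) {
  # Placeholder body (assumption): the real body was elided in the post.
  invisible(NULL)
}
```

After re-running roxygen2 (for example via devtools::document()), the function should no longer get an Rd file, so it should not show up in the PDF built by R CMD Rd2pdf.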

Best regards,

ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders
INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND
FOREST
Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
thierry.onkel...@inbo.be
Havenlaan 88 bus 73, 1000 Brussel
www.inbo.be

///
To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey
///




On Mon, 17 Dec 2018 at 08:57, Marta Karaś wrote:

> I am developing an R package which has a function I decided not to export.
> I believe the roxygen information states that the function is not going to
> be exported, yet I still see the function in the manual PDF (as generated
> on the command line via `R CMD Rd2pdf package_dir`). What is wrong with my
> preamble that the function is still being documented in the manual?
>
> #' Generates plots for demo of package functions which take time series and
> #' window width parameters
> #'
> #' @param func runstats package core function
> #' @param plt.title.vec vector of function-specific plot titles
> #'
> #' @importFrom grDevices rgb
> #' @importFrom graphics abline lines par plot points polygon title
> #'
> #' @return \code{NULL}
> #'
> #' @examples
> #' \dontrun{
> #' func <- RunningMean
> #' vec <- c("black: x\nred: W-width running window",
> #'  "RunningMean(x, W)",
> #'  "RunningMean(x, W, circular = TRUE)")
> #' plot.no.pattern(func, vec)
> #' }
> #'
> #'
> plot.no.pattern <- function(func, plt.title.vec){
> ...
> }
>
> Bests / Pozdrawiam,
> Marta Karas
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>


Re: [Rd] Documentation examples for lm and glm

2018-12-17 Thread Martin Maechler
> David Hugh-Jones 
> on Sat, 15 Dec 2018 08:47:28 +0100 writes:

> I would argue examples should encourage good
> practice. Beginners ought to learn to keep data in data
> frames and not to overuse attach(). 

Note there's no attach() there in any of these examples!

> otherwise at their own risk, but they have less need of
> explicit examples.

The glm examples are nice insofar as they show both uses.

I agree the lm() example(s) are  "didactically misleading" by
not using data frames at all.

I disagree that only data frame examples should be shown.
If lm() is one of the first R functions a beginneR must use --
because they are in a basic stats class, say -- it may be
*better* didactically to focus on lm() in the very first
example, and use data frames in a next one ...
and instead of a next one, we have the pretty clear comment
 
  ### less simple examples in "See Also" above

I'm not convinced (but you can try more) we should change those
examples or add more there.
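
To make the two styles under discussion concrete, here is a hedged sketch that fits the same model both ways. The numbers are the plant weight data as I recall them from the ?lm example, so treat them as illustrative; the data frame name `plants` is an assumption made for this sketch.

```r
## Plant weight data (values recalled from the ?lm example; illustrative)
ctl <- c(4.17, 5.58, 5.18, 6.11, 4.50, 4.61, 5.17, 4.53, 5.33, 5.14)
trt <- c(4.81, 4.17, 4.41, 3.59, 5.87, 3.83, 6.03, 4.89, 4.32, 4.69)

## Style 1: free-standing vectors, no data argument
group  <- gl(2, 10, 20, labels = c("Ctl", "Trt"))
weight <- c(ctl, trt)
fit_vec <- lm(weight ~ group)

## Style 2: the same variables held in a data frame
## (`plants` is a hypothetical name)
plants <- data.frame(
  weight = c(ctl, trt),
  group  = gl(2, 10, 20, labels = c("Ctl", "Trt"))
)
fit_df <- lm(weight ~ group, data = plants)

## Both calls estimate the same coefficients
all.equal(coef(fit_vec), coef(fit_df))
```

The fits agree; the difference is only in where lm() looks up the variables.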

Martin

> On Fri, 14 Dec 2018 at 14:51, S Ellison
>  wrote:

>> FWIW, before all the examples are changed to data frame
>> variants, I think there's fairly good reason to have at
>> least _one_ example that does _not_ place variables in a
>> data frame.
>> 
>> The data argument in lm() is optional. And there is more
>> than one way to manage data in a project. I personally
>> don't much like lots of stray variables lurking about,
>> but if those are the only variables out there and we can
>> be sure they aren't affected by other code, it's hardly
>> essential to create a data frame to hold something you
>> already have.  Also, attach() is still part of R, for
>> those folk who have a data frame but want to reference
>> the contents across a wider range of functions without
>> using with() a lot. lm() can reasonably omit the data
>> argument there, too.
>> 
>> So while there are good reasons to use data frames, there
>> are also good reasons to provide examples that don't.
>> 
>> Steve Ellison
>> 
>> 
>> > -----Original Message-----
>> > From: R-devel [mailto:r-devel-boun...@r-project.org] On Behalf Of Ben Bolker
>> > Sent: 13 December 2018 20:36
>> > To: r-devel@r-project.org
>> > Subject: Re: [Rd] Documentation examples for lm and glm
>> >
>> >
>> > Agree.  Or just create the data frame with those variables in it
>> > directly ...
>> >
>> > On 2018-12-13 3:26 p.m., Thomas Yee wrote:
>> > > Hello,
>> > >
>> > > something that has been on my mind for a decade or two has
>> > > been the examples for lm() and glm(). They encourage poor style
>> > > because of mismanagement of data frames. Also, having the
>> > > variables in a data frame means that predict()
>> > > is more likely to work properly.
>> > >
>> > > For lm(), the variables should be put into a data frame.
>> > > As 2 vectors are assigned first in the general workspace they
>> > > should be deleted afterwards.
>> > >
>> > > For the glm(), the data frame d.AD is constructed but not used. Also,
>> > > its 3 components were assigned first in the general workspace, so they
>> > > float around dangerously afterwards like in the lm() example.
>> > >
>> > > Rather than attached improved .Rd files here, they are put at
>> > > www.stat.auckland.ac.nz/~yee/Rdfiles
>> > > You are welcome to use them!
>> > >
>> > > Best,
>> > >
>> > > Thomas


Re: [Rd] Documentation examples for lm and glm

2018-12-17 Thread Fox, John
Dear Martin,

I think that everyone agrees that it’s generally preferable to use the data 
argument to lm() and I have nothing significant to add to the substance of the 
discussion, but I think that it’s a mistake not to add to the current examples, 
for the following reasons:

(1) Relegating examples using the data argument to “see also” doesn’t suggest 
that using the argument is a best practice. Most users won’t bother to click 
the links.

(2) In my opinion, a new initial example using the data argument would more 
clearly suggest that this is normally the best option.

(3) I think that it would also be desirable to add a remark to the explanation 
of the data argument, something like, “Although the argument is optional, it's 
generally preferable to specify it explicitly.” And similarly on the help page 
for glm().
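
One practical reason raised earlier in this thread is that predict() is more likely to work properly when the variables live in a data frame. The following is a small sketch of that point with simulated data; the names `d`, `new`, `fit_df` and `fit_dollar` are assumptions made for the illustration.

```r
set.seed(42)
d <- data.frame(x = 1:10)
d$y <- 3 + 2 * d$x + rnorm(10)

## Formula uses plain column names and the data argument
fit_df <- lm(y ~ x, data = d)
new <- data.frame(x = c(11, 12))
p_ok <- predict(fit_df, newdata = new)   # two predictions, one per new row

## Formula reaches into the data frame with `$`; no data argument
fit_dollar <- lm(d$y ~ d$x)
## predict() cannot match `d$x` to the column `x` in newdata, so it
## falls back to the original 10 values (R normally warns about this;
## the warning is suppressed here only to keep the sketch quiet).
p_bad <- suppressWarnings(predict(fit_dollar, newdata = new))

length(p_ok)
length(p_bad)
```

The second call silently ignores the new data apart from the warning, which is the kind of trap the data argument helps to avoid.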

My two (or three) cents.

John

  -
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http://socserv.mcmaster.ca/jfox


Re: [Rd] Documentation examples for lm and glm

2018-12-17 Thread S Ellison


> From: Thomas Yee [mailto:t@auckland.ac.nz]
> 
> Thanks for the discussion. I do feel quite strongly that
> the variables should always be a part of a data frame. 

This seems pretty much a decision for R core, and I think it's useful to have 
raised the issue.

But I, er, feel strongly that strong feelings and 'always' are unsafe in a best 
practice argument. 

First, other folk with different use-cases or work practice may see 'best 
practice' quite differently. So I would pretty much always expect exceptions.

Second, for examples of capability, there are too many exceptions in this 
instance. For example:
glm() can take a two-column matrix as a single response variable. 
lm() can take a matrix as a response variable. 
lm() can take a complete data frame as a predictor (see ?stackloss)

None of these work naturally if everything is in a data frame, and some won’t 
work at all.
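
As a rough sketch of the three capabilities listed above, using simulated data plus the built-in stackloss objects. All object names other than stack.loss and stack.x are assumptions for this illustration; note that in ?stackloss the predictor stack.x is a three-column matrix.

```r
set.seed(1)

## glm() with a two-column (successes, failures) matrix as the response
dose      <- c(1, 2, 3, 4, 5)
successes <- c(1, 3, 5, 8, 9)
resp      <- cbind(successes, failures = 10 - successes)
fit_binom <- glm(resp ~ dose, family = binomial)

## lm() with a matrix response: one regression per column
x <- rnorm(20)
Y <- cbind(y1 = 2 * x + rnorm(20), y2 = -x + rnorm(20))
fit_mlm <- lm(Y ~ x)
class(fit_mlm)        # includes "mlm"
dim(coef(fit_mlm))    # one column of coefficients per response

## lm() with a whole block of predictors on the right-hand side,
## as in the ?stackloss example
fit_stack <- lm(stack.loss ~ stack.x)
names(coef(fit_stack))
```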

Steve E






Re: [Rd] Documentation examples for lm and glm

2018-12-17 Thread Fox, John
Dear Steve,

Since this relates as well to the message I posted a couple of minutes before 
yours, I agree that it’s possible to phrase “best practices” too categorically. 
In the current case, I believe that it’s reasonable to say that specifying the 
data argument is “generally” or “usually” the best option. That doesn’t rule 
out exceptions.

Best,
 John

  -
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http://socserv.mcmaster.ca/jfox


Re: [Rd] Documentation examples for lm and glm

2018-12-17 Thread Heinz Tuechler

Dear All,

do you think that use of a data argument is best practice in the example 
below?


regards,

Heinz

### trivial example
plotwithline <- function(x, y) {
   plot(x, y)
   abline(lm(y~x)) ## data argument?
}

set.seed(25)
df0 <- data.frame(x=rnorm(20), y=rnorm(20))

plotwithline(df0[['x']], df0[['y']])




[Rd] CRAN incoming queue closed from Dec 21 to Jan 02

2018-12-17 Thread Uwe Ligges

Dear package developers,

the CRAN incoming queue will be closed from Dec 21, 2018 to Jan 02, 
2019. Hence package submissions are only possible before and after that 
period.


Best,
Uwe Ligges
(for the CRAN team)


[Rd] Unnecessary apostrophe in English base::summary() NA count output?

2018-12-17 Thread Hernando Cortina
 Hello, this is quite a minor issue but as summary() is in all likelihood
one of the most widely used functions in R I decided to email this list.
When producing a count of missing values, summary() in English generates an
unnecessary and grammatically incorrect apostrophe (NA's rather than NAs)
in its table header.  For example:

> summary(c(1,2,NA,3,4,NA))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
   1.00    1.75    2.50    2.50    3.25    4.00       2

The issue can be traced to this file:
https://svn.r-project.org/R/trunk/src/library/base/R/summary.R
Unless this is being done intentionally for some reason, the solution would
seem to be to replace the string "NA's" with "NAs".  There are 9
occurrences in the file.
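
For anyone who wants to check this without reading the source, a short sketch: the label is simply the name of the last element of the returned object, so it can be inspected and, if desired, relabelled locally. The variable names `s` and `s2` are assumptions for this illustration, and the relabelling is a local workaround, not a proposed patch.

```r
x <- c(1, 2, NA, 3, 4, NA)
s <- summary(x)

## The header comes from the element names of the summary object
names(s)

## The NA count itself
unclass(s)[["NA's"]]

## Local relabelling of a copy, leaving base R untouched
s2 <- s
names(s2)[names(s2) == "NA's"] <- "NAs"
names(s2)
```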

Thank you very much.


Re: [Rd] Documentation examples for lm and glm

2018-12-17 Thread Fox, John
Dear Heinz,

  --
> On Dec 17, 2018, at 10:19 AM, Heinz Tuechler  wrote:
> 
> Dear All,
> 
> do you think that use of a data argument is best practice in the example 
> below?

No, but it is *normally* or *usually* the best option, in my opinion.

Best,
 John

> 
> regards,
> 
> Heinz
> 
> ### trivial example
> plotwithline <- function(x, y) {
>    plot(x, y)
>    abline(lm(y~x)) ## data argument?
> }
> 
> set.seed(25)
> df0 <- data.frame(x=rnorm(20), y=rnorm(20))
> 
> plotwithline(df0[['x']], df0[['y']])

Re: [Rd] Unnecessary apostrophe in English base::summary() NA count output?

2018-12-17 Thread Ben Bolker
There seem to be a variety of opinions about style in this case; do
you omit the apostrophe ("NAs") because it's not a possessive or a
contraction, or do you include the apostrophe ("NA's") to clearly
distinguish the acronym from the plural form?

 I personally prefer "NAs" to "NA's" but both are defensible.

https://english.stackexchange.com/questions/55970/plurals-of-acronyms-letters-numbers-use-an-apostrophe-or-not
https://brians.wsu.edu/2016/05/16/acronyms-and-apostrophes/ ("many
people object to it")


Re: [Rd] Documentation examples for lm and glm

2018-12-17 Thread Heinz Tuechler

Dear John,

fully agreed! In the global environment I always keep my 
"data-variables" in a data.frame. However, when I look at a help page I like 
examples that start with the particular aspects of the function. It is 
important to know whether a function offers a data argument, but I don't 
need the first example to demonstrate the data argument each 
time I look at the help.


best,
Heinz


[Rd] R is missing log1p(z) etc for complex numbers z.

2018-12-17 Thread Martin Maechler
Working on my 'Bessel' package, I re-detected today that
indeed even C99-standard GLIBC does not contain a complex-number
version of

   log1p()

Also missing in current R are, basically, these:

> z <- 1 + 2i
> log1p(z)
Error in log1p(z) : unimplemented complex function
> expm1(z)
Error in expm1(z) : unimplemented complex function
> gamma(z)
Error in gamma(z) : unimplemented complex function
> lgamma(z)
Error in lgamma(z) : unimplemented complex function
> psigamma(z)
Error in psigamma(z) : unimplemented complex function
> digamma(z)
Error in digamma(z) : unimplemented complex function
> sinpi(z)
Error in sinpi(z) : unimplemented complex function
> cospi(z)
Error in cospi(z) : unimplemented complex function
> floor(z)
Error in floor(z) : unimplemented complex function
> ceiling(z)
Error in ceiling(z) : unimplemented complex function
> 
--

Is anyone aware of Free Software implementations of these,
ideally in C ?

... yes, I think I've found the Julia source code for these,
nicely written in Julia itself...
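
As a stopgap at the R level, here is a hedged sketch of complex log1p() and expm1() built from the real-valued functions via standard identities. The names clog1p() and cexpm1() are hypothetical, nothing here is taken from the Julia sources, and no claim is made about accuracy in every regime (for instance close to z = -1).

```r
## Hypothetical helpers; not part of base R.
clog1p <- function(z) {
  z <- as.complex(z)
  x <- Re(z); y <- Im(z)
  ## |1 + z|^2 - 1 = x * (2 + x) + y^2, hence
  ## log|1 + z| = 0.5 * log1p(x * (2 + x) + y^2)
  complex(real = 0.5 * log1p(x * (2 + x) + y^2),
          imaginary = atan2(y, 1 + x))
}

cexpm1 <- function(z) {
  z <- as.complex(z)
  x <- Re(z); y <- Im(z)
  ## exp(z) - 1 = expm1(x) * cos(y) + (cos(y) - 1) + 1i * exp(x) * sin(y),
  ## with cos(y) - 1 rewritten as -2 * sin(y / 2)^2
  complex(real = expm1(x) * cos(y) - 2 * sin(y / 2)^2,
          imaginary = exp(x) * sin(y))
}

z <- 1 + 2i
all.equal(clog1p(z), log(1 + z))
all.equal(cexpm1(z), exp(z) - 1)

## For tiny arguments the naive log(1 + z) loses the real part entirely
tiny <- complex(real = 1e-20, imaginary = 0)
Re(log(1 + tiny))
Re(clog1p(tiny))
```

The remaining functions in the list (gamma, lgamma, digamma and so on) need genuinely complex algorithms and are not covered by this sketch.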

Martin
