[Rd] Trivial error in documentation

2019-06-03 Thread David Scott

This must have been there for a while.

In the datasets package, the help for airquality says:

A data frame with 154 observations on 6 variables.

But:

> str(airquality)
'data.frame':    153 obs. of  6 variables:
 $ Ozone  : int  41 36 12 18 NA 28 23 19 8 NA ...
 $ Solar.R: int  190 118 149 313 NA NA 299 99 19 194 ...
 $ Wind   : num  7.4 8 12.6 11.5 14.3 14.9 8.6 13.8 20.1 8.6 ...
 $ Temp   : int  67 72 74 62 56 66 65 59 61 69 ...
 $ Month  : int  5 5 5 5 5 5 5 5 5 5 ...
 $ Day    : int  1 2 3 4 5 6 7 8 9 10 ...
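
The count can be confirmed directly from the data frame. A minimal check, assuming only the datasets package that ships with R:

```r
# Dimensions of the built-in airquality data frame (rows, columns)
dim(datasets::airquality)

# Number of observations, to compare with the figure quoted in the help page
nrow(datasets::airquality)
```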

Regards

David Scott


--
_____
David Scott Department of Statistics
The University of Auckland, PB 92019
Auckland 1142,NEW ZEALAND
Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055
Email:  d.sc...@auckland.ac.nz,  Fax: +64 9 373 7018

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] [R] Use of .Fortran

2010-06-19 Thread David Scott
Thanks very much to all who replied. I went with Brian's approach, and 
eventually, despite all my attempts to foul it up, I did get it to work 
successfully. For the record here are the details.


The subroutine is:

  subroutine SSFcoef(nmax,nu,A)
  implicit double precision(a-h,o-z)
  implicit integer (i-n)
  integer k,i,nmax
  double precision nu,A(0:nmax,0:nmax)
  A(0,0) = 1D0
  do k=1,nmax
    do i=1,k-1
      A(k,i) = (-nu+i+k-1D0)*A(k-1,i)+A(k-1,i-1)
    end do
    A(k,0) = (-nu+k-1D0)*A(k-1,0)
    A(k,k) = 1D0
  end do
  return
  end

This was in the file SSFcoef.f95 and was made into a dll with

R CMD SHLIB SSFcoef.f95

Then calling it in R went like this:

### Load the compiled shared library in.
dyn.load("SSFcoef.dll")

### Write a function that calls the Fortran subroutine
SSFcoef <- function(nmax, nu){
  .Fortran("SSFcoef",
           as.integer(nmax),
           as.double(nu),
           A = matrix(0, nmax+1, nmax+1)
           )$A
}

SSFcoef(10,2)
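
One way to check the compiled routine is to compare it with the same recurrence written in plain R. The following is only a sketch: the name SSFcoefR is made up for illustration, and it assumes the Fortran element A(k,i) (indexed from 0) corresponds to the R element A[k+1, i+1].

```r
# Pure-R version of the recurrence in the subroutine (hypothetical helper,
# intended only as a cross-check of the .Fortran result)
SSFcoefR <- function(nmax, nu) {
  A <- matrix(0, nmax + 1, nmax + 1)   # A[k + 1, i + 1] holds Fortran A(k,i)
  A[1, 1] <- 1
  for (k in seq_len(nmax)) {
    for (i in seq_len(k - 1)) {        # empty loop when k == 1
      A[k + 1, i + 1] <- (-nu + i + k - 1) * A[k, i + 1] + A[k, i]
    }
    A[k + 1, 1] <- (-nu + k - 1) * A[k, 1]
    A[k + 1, k + 1] <- 1
  }
  A
}

SSFcoefR(2, 2)
```

If the shared library is loaded, all.equal(SSFcoef(10, 2), SSFcoefR(10, 2)) should then be a reasonable sanity check.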


There are a number of comments I should make.

Yes, Brian, should have gone to R-devel. I had forgotten about that.

I recognised from my faintly recalled past Fortran experience that the 
code was different and suspected a later Fortran, so good to be advised 
it was 95.


I actually gave a wrong version of the Fortran subroutine, one I had 
been messing around with and had added some extra arguments (nrowA and 
ncolA). As pointed out these were unnecessary.


One thing caused me a bit of grief before I noticed it. Despite the 
'implicit integer (i-n)' declaration in the subroutine, nu is explicitly 
declared double precision, so it has to be passed as double in the R 
code (as.double(nu), not as.integer(nu)).
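
For anyone unsure what is actually being passed, the storage type of each argument can be inspected before the .Fortran call. A small illustration using only base R:

```r
nu <- 2
typeof(nu)                # "double": plain numeric literals are doubles in R
typeof(as.integer(nu))    # "integer": does not match 'double precision nu'
typeof(as.double(nu))     # "double": matches the Fortran declaration

# A matrix built from a numeric value is also stored as double,
# which is what the double precision array A expects
A <- matrix(0, 3, 3)
storage.mode(A)           # "double"
```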


Many thanks again, I at least learnt something about calling other 
language code from R.


David

Prof Brian Ripley wrote:

On Sat, 19 Jun 2010, David Scott wrote:

I have no experience with incorporating Fortran code and am probably doing 
something pretty stupid.


Surely you saw in the posting guide that R-help is not the place for 
questions about C, C++, Fortran code?  Diverting to R-devel.



I want to use the following Fortran subroutine (not written by me) in the


Well, it is not Fortran 77 but Fortran 95, and so needs to be given a 
.f95 extension to be sure to work.



file SSFcoef.f

 subroutine SSFcoef(nmax,nu,A,nrowA,ncolA)
 implicit double precision(a-h,o-z)
 implicit integer (i-n)
 integer l,i,nmax
 double precision nu,A(0:nmax,0:nmax)
 A(0,0) = 1D0
 do l=1,nmax
   do i=1,l-1
     A(l,i) = (-nu+i+l-1D0)*A(l-1,i)+A(l-1,i-1)
   end do
   A(l,0) = (-nu+l-1D0)*A(l-1,0)
   A(l,l) = 1D0
 end do
 return
 end


I created a dll (this is windows) using R CMD SHLIB SSFcoef.f

Then my R code is:

### Load the compiled shared library in.
dyn.load("SSFcoef.dll")

### Write a function that calls the Fortran subroutine
SSFcoef <- function(nmax, nu){
 .Fortran("SSFcoef",
  as.integer(nmax),
  as.integer(nu)
  )$A
}


That does not match.  nrowA and ncolA are unused, so you need
SSFcoef <- function(nmax, nu){
  .Fortran("SSFcoef",
           as.integer(nmax),
           as.integer(nu),
           A = matrix(0, nmax+1, nmax+1),
           0L, 0L)$A
}



SSFcoef(10,2)

which when run gives


SSFcoef(10,2)

NULL

I am pretty sure the problem is that I am not dealing with the matrix A 
properly. I also tried this on linux and got a segfault.


Can anyone supply the appropriate modification to my call (and possibly to 
the subroutine) to make this work?


David Scott






Re: [Rd] How to connect R to Mysql?

2010-09-18 Thread David Scott

On 18/09/2010 4:47 a.m., Spencer Graves wrote:

   Hi, Thomas:


You use RODBC to connect to MySQL?


Thanks, Spencer




I also use RODBC to connect to MySQL. That is in part because it is 
useful for other connections as well. I am using it for connecting to 
Microsoft SQL Server in a current project, and it is one of two 
approaches I commonly use to read Excel files (along with xlsReadWrite).


While I am at it, I would like to say thanks to Brian Ripley for his 
work on RODBC.
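
For the original poster, a connection through RODBC looks roughly like the sketch below. The helper name, the DSN "my_mysql_dsn" and the table name are placeholders of mine, and an ODBC data source for the MySQL server has to be configured on the machine first.

```r
# Sketch only: read one table through an existing ODBC data source.
# read_table_odbc is a hypothetical helper, not part of RODBC.
read_table_odbc <- function(dsn, table, uid = "", pwd = "") {
  con <- RODBC::odbcConnect(dsn, uid = uid, pwd = pwd)
  if (!inherits(con, "RODBC")) stop("ODBC connection failed")
  on.exit(RODBC::odbcClose(con))    # close the channel even if the read fails
  RODBC::sqlFetch(con, table)       # whole table as a data frame
}

# Usage (not run here, it needs a live data source):
# wl <- read_table_odbc("my_mysql_dsn", "WL")
```

The same function should work unchanged for SQL Server or any other source that has an ODBC driver, which is the main attraction of this route.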


David Scott





On 9/17/2010 9:26 AM, Thomas Etheber wrote:

I also had problems connecting via RMySQL on Windows several weeks ago.
I decided to skip the package and now use RODBC, which runs stably out
of the box. Perhaps you should have a look at this package.

Hth
Thomas

Am 17.09.2010 17:50, schrieb Spencer Graves:



   I've recently been through that with some success.  I don't
remember all the details, but I first looked at "help(pac=RMySQL)".
This told me that the maintainer was Jeffrey Horner.  Google told me
he was at Vanderbilt.  Eventually I found
"http://biostat.mc.vanderbilt.edu/wiki/Main/RMySQL", which told me
that I needed to build the package myself so it matches your version
of MySQL, operating system, etc.  I did that.


   Does the MySQL database already exist?  I created a MySQL
database and tables using MySQL server 5.1.50-win32.  (Which version
of MySQL do you have?)


   help('RMySQL-package') includes "A typical usage".  That helped
me get started, except that I needed to write to that database, not
just query it.  For this, I think I got something like the following
to work:


d <- dbReadTable(con, "WL")
dbWriteTable(con, "WL2", a.data.frame)  ## table from a data.frame
dbWriteTable(con, "test2", "~/data/test2.csv") ## table from a file


   Hope this helps.
   Spencer


On 9/17/2010 7:55 AM, Arijeet Mukherjee wrote:

I installed the RMySQL package in R 2.11.1 64 bit
Now how can I connect R with MySQL?
I am using a Windows 7 64 bit version.
Please help ASAP.
















Re: [Rd] Crash report: regexpr("a{2-}", "")

2010-09-22 Thread David Scott

 It crashes R on my linux:
> regexpr("a{2-}", "")
R: tre-compile.c:1825: tre_ast_to_tnfa: Assertion `iter->max == -1 || 
iter->max == 1' failed.

Aborted

My setup is:

> sessionInfo()
R version 2.11.1 (2010-05-31)
i386-redhat-linux-gnu

locale:
 [1] LC_CTYPE=en_NZ       LC_NUMERIC=C         LC_TIME=en_NZ
 [4] LC_COLLATE=en_NZ     LC_MONETARY=C        LC_MESSAGES=en_NZ
 [7] LC_PAPER=en_NZ       LC_NAME=C            LC_ADDRESS=C
[10] LC_TELEPHONE=C       LC_MEASUREMENT=en_NZ LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] djsmisc_1.0-1
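
For comparison, a valid interval quantifier has the form {n}, {n,} or {n,m}; "{2-}" is not one of these, which is presumably what trips the assertion. A small example with the valid form, which is safe to run:

```r
x <- "caaat"
m <- regexpr("a{2,}", x)     # valid interval: two or more consecutive "a"

as.integer(m)                # start position of the match
attr(m, "match.length")      # length of the match
regmatches(x, m)             # the matched text
```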


David Scott


On 23/09/10 04:37, Brian Diggs wrote:

[Accidentally posted this to r-help instead of r-devel; reposting to put
it into the correct list and thread. My apologies for the duplication.]

On 9/21/2010 8:04 PM, Henrik Bengtsson wrote:

Each of the following calls crash ("core dumps") R (R --vanilla) on
various versions and OSes:

regexpr("a{2-}", "")
sub("a{2-}", "")
gsub("a{2-}", "")


EXAMPLES:

To add another (windows) example it also crashes the 2.12.0 alpha build:

  >  sessionInfo()
R version 2.12.0 alpha (2010-09-20 r52948)
Platform: i386-pc-mingw32/i386 (32-bit)
...
  >  regexpr("a{2-}", "")
Assertion failed: iter->max == -1 || iter->max == 1, file tre-compile.c,
line 1825

This application has requested the Runtime to terminate it in an unusual
way.
Please contact the application's support team for more information.


sessionInfo()

R version 2.11.1 Patched (2010-09-16 r52949)
Platform: i386-pc-mingw32 (32-bit)
...

regexpr("a{2-}", "")

Assertion failed: iter->max == -1 || iter->max == 1, file
tre-compile.c, line 1825
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.


sessionInfo()

R version 2.12.0 Under development (unstable) (2010-09-14 r52910)
Platform: i386-pc-mingw32/i386 (32-bit)
...

regexpr("a{2-}", "")

Assertion failed: iter->max == -1 || iter->max == 1, file
tre-compile.c, line 1825
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.



sessionInfo()

R version 2.11.0 Patched (2010-05-09 r51960)
x86_64-unknown-linux-gnu
...

regexpr("a{2-}", "")

R: tre-compile.c:1825: tre_ast_to_tnfa: Assertion `iter->max == -1 ||
iter->max == 1' failed.
Aborted


/Henrik








Re: [Rd] Suggested change to integrate.Rd (was: Re: 0.5 != integrate(dnorm, 0, 20000) = 0)

2010-12-07 Thread David Scott
If changes are to be made to integrate it would be nice if the following 
was fixed:

> integrate(dnorm, -Inf, -Inf)
1 with absolute error < 9.4e-05

Note that integrate manages ok when not dealing with Inf or -Inf:
> integrate(dnorm, -500, -500)
0 with absolute error < 0
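
Until something like this is handled inside integrate, a thin wrapper can catch the degenerate case. This is only a sketch; integrate_checked is a name I made up, and it returns just the value rather than the full integrate object.

```r
# Hypothetical wrapper: returns 0 when the two limits coincide,
# including the case where both are -Inf or both are Inf
integrate_checked <- function(f, lower, upper, ...) {
  if (is.na(lower) || is.na(upper)) stop("a limit is NA or NaN")
  if (lower == upper) return(0)
  integrate(f, lower, upper, ...)$value
}

integrate_checked(dnorm, -Inf, -Inf)
integrate_checked(dnorm, 0, Inf)
```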

David Scott




On 8/12/2010 5:08 p.m., John Nolan wrote:

R developers understand intimately how things work, and terse
descriptions are sufficient.  However, most typical R users
would benefit from clearer documentation.  In multiple places
I've found the R documentation to be correct and understandable
AFTER I've figured a function out.

And to be fair, this problem with integrate( ) isn't really R's
fault: the QUADPACK routines that R uses are very good algorithms,
but neither they nor any other package can handle all cases.

I would support reasonable changes in the documentation for
integrate( ).   Just saying it "gives wrong answer without
warning on many systems" seems misleading (it works fine in
many cases) and it doesn't help a user understand how to use
integrate( ) correctly/carefully.  IMO a simple example like
this one w/ dnorm would catch people's attention and a couple
lines of explanation/warning would then make more sense.

John Nolan, American U


-Spencer Graves  wrote: -
To: John Nolan
From: Spencer Graves
Date: 12/07/2010 07:58PM
Cc: pchau...@uwaterloo.ca, r-devel@r-project.org
Subject: Suggested change to integrate.Rd (was: Re: [Rd] 0.5 != 
integrate(dnorm,0,20000) = 0)

What do you think about changing the verbiage with that example
in "integrate.Rd" from "fails on many systems" to something like
"gives wrong answer without warning on many systems"?


If I had write access to the core R code, I'd change this
myself:  I'm probably not the only user who might think that saying
something "fails" suggests it gives an error message.  Many contributions
on this thread make it clear that it will never be possible to write an
integrate function that won't give a "wrong answer without warning" in
some cases.


Thanks,
Spencer


#
On 12/7/2010 7:02 AM, John Nolan wrote:

Putting in Inf for the upper bound does not work in general:
all 3 of the following should give 0.5


integrate( dnorm, 0, Inf )

0.5 with absolute error < 4.7e-05


integrate( dnorm, 0, Inf, sd=100000 )

Error in integrate(dnorm, 0, Inf, sd = 1e+05) :
the integral is probably divergent


integrate( dnorm, 0, Inf, sd=1000 )

5.570087e-05 with absolute error < 0.00010

Numerical quadrature methods look at a finite number of
points, and you can find examples that will confuse any
algorithm.  Rather than hope a general method will solve
all problems, users should look at their integrand and
pick an appropriate region of integration.

John Nolan, American U.


-r-devel-boun...@r-project.org wrote: -
To: r-devel@r-project.org
From: Pierre Chausse
Sent by: r-devel-boun...@r-project.org
Date: 12/07/2010 09:46AM
Subject: Re: [Rd] 0.5 != integrate(dnorm,0,20000) = 0

The warning about "absolute error == 0" would not be sufficient
because if you do
   > integrate(dnorm, 0, 5000)
2.326323e-06 with absolute error < 4.6e-06

We get reasonable absolute error and wrong answer. For very high upper
bound, it seems more stable to use "Inf". In that case, another
.External is used which seems to be optimized for high or low bounds:

   > integrate(dnorm, 0, Inf)
0.5 with absolute error < 4.7e-05


On 10-12-07 8:38 AM, John Nolan wrote:

I have wrestled with this problem before.  I think correcting
the warning to "absolute error ~<= 0" is a good idea, and printing
a warning if subdivisions==1 is also helpful.  Also, including
a simple example like the one that started this thread on the
help page for integrate might make the issue more clear to users.

But min.subdivisions is probably not.  On the example with dnorm( ),
I doubt 3 subdivisions would work.  The problem isn't that
we aren't subdividing enough, the problem is that the integrand
is 0 (in double precision) on most of the region and the
algorithm isn't designed to handle this.  There is no way to
determine how many subdivisions are necessary to get a reasonable
answer without a detailed analysis of the integrand.

I've gotten useful results with integrands that are monotonic on
the tail with a "self-trimming integration" routine
like the following:


right.trimmed.integrate<- function( f, lower, upper, epsilon=1e-100, 
min.width=1e-10, ... ) {

+ # trim the region of integration on the right until f(x)>epsilon
+
+ a<- lower; b<- upper
+ while ( (b-a>min.width)&&(f(b)
right.trimmed.integrate( dnorm, 0, 2 )  # test

0.5 with absolute error<9.2e-05

This can be adapted to left trim or (left and rig
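
The routine above is cut off in the archive, so what follows is a reconstruction under stated assumptions rather than the original code. It assumes the trimming step halves the interval towards the lower limit while the integrand is below epsilon at the right end, and it uses the upper limit 20000 from the thread subject. The name right_trimmed_integrate is mine.

```r
# Sketch of a right-trimmed integration routine (assumed bisection rule)
right_trimmed_integrate <- function(f, lower, upper, epsilon = 1e-100,
                                    min.width = 1e-10, ...) {
  a <- lower
  b <- upper
  # shrink the right end while the integrand is negligible there
  while ((b - a > min.width) && (f(b) < epsilon)) {
    b <- (a + b) / 2
  }
  integrate(f, a, b, ...)
}

res <- right_trimmed_integrate(dnorm, 0, 20000)
res$value   # should be close to 0.5
```

As noted in the message, this only makes sense for integrands that are monotonic in the tail; a function that is zero at the right end but has mass further in could be trimmed too far.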

[Rd] License statement

2010-12-22 Thread David Scott

 I am writing a package for a company for its internal use only.

What is an appropriate license statement for the DESCRIPTION file?

I would like a statement which reflects the private and proprietary 
nature of the package, giving copyright to the writer and the company. I 
also don't want to violate the licensing of R and the packages I am 
using (RODBC, ggplot2, zoo).


David Scott



Re: [Rd] Request: Suggestions for "good teaching" packages, esp. with C code

2011-02-15 Thread David Scott

On 16/02/2011 7:04 a.m., Paul Johnson wrote:

Hello,

I am looking for CRAN packages that don't teach bad habits.  Can I
have suggestions?

I don't mean the recommended packages that come with R, I mean the
contributed ones.  I've been sampling a lot of examples and am
surprised that many ignore seemingly agreed-upon principles of R
coding. In r-devel, almost everyone seems to support the "functional
programming" theme in Chambers's book on Software For Data Analysis,
but when I go look at randomly selected packages, programmers don't
follow that advice.

In particular:

1. Functions must avoid "mystery variables from nowhere."

Consider a function's code, it should not be necessary to say "what's
variable X?" and go hunting in the commands that lead up to the
function call.  If X is used in the function, it should be in a named
argument, or extracted from one of the named arguments.  People who
rely on variables floating around in the user's environment are
creating hard-to-find bugs.

2. We don't want functions with indirect effects (no <<- ), almost always.

3. Code should be vectorized where possible, C style for loops over
vector members should be avoided.

4. We don't want gratuitous use of "return" at the end of functions.
Why do people still do that?


Well I for one (and Jeff as well it seems) think it is good programming 
practice. It makes explicit what is being returned eliminating the 
possibility of mistakes and provides clarity for anyone reading the code.


David Scott



5. Neatness counts.  Code should look nice!  Check out how beautiful
the functions in MASS look! I want code with spaces and "<- " rather
than  everything jammed together with "=".

I don't mean to criticize any particular person's code in raising this
point.  For teaching examples, where to focus?

Here's one candidate I've found:

MNP.  as far as I can tell, it meets the first 4 requirements.  And it
has some very clear C code with it as well. I'm only hesitant there
because I'm not entirely sure that a package's C code should introduce
its own functions for handling vectors and matrices, when some general
purpose library might be more desirable.  But that's a small point,
and clarity and completeness counts a great deal in my opinion.










Re: [Rd] Request: Suggestions for "good teaching" packages, esp. with C code

2011-02-15 Thread David Scott

On 16/02/2011 11:43 a.m., ken.willi...@thomsonreuters.com wrote:


On 2/15/11 4:35 PM, "Gabor Grothendieck"  wrote:


I think the real good programming practice is to have a single point
of exit at the bottom.


I disagree, it can be extremely useful to exit early from a function.  It
can also make the code much more clear by not having 95% of the body in a
huge else{} block.



If that is how you program all your functions
then you don't need to explicitly put a return in since it always
returns from the bottom anyways and the return would just clutter your
code.


For someone else reading your code, they wouldn't know that you always do
this unless they're very familiar with your coding style.  Even then, it
needs to be manually checked by inspection because nobody sticks with the
"rule" 100% of the time, so it renders the benefit moot.

--
Ken Williams
Senior Research Scientist
Thomson Reuters
Phone: 651-848-7712
ken.willi...@thomsonreuters.com
http://labs.thomsonreuters.com



Some interesting discussion on this point. Enlightening for me at least.

A quick test showed me that an explicit return does produce about a 20% 
time hit in a one-line function (obviously a lesser % in a non-trivial 
function) but enough to convince me not to use an explicit return in 
functions where what is being returned is obvious.
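
A test along these lines can be repeated as follows. This is a sketch, not the code I originally ran; the function names are illustrative, and the size of any difference will depend on the R version and on byte compilation.

```r
f_explicit <- function(x) return(x + 1)   # explicit return
f_implicit <- function(x) x + 1           # value of the last expression

n <- 2e5
t_explicit <- system.time(for (i in seq_len(n)) f_explicit(i))[["elapsed"]]
t_implicit <- system.time(for (i in seq_len(n)) f_implicit(i))[["elapsed"]]

# elapsed seconds for each variant
c(explicit = t_explicit, implicit = t_implicit)
```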


Gabor's point is a good one, there *should* be a single exit point at 
the bottom, but I have certainly had situations where an early exit 
seems preferable as Ken suggests. Then an explicit return may make the 
code sufficiently clear for a violation of Gabor's principle to be 
acceptable.
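
To make the distinction concrete, here is a small made-up function that uses explicit returns only for the early exits and lets the normal result fall out of the bottom. The function itself is hypothetical and exists only to show the pattern.

```r
# Hypothetical helper, for illustration only
safe_log_mean <- function(x) {
  if (length(x) == 0L) return(NA_real_)            # early exit: nothing to average
  if (any(x <= 0, na.rm = TRUE)) return(NA_real_)  # early exit: log undefined
  mean(log(x), na.rm = TRUE)                       # normal result at the bottom
}

safe_log_mean(c(1, exp(2)))
safe_log_mean(c(-1, 2))
```

Written with a single exit point, the same logic would need the main computation nested inside an else block, which is the situation Ken describes.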


David Scott








Re: [Rd] First package submission to CRAN

2011-06-22 Thread David Scott

 On 23/06/11 08:34, Christophe Dutang wrote:

Hi,

By default, R CMD build makes sources, you have to use --binary if you want to 
get binaries. But you have to submit sources to the CRAN ftp server (and not 
binary). So just run a R CMD build.

C

--
Christophe Dutang
Ph.D. student at ISFA, Lyon, France
website: http://dutangc.free.fr


That is now deprecated Christophe. Recommended now is

R CMD INSTALL --build

to get a binary. See the recent thread with the subject

Porting "unmaintained" packages to post R 2.10.0 era

David Scott


Le 22 juin 2011 à 22:12, steven mosher a écrit :


I'm preparing to submit my first package to CRAN, thanks to the help of too
many people to mention.

I've built and checked the package on Windows  ( making a zip) and my path
points to the 64 bit version of R.

Everything builds and checks and the final warnings have been fixed. My
package is pure R with no source from

other languages.  My questions are  as follows. I've read the docs and just
need a bit of clarification.

1. For submission I should just build source  R CMD build mypkg  which
outputs a tar.gz
2. Do I have to/ how do I build for 32 bit?

Thanks,

Steve






Re: [Rd] [FORGED] Re: Error in texi2dvi

2016-06-20 Thread David Scott



On 20/06/2016 11:50 p.m., Mikko Korpela wrote:

On 18/06/16 05:21, Spencer Graves wrote:

Hello:


   Changes in R seem to have broken the sos vignette, and I don't
know how to fix it.  The build on R-forge,
"https://r-forge.r-project.org/R/?group_id=235&log=build_src&pkg=sos&flavor=patched", 


ends as follows:


Error in texi2dvi(file = file, pdf = TRUE, clean = clean, quiet = 
quiet,  :

   Running 'texi2dvi' on 'sos.tex' failed.
LaTeX errors:
! Undefined control sequence.
l.1 \Sconcordance
  {concordance:sos.tex:sos.Rnw:%
The control sequence at the end of the top line
of your error message was never \def'ed. If you have
Calls:  -> texi2pdf -> texi2dvi
Execution halted



...


   What can I do to get this vignette to working again with the
least amount of work?


Try commenting out (removing) the following line in your vignette:

\SweaveOpts{concordance=TRUE}

The package builds just fine after doing that, on this computer. The 
R-Forge history of sos.Rnw shows the line was added on May 12, 
revision 237.



This is an annoying RStudio *feature*.

Much as I appreciate RStudio's contribution to the R community, I don't 
approve of software which adds code to your document without informing you.


David Scott



Re: [Rd] issue with data()

2021-02-17 Thread David Scott
I would recommend option 2. I have done that when changes to xtable broke some 
packages. xtable has a number of dependencies but not on the scale of survival. 
Just 4 packages out of 868 seems minimal to me.
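
For the affected maintainers the change should be small. A sketch of what the replacement might look like, assuming survival is installed and its data sets are lazy-loaded (which the documented usage "lung" suggests):

```r
# Refer to the data set directly instead of calling data(lung)
if (requireNamespace("survival", quietly = TRUE)) {
  lung <- survival::lung        # no data() call, so no file name lookup
  str(lung, max.level = 0)
}

# If a data() call is really wanted, it would have to name the bundle,
# e.g. data(cancer, package = "survival"), going by the file name data/cancer.rda
```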

David Scott

On 17/02/2021 3:39 am, Therneau, Terry M., Ph.D. via R-devel wrote:
I am testing out the next release of survival, which involves running R CMD 
check on 868
CRAN packages that import, depend or suggest it.

The survival package has a lot of data sets, most of which are non-trivial real 
examples
(something I'm proud of).  To save space I've bundled many of them, e.g., 
data/cancer.rda
has 19 different dataframes.

This caused failures in 4 packages, each because they have a line such as 
"data(lung)"  or
data(breast, package= "survival"); and the data() command looks for a file name.

This is a question about which option is considered the best (perhaps more of a 
poll),
between two choices

1. unbundle them again  (it does save 1/3 of the space, and I do get complaints 
from R CMD
build about size)
2. send notes to the 4 maintainers.  The help files for the data sets have the 
usage
documented as  "lung" or "breast", and not data(lung), so I am technically 
legal to claim
they have a mistake.

A third option to make the data sets a separate package is not on the table.  I 
use them
heavily in my help files and test suite, and since survival is a recommended 
package I
can't add library(x) statements for  !(x %in% recommended).   I am guessing 
that this
would also break many dependent packages.

Terry T.

--
Terry M Therneau, PhD
Department of Health Science Research
Mayo Clinic
thern...@mayo.edu<mailto:thern...@mayo.edu>

"TERR-ree THUR-noh"






Re: [Rd] string concatenation operator (revisited)

2021-12-06 Thread David Scott
I am surprised nobody so far has mentioned glue which is an 
implementation in R of a python idiom.

It is a reverse import in a great number of R packages on CRAN. It 
specifies how some of the special cases so far considered are treated 
which seems an advantage:

 > library(glue)
 > glue(NA, 2)
NA2
 > glue(NA, 2, .sep = " ")
NA 2
 > glue(NA, 2, .na = NULL)
NA
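
For comparison with the base functions discussed in the thread, the NA-propagating behaviour some posters ask for can be sketched in a few lines. The operator name %+% is hypothetical (it is not part of base R or glue), and the sketch assumes both operands are non-empty.

```r
# Hypothetical concatenation operator that returns NA when either side is NA
`%+%` <- function(a, b) {
  out <- paste0(a, b)                      # usual recycling via paste0
  n <- length(out)
  is_missing <- is.na(rep_len(a, n)) | is.na(rep_len(b, n))
  out[is_missing] <- NA_character_         # propagate NA instead of "NA"
  out
}

paste0("abc", NA)        # "abcNA": current base behaviour
"abc" %+% NA             # NA
"abc" %+% c("x", NA)     # "abcx" NA
```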

David Scott

On 7/12/2021 1:20 pm, Gabriel Becker wrote:
> As I recall, there was a large discussion related to that which 
> resulted in
> the recycle0 argument being added (but defaulting to FALSE) for
> paste/paste0.
>
> I think a lot of these things ultimately mean that if there were to be a
> string concatenation operator, it probably shouldn't have behavior
> identical to paste0. Was that what you were getting at as well, Bill?
>
> ~G
>
> On Mon, Dec 6, 2021 at 4:11 PM Bill Dunlap  
> wrote:
>
> > Should paste0(character(0), c("a","b")) give character(0)?
> > There is a fair bit of code that assumes that paste("X",NULL) gives "X"
> > but c(1,2)+NULL gives numeric(0).
> >
> > -Bill
> >
> > On Mon, Dec 6, 2021 at 1:32 PM Duncan Murdoch 
> > wrote:
> >
> >> On 06/12/2021 4:21 p.m., Avraham Adler wrote:
> >> > Gabe, I agree that missingness is important to factor in. To somewhat
> >> abuse
> >> > the terminology, NA is often used to represent missingness. Perhaps
> >> > concatenating character something with character something missing
> >> should
> >> > result in the original character?
> >>
> >> I think that's a bad idea. If you wanted to represent an empty string,
> >> you should use "" or NULL, not NA.
> >>
> >> I'd agree with Gabe, paste0("abc", NA) shouldn't give "abcNA", it 
> should
> >> give NA.
> >>
> >> Duncan Murdoch
> >>
> >> >
> >> > Avi
> >> >
> >> > On Mon, Dec 6, 2021 at 3:35 PM Gabriel Becker 
> >> wrote:
> >> >
> >> >> Hi All,
> >> >>
> >> >> Seeing this and the other thread (and admittedly not having clicked
> >> through
> >> >> to the linked r-help thread), I wonder about NAs.
> >> >>
> >> >> Should NA  "hi there" not result in NA_character_? This 
> is not
> >> >> what any of the paste functions do, but in my opinion, NA +
> >> 
> >> >> seems like it should be NA (not "NA"), particularly if we are 
> talking
> >> >> about `+` overloading, but potentially even in the case of a 
> distinct
> >> >> concatenation operator?
> >> >>
> >> >> I guess what I'm saying is that in my head missingness propagation
> >> rules
> >> >> should take priority in such an operator (ie NA +  should
> >> >> *always * be NA).
> >> >>
> >> >> Is that something others disagree with, or has it just not come 
> up yet
> >> in
> >> >> (the parts I have read) of this discussion?
> >> >>
> >> >> Best,
> >> >> ~G
> >> >>
> >> >> On Mon, Dec 6, 2021 at 10:03 AM Radford Neal 
> 
> >> >> wrote:
> >> >>
> >> >>>>> In pqR (see pqR-project.org), I have implemented ! and !! as 
> binary
> >> >>>>> string concatenation operators, equivalent to paste0 and paste,
> >> >>>>> respectively.
> >> >>>>>
> >> >>>>> For instance,
> >> >>>>>
> >> >>>>> > "hello" ! "world"
> >> >>>>> [1] "helloworld"
> >> >>>>> > "hello" !! "world"
> >> >>>>> [1] "hello world"
> >> >>>>> > "hello" !! 1:4
> >> >>>>> [1] "hello 1" "hello 2" "hello 3" "hello 4"
> >> >>>>
> >> >>>> I'm curious about the details:
> >> >>>>
> >> >>>> Would `1 ! 2` convert both to strings?
> >> >>>
> >> >>> They're equivalent to paste0 and paste, so 1 ! 2 produces "12", 
> just
> >> >>> like paste0(1,2) does. Of course, they wouldn't have to be exactly
> >> >>> equivalent

Re: [Rd] hist(..., log="y")

2023-08-07 Thread David Scott
Log histograms are of particular interest when dealing with heavy tailed 
data/distributions.

It is not just a matter of using a log scale on the y axis though 
because the base line of the histogram is at zero and the log of zero is 
minus infinity.

I have implemented a version of a log histogram in the function logHist, 
in my package DistributionUtils, which may be of interest if anyone 
seriously wishes to add functionality to the base hist function.
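
To illustrate the baseline problem, here is a rough sketch of one way around it: compute the histogram without plotting, drop the empty bins, and plot the log of the density for the rest. This is not the logHist implementation, just a minimal illustration, and log_hist_data is a made-up name.

```r
# Sketch: log-density values for the non-empty bins of a histogram
log_hist_data <- function(x, breaks = "Sturges") {
  h <- hist(x, breaks = breaks, plot = FALSE)
  keep <- h$counts > 0                 # empty bins would give log(0) = -Inf
  data.frame(mid = h$mids[keep], log_density = log(h$density[keep]))
}

set.seed(1)
d <- log_hist_data(rexp(1000, 1))
plot(d$mid, d$log_density, type = "b",
     xlab = "x", ylab = "log(density)")
```

For exponential data the points should fall roughly on a straight line, which is the kind of tail behaviour a log histogram is meant to show.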

David Scott

On 7/08/2023 8:54 pm, Martin Maechler wrote:
> >>>>> Ott Toomet
> >>>>> on Sat, 5 Aug 2023 23:49:38 -0700 writes:
>
> > Sorry if this topic has been discussed earlier.
>
> > Currently, hist(..., log="y") fails with
>
> >> hist(rexp(1000, 1), log="y")
> > Warning messages: 1: In plot.window(xlim, ylim, "", ...) :
> > nonfinite axis=2 limits [GScale(-inf,2.59218,..);
> > log=TRUE] -- corrected now 2: In title(main = main, sub =
> > sub, xlab = xlab, ylab = ylab, ...) : "log" is not a
> > graphical parameter 3: In axis(1, ...) : "log" is not a
> > graphical parameter 4: In axis(2, at = yt, ...) : "log" is
> > not a graphical parameter
>
> > The same applies for log="x"
>
> [...]
>
> > This applies for the current svn version of R, and also a
> > few recent published versions. This is unfortunate for
> > two reasons:
>
> > * the error message is not quite correct--"log" is a
> > graphical parameter, but "hist" does not support it.
>
> No, not if you use R's (or S's before that) definition:
>
> graphical parameters := {the possible argument of par()}
>
> log is *not* among these.
>
>
> > * for various kinds of data it is worthwhile to make
> > histograms in log scale. "hist" is a very nice and
> > convenient function and support for log scale would be
> > handy here.
>
> Yes, possibly (see below).
> Note that the above are not errors, but warnings,
> and there *is* some support, e.g.,
>
> > set.seed(1); range(x <- rlnorm(1111))
> [1] 0.04938796 45.16293285
> > hx <- hist(x, log="x", xlim=c(0.049, 47))
> Warning messages:
> 1: In title(main = main, sub = sub, xlab = xlab, ylab = ylab, ...) :
> "log" is not a graphical parameter
> 2: In axis(1, ...) : "log" is not a graphical parameter
> 3: In axis(2, at = yt, ...) : "log" is not a graphical parameter
>
> > str(hx)
> List of 6
> $ breaks : num [1:11] 0 5 10 15 20 25 30 35 40 45 ...
> $ counts : int [1:10] 1041 58 10 0 1 0 0 0 0 1
> $ density : num [1:10] 0.1874 0.01044 0.0018 0 0.00018 ...
> $ mids : num [1:10] 2.5 7.5 12.5 17.5 22.5 27.5 32.5 37.5 42.5 47.5
> $ xname : chr "x"
> $ equidist: logi TRUE
> - attr(*, "class")= chr "histogram"
>
> where we see that it *does* plot ... but crucially not the very first bin,
> because log(0) == -Inf, with over 90% (viz. 1041) counts.
>
> > I also played a little with the code, and it seems to be
> > very easy to implement. I am happy to make a patch if the
> > team thinks it is worth pursuing.
>
> > Cheers, Ott
>
> Yeah.. and that is the important question.
>
> Most statisticians know that a histogram is a pretty bad
> density estimator (notably if the natural density has an
> infinite support) compared to simple kernel density estimates,
> e.g. those by density().
> Hence, I'd argue that if you expect enough sophistication from
> your "viewer"s to understand a log-scale histogram, you
> should use a density with log="x" and/or "y", and I have
> successfully done so several times: It *does* work
> {particularly nicely if you use my sfsmisc::eaxis() for the log axis/es}.
>
> But you (and others) may have more good arguments why hist()
> should work with log="x" and/or log="y"...
>
> Also if your patch is relatively small, its usefulness may
> outweigh the added complexity (and its long-term maintenance !).
>
> Martin
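
For reference, the two routes discussed above can be sketched in base R without passing log= to hist() at all. This is only an illustration under assumptions of my own: the simulated sample, the from = min(x) restriction (which keeps the density grid strictly positive so that a log axis is valid), the 1% padding and the choice of 20 geometrically spaced bins are not taken from the thread.

```r
set.seed(1)
x <- rlnorm(1000)   # simulated positive data; the sample size is arbitrary

## Route 1: kernel density estimate drawn on a log x axis.
## 'from = min(x)' keeps every evaluation point > 0 (an assumption made
## here so that log = "x" does not have to discard non-positive values).
d <- density(x, from = min(x))
plot(d, log = "x", main = "Kernel density estimate, log x axis")

## Route 2: histogram with geometrically spaced breaks, computed by hist()
## but drawn by hand so that the log axis goes to plot(), not to hist().
pad <- 1.01         # widen the range slightly so every point falls in a bin
brk <- exp(seq(log(min(x) / pad), log(max(x) * pad), length.out = 21))
h <- hist(x, breaks = brk, plot = FALSE)
plot(h$mids, h$density, type = "h", log = "x",
     xlab = "x", ylab = "Density",
     main = "Histogram with log-spaced bins")
```

Because the bins in the second plot have unequal widths, it is the density component rather than the counts that is comparable across bins.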

-- 
_
David Scott
Department of Statistics
The University of Auckland, PB 92019
Auckland 1142,NEW ZEALAND
Email:d.sc...@auckland.ac.nz




Re: [Rd] Undocumented functions

2011-12-16 Thread David Scott
One easy way is to list the undocumented functions in pkg-internal.Rd. From 
the Writing R Extensions manual:


Note that all user-level objects in a package should be documented; if a 
package pkg contains user-level objects which are for “internal” use 
only, it should provide a file pkg-internal.Rd which documents all such 
objects, and clearly states that these are not meant to be called by the 
user. See e.g. the sources for package *grid* in the R distribution for 
an example.


Probably a perverse use of this facility, but it works, and will even 
allow the package to pass check.
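
A rough sketch of what such a file might contain, generated from R so that it can be parsed as a quick sanity check. The package name mypkg and the function names helperOne and helperTwo are hypothetical, and the wording of the title and description is only a suggestion.

```r
## Build the lines of a minimal <pkg>-internal.Rd file.
## 'pkg' and 'funs' are placeholders supplied by the caller.
internal_rd <- function(pkg, funs) {
  c(sprintf("\\name{%s-internal}", pkg),
    sprintf("\\alias{%s-internal}", pkg),
    sprintf("\\alias{%s}", funs),        # one \alias per undocumented function
    sprintf("\\title{Internal %s functions}", pkg),
    "\\description{",
    "  These functions are not meant to be called by the user.",
    "}",
    "\\keyword{internal}")
}

rd_lines <- internal_rd("mypkg", c("helperOne", "helperTwo"))
rd_file  <- file.path(tempdir(), "mypkg-internal.Rd")
writeLines(rd_lines, rd_file)

## Parse the file to confirm that it is syntactically valid Rd.
rd <- tools::parse_Rd(rd_file)
```

In a real package the file would live in the man/ directory; tempdir() is used here only so the sketch can be run anywhere.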


David Scott


On 16/12/2011 1:01 a.m., Nicola Sturaro Sommacal wrote:

Hi!

I am building a package. This package will not be submitted to CRAN.

I have written help files for the most important functions of my package,
but I cannot write them for all functions. This may sound strange, but there it is!

I know that all user-level functions should be documented, so I have to
move my undocumented functions to a non-user level. Is that right?

To move my functions to a non-user level I could write them as hidden
functions, with a dot before their names. That would require a very long pass
through my code to prefix every call to those functions with a dot, so this
is not a real option.
Is there another way to achieve this?

Thank you very much for help.

Sincerely,
Nicola



--
_
David Scott Department of Statistics
The University of Auckland, PB 92019
Auckland 1142,NEW ZEALAND
Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055
Email:  d.sc...@auckland.ac.nz,  Fax: +64 9 373 7018



Re: [Rd] How to organized code in the R/ directory of a package?

2009-12-10 Thread David Scott

Tobias Verbeke wrote:

Peng Yu wrote:

I'm making a package. Currently, I put all R files in the R/ directory
of the package (without using subdirectories). This will become a
problem when there are many files in the directory. I'm wondering how
to use subdirectories in R/?


The standard solution is (I would think) to organize the code
such that functions belonging together are grouped in one file.



That is what I try to do. My rule is that functions which are documented 
in the same .Rd file are also in the same .R file. Might be more 
sensible for some packages than for others.
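
One way to check a source tree against this rule is sketched below. It assumes, as a convention of my own for the illustration, that the paired files share a base name (foo.R with foo.Rd); the function unmatched_files and the demo package layout are hypothetical.

```r
## Report .R files with no same-named .Rd file, and vice versa.
unmatched_files <- function(pkg_dir) {
  r_names  <- tools::file_path_sans_ext(
    list.files(file.path(pkg_dir, "R"), pattern = "\\.[Rr]$"))
  rd_names <- tools::file_path_sans_ext(
    list.files(file.path(pkg_dir, "man"), pattern = "\\.[Rr]d$"))
  list(R_without_Rd = setdiff(r_names, rd_names),
       Rd_without_R = setdiff(rd_names, r_names))
}

## Demo on a throwaway skeleton with empty files.
pkg <- file.path(tempdir(), "demoPkg")
dir.create(file.path(pkg, "R"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(pkg, "man"), showWarnings = FALSE)
file.create(file.path(pkg, "R",   c("fitModel.R", "utils.R")))
file.create(file.path(pkg, "man", c("fitModel.Rd", "plotModel.Rd")))

res <- unmatched_files(pkg)
print(res)
```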


David

_____
David Scott Department of Statistics
The University of Auckland, PB 92019
Auckland 1142,NEW ZEALAND
Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055
Email:  d.sc...@auckland.ac.nz,  Fax: +64 9 373 7018

Director of Consulting, Department of Statistics



Re: [Rd] Using SVN + SSH on windows

2010-03-27 Thread David Scott

Uwe Ligges wrote:
It is really not hard to set it up. I am using a vanilla ssh (rather 
than putty) and that works fine all the time...


Uwe Ligges



Ditto here. I am using ssh non commercial version with tortoise on 
Vista, and I don't recall any problems setting it up. R-forge works 
perfectly fine with windows and tortoise in my experience.


Is your putty/ssh working? Can you access other machines with it? I do 
recall ssh can be a bit fussy.


David Scott




On 27.03.2010 18:31, Gabor Grothendieck wrote:

s getting commits to R-Forge to work from
Windows.  The entire system is really geared to UNIX.  It took me a
couple of days of trial and error (since you have to wait 20 minutes
for each try) before I got it working.  Although I did get it to work,
I ultimately decided to host all my packages on googlecode.
googlecode is extremely easy to use from Windows and does not require
any public/private key, pageant, etc.  e.g.
http://sqldf.googlecode.com.  If you already have TortoiseSVN and know
how to use it then you can probably set up a googlecode site in
literally 5 minutes.

One other possibility.  I think there is a way to host your project on
googlecode but still have it mirrored on R-forge so from your users'
viewpoint it's the same as if it were on R-Forge but you can use the
simpler googlecode site.  In that case you might not need to set up
commits on R-Forge since you would do all your commits through
googlecode (depending on how it works) but I have not seen good
documentation on how to do this.






--
_____
David Scott Department of Statistics
The University of Auckland, PB 92019
Auckland 1142,NEW ZEALAND
Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055
Email:  d.sc...@auckland.ac.nz,  Fax: +64 9 373 7018

Director of Consulting, Department of Statistics



Re: [Rd] section needed in FAQ - Using R (PR#9698)

2007-05-21 Thread David Scott
>>
>> By the way, the R program itself is really badly in need of an
>> "appropos()" to
>>

I don't normally pick on spelling errors, and my reply is out of sequence 
because I deleted Peter's message replying to this.

?appropos

will give a not found reply

?apropos

will give some useful information
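
As a small illustration of what the correctly spelled function does: apropos() searches the objects on the search path by regular expression and returns the matching names. The pattern below is just an example.

```r
## Names of visible objects that start with "apro".
hits <- apropos("^apro")
print(hits)

## The misspelled name is not defined; the correct one is.
exists("appropos")
exists("apropos")
```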

David Scott


_____
David Scott Department of Statistics, Tamaki Campus
The University of Auckland, PB 92019
Auckland 1142,NEW ZEALAND
Phone: +64 9 373 7599 ext 86830 Fax: +64 9 373 7000
Email:  [EMAIL PROTECTED]

Graduate Officer, Department of Statistics
Director of Consulting, Department of Statistics



[Rd] NEWS file

2008-04-14 Thread David Scott

Is there Emacs support for creating a NEWS file for a package? If so where 
could I find it? I had a look at the GNU coding standards on documenting 
programs. It has a bit on Emacs and Change Logs but not concerning a NEWS 
file as far as I could see.

David Scott

_
David Scott Department of Statistics, Tamaki Campus
The University of Auckland, PB 92019
Auckland 1142,NEW ZEALAND
Phone: +64 9 373 7599 ext 86830 Fax: +64 9 373 7000
Email:  [EMAIL PROTECTED]

Graduate Officer, Department of Statistics
Director of Consulting, Department of Statistics
