Re: [Rd] RFC: Proposal to make NROW() and NCOL() slightly more general

2012-02-07 Thread Martin Maechler
> Martin Maechler 
> on Mon, 6 Feb 2012 15:35:36 +0100 writes:

>> On Sat, Feb 4, 2012 at 10:38 AM, Martin Maechler
>>  wrote:
>> > The help has
>> >
>> >> Description:
>> >
>> >>   'nrow' and 'ncol' return the number of rows or columns present in 'x'.
>> >>   'NCOL' and 'NROW' do the same treating a vector as 1-column matrix.
>> >
>> > and
>> >
>> >>   x: a vector, array or data frame
>> >
>> > I'm proposing to extend these two convenience functions
>> > to also work ``correctly'' for generalized versions of matrices.
>> >
>> >
>> > The current implementation :
>> >
>> > NROW <- function(x) if(is.array(x)||is.data.frame(x)) nrow(x) else length(x)
>> > NCOL <- function(x) if(is.array(x) && length(dim(x)) > 1L || is.data.frame(x)) ncol(x) else 1L
>> >
>> > only treats something as matrix when  is.array(.) is true,
>> > which is not the case, e.g., for multiprecision matrices from
>> > package 'gmp' or for matrices from packages SparseM, Matrix or similar.
>> >
>> > Of course, all these packages could write methods for NROW, NCOL
>> > for their specific matrix class, but given that the current
>> > definition is so simple,
>> > I'd find it an unnecessary complication.
>> >
>> > Rather I propose the following new version
>> >
>> > NROW <- function(x) if(length(dim(x)) || is.data.frame(x)) nrow(x) else length(x)
>> > NCOL <- function(x) if(length(dim(x)) > 1L || is.data.frame(x)) ncol(x) else 1L

>> That makes me wonder about:

>> DIM <- function(x) if (length(dim(x)) > 1L) dim(x) else c(length(x), 1L)

>> or maybe more efficiently:

>> DIM <- function(x) {
>> d <- dim(x)
>> if (length(d) > 1L) d else c(length(x), 1L)
>> }

>> given that dim() is not always trivial to compute (e.g. for data
>> frames it can be rather slow if you're doing it for hundreds of data
>> frames)

>> then NROW and NCOL could be exact equivalents to nrow and ncol.

>> Hadley

> Thank you, Hadley.
> Indeed, your suggestion seems to make sense
> {as far as it makes sense to have such simple functions to
> exist in base at all, but as we already have NROW and NCOL ..}

> So, I propose to adopt Hadley's  DIM() proposal, modified to

> DIM <- function(x) if(length(d <- dim(x))) d else c(length(x), 1L)

> and wait a day or so (or longer for reasons of vacation!) before
> committing it, so the public can raise opinions.

Actually, the above (building NROW() and NCOL() on DIM()) is
not quite correct:
 
  NCOL <- function(x) DIM(x)[2L]

will fail for   x <- array(1:3, 3)   (dim(x) has length 1, so DIM(x)[2L] is NA)

so I think I'll stick for now with the generalizations to NROW()
and NCOL(), using

NROW <- function(x) if(length(d <- dim(x)))  d[1L] else length(x)
NCOL <- function(x) if(length(d <- dim(x)) > 1L) d[2L] else 1L

which incorporates Hadley's note that there are cases where
dim(.) is ``relatively expensive''.
Note that the above are also (very slightly) more efficient than
basing them on DIM(.).
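
To illustrate the difference, here is a minimal sketch. The names NROW_new,
NCOL_new and NROW_old and the toy S3 class "toymat" are made up for this
example and are not part of base R; the function bodies are the ones quoted
in this thread.

```r
# Definitions from this message, renamed to avoid masking base R
NROW_new <- function(x) if (length(d <- dim(x))) d[1L] else length(x)
NCOL_new <- function(x) if (length(d <- dim(x)) > 1L) d[2L] else 1L
DIM      <- function(x) if (length(d <- dim(x))) d else c(length(x), 1L)

# The implementation quoted as "current" at the top of the thread
NROW_old <- function(x) if (is.array(x) || is.data.frame(x)) nrow(x) else length(x)

# 1-d array: dim(x) has length 1, so indexing DIM(x) at 2 gives NA
x <- array(1:3, 3)
DIM(x)[2L]    # NA
NCOL_new(x)   # 1
NROW_new(x)   # 3

# Hypothetical S3 class with a dim() method but no "dim" attribute
m <- structure(list(data = 1:6, nr = 2L, nc = 3L), class = "toymat")
dim.toymat <- function(x) c(x$nr, x$nc)

is.array(m)   # FALSE
NROW_old(m)   # 3: falls back to length(m), the number of list components
NROW_new(m)   # 2
NCOL_new(m)   # 3
```

The toy class stands in for matrix-like classes whose dim() works although
is.array(.) is false; it does not reproduce any particular package's class.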

Martin

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Using custom R_LIBS with R CMD install

2012-02-07 Thread Hadley Wickham
Hi all,

Am I using the correct syntax to set a custom R_LIBS when running R
CMD INSTALL from the command line?

I get:

R_LIBS=/Users/hadley/R-dev R CMD INSTALL aL3xa-rapport-08e68ca/
# Desktop : R_LIBS=/Users/hadley/R-dev R CMD INSTALL aL3xa-rapport-08e68ca/
# * installing to library ‘/Users/hadley/R’
# ERROR: dependency ‘ascii’ is not available for package ‘rapport’

But:

ls /Users/hadley/R-dev/
# HandyStuff  animation  biOps    mcmcTools  mvbutils  quantreg  testthat
# SparseM     ascii      ggplot2  mturkr     nstest    scales    whisker

(I'm actually running this from inside R using system2, but I think
this is the essence of my misunderstanding)
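
One way to check whether a child process started via system2 sees R_LIBS at
all is sketched below. This is only a diagnostic under stated assumptions: a
Unix-alike shell, a throwaway directory (the name "R-dev-example" is
hypothetical) and a path without spaces. It does not install anything.

```r
# Hypothetical library directory, created only for this illustration
lib <- file.path(tempdir(), "R-dev-example")
dir.create(lib, showWarnings = FALSE)

# Start a child R process with R_LIBS set and print its library search path.
# --vanilla skips user and site startup files, which could otherwise
# change .libPaths() in the child.
rscript <- file.path(R.home("bin"), "Rscript")
paths <- system2(rscript,
                 args = c("--vanilla", "-e", shQuote("writeLines(.libPaths())")),
                 env = paste0("R_LIBS=", lib),
                 stdout = TRUE)
print(paths)

# TRUE if the child process picked up the directory from R_LIBS
basename(lib) %in% basename(paths)
```

If the directory shows up here but the package is still installed elsewhere,
naming the target explicitly with R CMD INSTALL --library=<dir> is another
thing to try.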

Hadley

-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/



[Rd] Canonical package directory name for JAR files?

2012-02-07 Thread Roebuck,Paul L
We have an R package which needs to include a JAR file.
Is there a canonical directory for it?



Re: [Rd] Canonical package directory name for JAR files?

2012-02-07 Thread Simon Urbanek

On Feb 7, 2012, at 4:34 PM, Roebuck,Paul L wrote:

> We have an R package which needs to include a JAR file.
> Is there a canonical directory for it?
> 

rJava defines "java" for that purpose (see ?.jpackage). How canonical that is 
may be open for debate ;)


Cheers,
Simon



[Rd] capture.output() is trying to allocate 17179869182.6 Gb on my not so big data.frame

2012-02-07 Thread Hervé Pagès

Hi,

This is what I get with recent R devel on a 64-bit Ubuntu laptop:

  > mydf <- data.frame(a=1:2080, b=1001:2040, c=letters, d=LETTERS, e=1:1040)

  > mydf_in_a_character_vector <- capture.output(mydf)
  Error in print.default(m, ..., quote = quote, right = right) :
cannot allocate memory block of size 17179869182.6 Gb

I get something similar with R 2.14.1.

Cheers,
H.

> sessionInfo()
R Under development (unstable) (2012-01-16 r58124)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319



Re: [Rd] capture.output() is trying to allocate 17179869182.6 Gb on my not so big data.frame

2012-02-07 Thread Martin Morgan

On 02/07/2012 04:08 PM, Hervé Pagès wrote:

> Hi,
>
> This is what I get with recent R devel on a 64-bit Ubuntu laptop:
>
>  > mydf <- data.frame(a=1:2080, b=1001:2040, c=letters, d=LETTERS, e=1:1040)
>  > mydf_in_a_character_vector <- capture.output(mydf)
> Error in print.default(m, ..., quote = quote, right = right) :
> cannot allocate memory block of size 17179869182.6 Gb


The error is thrown inside src/main/printarray.c:425

Rprintf("%*s%s", R_print.gap, "",
EncodeString(x[i + j * r], w[j], quote, right));

where the array w is the result of an unPROTECTed allocation earlier in
the function, and a garbage collection is triggered in MatrixRowLabel (in
this case; allocation also occurs in MatrixColLabel and Rprintf).
PROTECTion seems to have been implemented in the file on the assumption
that the only allocations are at the head of the function. The return in
the _PRINT_DEAL_c_eq_0 macro makes it difficult to balance the protection
stack, and R_alloc seems to be a better solution anyway. Diff attached.


Martin Morgan



> I get something similar with R 2.14.1.
>
> Cheers,
> H.
>
>  > sessionInfo()
> R Under development (unstable) (2012-01-16 r58124)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>  [7] LC_PAPER=C                 LC_NAME=C
>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base




--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793

Index: src/main/printarray.c
===
--- src/main/printarray.c	(revision 58290)
+++ src/main/printarray.c	(working copy)
@@ -137,7 +137,6 @@
 /* initialization; particularly of row labels, rl= dimnames(.)[[1]] and
  * rn = names(dimnames(.))[1] : */
 #define _PRINT_INIT_rl_rn\
-SEXP sw;		\
 int *w;		\
 int width, rlabw = -1, clabw = -1; /* -Wall */	\
 int i, j, jmin = 0, jmax = 0, lbloff = 0;		\
@@ -159,9 +158,8 @@
 
 _PRINT_INIT_rl_rn;
 
-sw = allocVector(INTSXP, c);
 x = LOGICAL(sx) + offset;
-w = INTEGER(sw);
+w = (int *) R_alloc(c, sizeof(int));
 /* compute w[j] = column-width of j(+1)-th column : */
 for (j = 0; j < c; j++) {
 	formatLogical(&x[j * r], r, &w[j]);
@@ -234,9 +232,8 @@
 
 _PRINT_INIT_rl_rn;
 
-sw = allocVector(INTSXP, c);
 x = INTEGER(sx) + offset;
-w = INTEGER(sw);
+w = (int *) R_alloc(c, sizeof(int));
 for (j = 0; j < c; j++) {
 	formatInteger(&x[j * r], r, &w[j]);
 	_PRINT_SET_clabw;
@@ -271,19 +268,14 @@
 static void printRealMatrix(SEXP sx, int offset, int r_pr, int r, int c,
 			SEXP rl, SEXP cl, const char *rn, const char *cn)
 {
-SEXP sd, se;
 double *x;
 int *d, *e;
 _PRINT_INIT_rl_rn;
 
-PROTECT(sd = allocVector(INTSXP, c));
-PROTECT(se = allocVector(INTSXP, c));
-sw = allocVector(INTSXP, c);
-UNPROTECT(2);
 x = REAL(sx) + offset;
-d = INTEGER(sd);
-e = INTEGER(se);
-w = INTEGER(sw);
+d = (int *) R_alloc(c, sizeof(int));
+e = (int *) R_alloc(c, sizeof(int));
+w = (int *) R_alloc(c, sizeof(int));
 
 for (j = 0; j < c; j++) {
 	formatReal(&x[j * r], r, &w[j], &d[j], &e[j], 0);
@@ -319,27 +311,18 @@
 static void printComplexMatrix(SEXP sx, int offset, int r_pr, int r, int c,
 			   SEXP rl, SEXP cl, const char *rn, const char *cn)
 {
-SEXP sdr, ser, swr, sdi, sei, swi;
 Rcomplex *x;
 int *dr, *er, *wr, *di, *ei, *wi;
 _PRINT_INIT_rl_rn;
 
-PROTECT(sdr = allocVector(INTSXP, c));
-PROTECT(ser = allocVector(INTSXP, c));
-PROTECT(swr = allocVector(INTSXP, c));
-PROTECT(sdi = allocVector(INTSXP, c));
-PROTECT(sei = allocVector(INTSXP, c));
-PROTECT(swi = allocVector(INTSXP, c));
-PROTECT(sw	= allocVector(INTSXP, c));
-UNPROTECT(7);
 x = COMPLEX(sx) + offset;
-dr = INTEGER(sdr);
-er = INTEGER(ser);
-wr = INTEGER(swr);
-di = INTEGER(sdi);
-ei = INTEGER(sei);
-wi = INTEGER(swi);
-w = INTEGER(sw);
+dr = (int *) R_alloc(c, sizeof(int));
+er = (int *) R_alloc(c, sizeof(int));
+wr = (int *) R_alloc(c, sizeof(int));
+di = (int *) R_alloc(c, sizeof(int));
+ei = (int *) R_alloc(c, sizeof(int));
+wi = (int *) R_alloc(c, sizeof(int));
+w = (int *) R_alloc(c, sizeof(int));
 
 /* Determine the column widths */
 
@@ -391,9 +374,8 @@
 SEXP *x;
 _PRINT_INIT_rl_rn;
 
-sw = allocVector(INTSXP, c);
 x = STRING_PTR(sx)+offset;
-w = INTEGER(sw);
+w = (int *) R_alloc(c, sizeof(int));
 for (j = 0; j < c; j++) {
 	formatString(&x[j * r], r, &w[j], quote);
 	_PRINT_SET_clabw;
@@ -437,9 +419,8 @@
 Rbyte *x;
 _PRINT_INIT_rl_rn;
 
-sw = allocVector(INTSXP, c);
 x