[Rd] [patch] Fix typo in 'rank' documentation

2017-08-24 Thread Jonathan Armond
I noticed a typo in the documentation for the 'rank' function.
Specifically, it describes ties.method="first" and contrasts with...
ties.method="first", when it should be ties.method="last".

Thanks,
Jon
Index: src/library/base/man/rank.Rd
===
--- src/library/base/man/rank.Rd  (revision 73116)
+++ src/library/base/man/rank.Rd  (working copy)
@@ -34,7 +34,7 @@
   (called \sQuote{ties}), the argument \code{ties.method} determines the
   result at the corresponding indices.  The \code{"first"} method results
   in a permutation with increasing values at each index set of ties, and
-  analogously \code{"first"} with decreasing values.  The
+  analogously \code{"last"} with decreasing values.  The
   \code{"random"} method puts these in random order whereas the
   default, \code{"average"}, replaces them by their mean, and
   \code{"max"} and \code{"min"} replaces them by their maximum and
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] [patch] Fix typo in 'rank' documentation

2017-08-24 Thread Martin Maechler
> Jonathan Armond 
> on Thu, 24 Aug 2017 07:10:08 +0100 writes:

> I noticed a typo in the documentation for the 'rank' function.
> Specifically, it describes ties.method="first" and contrasts with...
> ties.method="first", when it should be ties.method="last".

> Thanks,
> Jon

> --
> Index: src/library/base/man/rank.Rd
> ===
> --- src/library/base/man/rank.Rd  (revision 73116)
> +++ src/library/base/man/rank.Rd  (working copy)
...

Thank you, Jon.

Seems my mistake, I'm going to correct it.

Martin Maechler
ETH Zurich



[Rd] Are r2dtable and C_r2dtable behaving correctly?

2017-08-24 Thread Gustavo Fernandez Bayon
Hello,

While doing some enrichment tests using chisq.test() with simulated
p-values, I noticed some strange behaviour. The computed p-value was
extremely small, so I decided to dig a little deeper and debug
chisq.test(). I noticed then that the simulated statistics returned by the
following call

tmp <- .Call(C_chisq_sim, sr, sc, B, E)

were all the same, very small numbers. This, at first, seemed strange to
me. So I decided to do some simulations myself, and started playing around
with the r2dtable() function. Problem is, using my row and column
marginals, r2dtable() always returns the same matrix. Let's provide a
minimal example:

rr <- c(209410, 276167)
cc <- c(25000, 460577)
ms <- r2dtable(3, rr, cc)

I have tested this code on two machines, and both always returned the same
list of length three containing the same matrix three times. The repeated
matrix is the following:

[[1]]
      [,1]   [,2]
[1,] 10782 198628
[2,] 14218 261949

[[2]]
      [,1]   [,2]
[1,] 10782 198628
[2,] 14218 261949

[[3]]
      [,1]   [,2]
[1,] 10782 198628
[2,] 14218 261949

I also coded a small function returning the value of the chi-squared
statistic using the previous fixed marginals and taking the value at [1, 1]
as input. This helped me to plot a curve and notice that the repeating
matrix was the one that yielded the minimum chi-squared statistic.
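
A minimal sketch of such a helper, assuming the usual Pearson statistic
without continuity correction. The name chisq_from_n11 is hypothetical and
is not taken from the original message. With both marginals fixed, the
[1, 1] cell determines the other three cells of a 2 x 2 table.

```r
# Hypothetical helper: chi-squared statistic of a 2 x 2 table with fixed
# row sums rr and column sums cc, as a function of the [1, 1] cell.
chisq_from_n11 <- function(n11, rr, cc) {
  stopifnot(length(rr) == 2, length(cc) == 2, sum(rr) == sum(cc))
  n21 <- cc[1] - n11                 # rest of column 1
  n12 <- rr[1] - n11                 # rest of row 1
  n22 <- rr[2] - n21                 # remaining cell
  obs <- matrix(c(n11, n21, n12, n22), nrow = 2)
  expected <- outer(rr, cc) / sum(rr)
  sum((obs - expected)^2 / expected)
}

rr <- c(209410, 276167)
cc <- c(25000, 460577)
n11 <- 10700:10900
stat <- vapply(n11, chisq_from_n11, numeric(1), rr = rr, cc = cc)
n11[which.min(stat)]   # [1, 1] value that minimises the statistic
```

Plotting stat against n11 gives the curve mentioned above.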

This behaviour persists if I use greater marginals (adding 100000 to every
element of the marginals, for example),

> rr <- c(309410, 376167)
> cc <- c(125000, 560577)
> r2dtable(3, rr, cc)
[[1]]
      [,1]   [,2]
[1,] 56414 252996
[2,] 68586 307581

[[2]]
      [,1]   [,2]
[1,] 56414 252996
[2,] 68586 307581

[[3]]
      [,1]   [,2]
[1,] 56414 252996
[2,] 68586 307581

 but not if we use smaller ones:

> rr <- c(9410, 76167)
> cc <- c(25000, 60577)
> r2dtable(3, rr, cc)
[[1]]
      [,1]  [,2]
[1,]  2721  6689
[2,] 22279 53888

[[2]]
      [,1]  [,2]
[1,]  2834  6576
[2,] 22166 54001

[[3]]
      [,1]  [,2]
[1,]  2778  6632
[2,] 22222 53945

I have looked inside the C code for the C_r2dtable() and rcont2()
functions, but I cannot do much more than guess where this behaviour could
originate, so I would like to ask for help from anybody more experienced in
the R implementation. My guess is that there is some kind of inflection point
depending on the total sample size of the table, or maybe the generation
algorithm tends to output matrices concentrated around the minimum.
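
One way to quantify how much variation to expect is sketched below. It
relies on the fact that, for a 2 x 2 table with fixed marginals, the [1, 1]
cell follows a hypergeometric distribution, so rhyper() can serve as a
reference. The seed and the number of replicates (1000) are arbitrary
choices made for illustration, and the output will depend on the R version
in use.

```r
rr <- c(209410, 276167)
cc <- c(25000, 460577)
N  <- sum(rr)

set.seed(1)
tabs <- r2dtable(1000, rr, cc)
n11  <- vapply(tabs, function(m) as.numeric(m[1, 1]), numeric(1))

# Reference draws: cc[1] items drawn without replacement from a population
# with rr[1] items of one kind and rr[2] of the other.
ref <- rhyper(1000, m = rr[1], n = rr[2], k = cc[1])

# Theoretical standard deviation of the hypergeometric [1, 1] cell.
theo_sd <- sqrt(cc[1] * (rr[1] / N) * (rr[2] / N) * (N - cc[1]) / (N - 1))

c(distinct_r2dtable = length(unique(n11)),
  distinct_rhyper   = length(unique(ref)),
  sd_r2dtable       = sd(n11),
  sd_rhyper         = sd(ref),
  sd_theory         = theo_sd)
```

If r2dtable() samples correctly, its spread should be close to the
hypergeometric reference; a single distinct value, as in the output above,
would not be.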

This is the output from my sessionInfo()

> sessionInfo()
R version 3.4.0 (2017-04-21)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /usr/lib64/R/lib/libRblas.so

locale:
 [1] LC_CTYPE=es_ES.UTF-8       LC_NUMERIC=C               LC_TIME=es_ES.UTF-8
 [4] LC_COLLATE=es_ES.UTF-8     LC_MONETARY=es_ES.UTF-8    LC_MESSAGES=es_ES.UTF-8
 [7] LC_PAPER=es_ES.UTF-8       LC_NAME=C                  LC_ADDRESS=C
[10] LC_TELEPHONE=C             LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] profvis_0.3.3                           bindrcpp_0.2
 [3] FDb.InfiniumMethylation.hg19_2.2.0      org.Hs.eg.db_3.4.1
 [5] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2 GenomicFeatures_1.28.4
 [7] AnnotationDbi_1.38.2                    Biobase_2.36.2
 [9] GenomicRanges_1.28.4                    GenomeInfoDb_1.12.2
[11] IRanges_2.10.2                          S4Vectors_0.14.3
[13] BiocGenerics_0.22.0                     epian_0.1.0

loaded via a namespace (and not attached):
 [1] SummarizedExperiment_1.6.3 purrr_0.2.3                reshape2_1.4.2
 [4] lattice_0.20-35            htmltools_0.3.6            rtracklayer_1.36.4
 [7] blob_1.1.0                 XML_3.98-1.9               rlang_0.1.2
[10] foreign_0.8-67             glue_1.1.1                 DBI_0.7
[13] BiocParallel_1.10.1        bit64_0.9-7                matrixStats_0.52.2
[16] GenomeInfoDbData_0.99.0    bindr_0.1                  plyr_1.8.4
[19] stringr_1.2.0              zlibbioc_1.22.0            Biostrings_2.44.2
[22] htmlwidgets_0.9            psych_1.7.5                memoise_1.1.0
[25] biomaRt_2.32.1             broom_0.4.2                Rcpp_0.12.12
[28] DelayedArray_0.2.7         XVector_0.16.0             bit_1.1-12
[31] Rsamtools_1.28.0           mnormt_1.5-5               digest_0.6.12
[34] stringi_1.1.5              dplyr_0.7.2                grid_3.4.0
[37] tools_3.4.0                bitops_1.0-6               magrittr_1.5
[40] RCurl_1.95-4.8             tibble_1.3.3               RSQLite_2.0
[43] tidyr_0.7.0                pkgconfig_2.0.1            Matrix_1.2-9
[46] assertthat_0.2.0           R6_2.2.2                   GenomicAlignments_1.12.1
[49] nlme_3.1-131               compiler_3.4.0

Any hint or help would be much appreciated. We do not use the simulated
version of chisq.test() much at the lab, but I would like to understand
better what is happening.

Kind regards,
Gustavo


[Rd] loop compilation problem

2017-08-24 Thread Lukas Stadler
Hi!

We’ve seen a problem with the compiler in specific cases of matrix updates:

> { m <- matrix(1:4, 2) ; z <- 0; for(i in 1) { m[z <- z + 1,z <- z + 1] <- 99; } ; m }
     [,1] [,2]
[1,]    1    3
[2,]    2   99

Here, it modifies element [2,2], which is unexpected.
It behaves correctly without the loop:

> { m <- matrix(1:4, 2) ; z <- 0; m[z <- z + 1,z <- z + 1] <- 99 ; m }
     [,1] [,2]
[1,]    1   99
[2,]    2    4

… and without the jit:

> enableJIT(0)
[1] 3
> { m <- matrix(1:4, 2) ; z <- 0; for(i in 1) { m[z <- z + 1,z <- z + 1] <- 99; } ; m }
     [,1] [,2]
[1,]    1   99
[2,]    2    4

I checked with "R Under development (unstable) (2017-08-23 r73116)", and the
problem is still there.
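
The comparison above can be sketched as a small self-checking script. The
wrapper name update_in_loop is hypothetical; cmpfun() and enableJIT() are
from the compiler package. The JIT is switched off temporarily so that the
first call is certain to run in the interpreter.

```r
library(compiler)

# Hypothetical wrapper around the loop from the report.
update_in_loop <- function() {
  m <- matrix(1:4, 2)
  z <- 0
  for (i in 1) {
    m[z <- z + 1, z <- z + 1] <- 99
  }
  m
}

old_jit <- enableJIT(0)                  # returns the previous JIT level
interpreted <- update_in_loop()          # evaluated by the interpreter
compiled <- cmpfun(update_in_loop)()     # explicitly byte-compiled copy
enableJIT(old_jit)                       # restore the previous JIT level

identical(interpreted, compiled)
```

A result of FALSE would indicate that the byte-code compiler and the
interpreter disagree on this function, as described in the report.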

- Lukas

Re: [Rd] loop compilation problem

2017-08-24 Thread luke-tierney

Thanks.

Here is a simplified version:

library(compiler)
zero <- 0
one <- 1
expr <- quote((z <- zero + one) + (z <- z + 1))
eval(compiler::compile(expr))

Will fix shortly.
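
A sketch of how the simplified expression can be checked against the
interpreter, assuming nothing beyond the compiler package. Evaluated left to
right, the first operand is 1 (and sets z to 1) and the second is 2, so the
interpreter returns 3; a compiled result that differs points to the problem.

```r
library(compiler)

zero <- 0
one <- 1
expr <- quote((z <- zero + one) + (z <- z + 1))

interpreted <- eval(expr)            # evaluated by the AST interpreter
compiled <- eval(compile(expr))      # evaluated by the byte-code engine

c(interpreted = interpreted, compiled = compiled)
```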

Best,

luke

On Thu, 24 Aug 2017, Lukas Stadler wrote:


> Hi!
>
> We’ve seen a problem with the compiler in specific cases of matrix updates:
>
> > { m <- matrix(1:4, 2) ; z <- 0; for(i in 1) { m[z <- z + 1,z <- z + 1] <- 99; } ; m }
>      [,1] [,2]
> [1,]    1    3
> [2,]    2   99
>
> Here, it modifies element [2,2], which is unexpected.
> It behaves correctly without the loop:
>
> > { m <- matrix(1:4, 2) ; z <- 0; m[z <- z + 1,z <- z + 1] <- 99 ; m }
>      [,1] [,2]
> [1,]    1   99
> [2,]    2    4
>
> … and without the jit:
>
> > enableJIT(0)
> [1] 3
> > { m <- matrix(1:4, 2) ; z <- 0; for(i in 1) { m[z <- z + 1,z <- z + 1] <- 99; } ; m }
>      [,1] [,2]
> [1,]    1   99
> [2,]    2    4
>
> I checked with "R Under development (unstable) (2017-08-23 r73116)", and the
> problem is still there.
>
> - Lukas

--
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:   luke-tier...@uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu

[Rd] Bug in tools::toTitleCase

2017-08-24 Thread Carl Ganz
Hello,

I believe there is a bug in tools::toTitleCase, because it converts NAs
into the string "NA".

tools::toTitleCase(NA_character_)

The issue appears to be with the C function splitString since this also
returns "NA":

.Call('C_splitString', NA_character_, " -/\"()\n", PACKAGE = "tools")
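
Until this is addressed, a caller can keep missing values out of the
function altogether. The following is a minimal sketch of such a
workaround; the wrapper name to_title_case_keep_na is hypothetical and is
not part of the tools package.

```r
# Hypothetical wrapper: apply tools::toTitleCase() to the non-missing
# elements only, so that NA stays NA_character_.
to_title_case_keep_na <- function(x) {
  stopifnot(is.character(x))
  out <- x
  ok <- !is.na(x)
  if (any(ok)) {
    out[ok] <- tools::toTitleCase(x[ok])
  }
  out
}

res <- to_title_case_keep_na(c("hello world", NA))
is.na(res)
```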

Kind Regards,
Carl Ganz
