Re: [Rd] warning for inefficiently compressed datasets
On 06.12.2011 23:28, Hervé Pagès wrote:
> Hi,
>
> Recently added to doc/NEWS.Rd:
>
>   'R CMD check' now gives a warning rather than a note if it finds
>   inefficiently compressed datasets. With 'bzip2' and 'xz' compression
>   having been available since R 2.10.0, there is no excuse for not
>   using them.
>
> Why isn't a note enough for this?
>
> Generally speaking, warnings are for things that are dangerous, or
> unsafe, or unportable, or for anything that could potentially cause
> trouble. I don't see how using gzip instead of bzip2 or xz could fall
> into that category (and BTW gzip is the default for save() and for the
> 'R CMD build' resave-data feature).
>
> The problem is that bzip2 and xz compression are slower and also
> require more memory than gzip. Bioconductor has big data packages and
> sometimes it makes sense to use gzip and not bzip2 or xz. For example,
> when loading Human chromosome 1 from disk, bzip2 and xz are 7 and 3.4
> times slower than gzip, respectively:
>
>   > system.time(load("chr1-gzip.rda"))
>      user  system elapsed
>     1.210   0.180   1.384
>   > system.time(load("chr1-bzip2.rda"))
>      user  system elapsed
>     9.500   0.160   9.674
>   > system.time(load("chr1-xz.rda"))
>      user  system elapsed
>      4.46    0.20    4.69
>
>   hpages@latitude:~/testing$ ls -lhtr chr1-*.rda
>   -rw-r--r-- 1 hpages hpages 61M 2011-12-06 12:13 chr1-gzip.rda
>   -rw-r--r-- 1 hpages hpages 55M 2011-12-06 12:15 chr1-bzip2.rda
>   -rw-r--r-- 1 hpages hpages 49M 2011-12-06 12:25 chr1-xz.rda
>
> This is with R-2.14.0 on a 64-bit Ubuntu laptop with 8GB of RAM.
>
> The size on disk doesn't really matter, and it doesn't matter either
> that the source tarball for the full Human genome ends up being 20%
> bigger when using gzip instead of xz: the 20% extra time it takes to
> download it (which needs to be done only once) will largely be
> compensated by the fact that most analyses will run faster, e.g. in
> 40-45 sec. instead of more than 2 minutes (for many short analyses,
> loading the chromosomes into memory is the bottleneck).

Oh, from a European side this 20% extra time may be an hour when
downloading from the BioC master rather than a mirror. And space and
traffic are an issue for CRAN.

> Is there a way to turn this warning off? If not, could an option be
> added to 'R CMD check' to turn this warning off? Something along the
> lines of the --no-resave-data option for 'R CMD build'.

The manual tells us:

"The following environment variables can be used to customize the
operation of check: a convenient place to set these is the file
‘~/.R/check.Renviron’.
[...]
_R_CHECK_COMPACT_DATA2_
    If true, check data for ascii and uncompressed saves, and also check
    if using bzip2 or xz compression would be significantly better.
    Implies _R_CHECK_COMPACT_DATA_ is true. Default: true."

Uwe

> Thanks,
> H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
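For anyone who wants to measure the size versus load-time trade-off discussed in this message on their own data, here is a minimal sketch. The object `x` and the file names are made up for illustration; `save(compress = )` and `tools::checkRdaFiles()` are standard R functions. Timings on such a small object will not resemble the chromosome figures quoted above.

```r
# Hypothetical stand-in for a large dataset; 'x' and the file names are
# illustrative only.
set.seed(1)
x <- sample(c("A", "C", "G", "T"), 1e5, replace = TRUE)

methods <- c("gzip", "bzip2", "xz")
paths <- file.path(tempdir(), paste0("x-", methods, ".rda"))

# Save the same object once per compression method.
for (i in seq_along(methods)) {
  save(x, file = paths[i], compress = methods[i])
}

# Report the size on disk and the detected compression of each file.
info <- tools::checkRdaFiles(paths)
print(info[, c("size", "compress")])

# Time loading each file into a throwaway environment.
load_times <- sapply(paths, function(p) {
  system.time(load(p, envir = new.env()))[["elapsed"]]
})
print(load_times)
```

If gzip turns out to be the right choice for a package, the variable quoted from the manual can be set to a false value in ‘~/.R/check.Renviron’ so that the bzip2/xz comparison is skipped.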
[Rd] Unable to collate and parse R files with R CMD check
Hi all,

I'm trying to build a package on Windows 7 (64 bit). R CMD build worked
fine and produced pkg.tar.gz with no errors, but when I run R CMD check
everything turns out OK except for this error:

"checking whether package 'pkg' can be installed ... ERROR.
Installation failed."

It then refers me to the file "00install.out", which reads as follows:

* installing *source* package 'pkg' ...
** R
Error in parse(outFile) : 94:0: unexpected end of input
92:
93: dat
       ^
ERROR: unable to collate and parse R files for package 'pkg'
* removing 'C:/PROGRA~1/R/R-214~1.0/bin/x64/PKG~1.RCH/pkg'

I have no idea what this means and I was not able to find any similar
errors online. Any help would be appreciated.

Thanks in advance,
Sumukh
[Rd] Graphics device hook to manipulate plotmath
Is there a hook that allows a graphics device to apply transformations
to plotmath expressions *before* they are rendered? If there isn't one
yet, would it be feasible to add one?

The motivation for this hook is graphics devices that feed into
something that already has support for math layout, such as the
tikzDevice package (which has TeX downstream). Given

    text(x, y, expression(alpha+beta+gamma+delta))

it would be ideal (in terms of output quality) if tikzDevice could
process that as if

    text(x, y, "$\\alpha+\\beta+\\gamma+\\delta$")

had been written instead. This would also be easier to *implement*,
from the device side, than a back-conversion from Adobe-Symbol glyph
requests to TeX math symbol macros.

(Users of tikzDevice can of course write all their TeX math expressions
directly, but this may be a great deal of conversion work, and is also
inconvenient for someone tweaking their plots in one of the interactive
graphics devices before saving them permanently.)

Thanks in advance,
zw

p.s. I am not subscribed to this list, please cc: me on replies.
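To make the requested transformation concrete, here is a rough sketch of what such a conversion could look like at the R level. The function `plotmath_to_tex` is hypothetical: it is not part of R or of tikzDevice, it is not a device hook, and it handles only symbols, numbers and binary `+`/`-`, which is just enough for the example in the message.

```r
# Hypothetical helper, not part of R or tikzDevice: translate a very small
# subset of plotmath (symbols, numbers, binary + and -) into a TeX string.
greek <- c("alpha", "beta", "gamma", "delta")

plotmath_to_tex <- function(e) {
  if (is.expression(e)) e <- e[[1]]
  if (is.name(e)) {
    s <- as.character(e)
    # Greek symbol names become TeX macros; other names pass through.
    if (s %in% greek) paste0("\\", s) else s
  } else if (is.numeric(e)) {
    format(e)
  } else if (is.call(e) && length(e) == 3 && is.name(e[[1]]) &&
             as.character(e[[1]]) %in% c("+", "-")) {
    # Binary operator: convert both operands recursively.
    paste0(plotmath_to_tex(e[[2]]), as.character(e[[1]]),
           plotmath_to_tex(e[[3]]))
  } else {
    stop("unsupported plotmath construct: ", deparse(e))
  }
}

tex <- paste0("$", plotmath_to_tex(expression(alpha + beta + gamma + delta)), "$")
cat(tex, "\n")
```

A real hook would need to cover the rest of plotmath (fractions, sub- and superscripts, and so on), which is why doing it once, before rendering, is attractive.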
Re: [Rd] Unable to collate and parse R files with R CMD check
On 06.12.2011 20:05, Sumukh Sathnur wrote:
> [...]
> It then refers me to the error file "00install.out" which reads as
> follows:
>
> * installing *source* package 'pkg' ...
> ** R
> Error in parse(outFile) : 94:0: unexpected end of input
> 92:
> 93: dat

Oh, it means R cannot even parse the code. So try to source() each file
in your ./R directory separately. You will find that at least one won't
work.

Uwe Ligges

>        ^
> ERROR: unable to collate and parse R files for package 'pkg'
> * removing 'C:/PROGRA~1/R/R-214~1.0/bin/x64/PKG~1.RCH/pkg'
>
> I have no idea what this means and I was not able to find any similar
> errors online. Any help would be appreciated.
>
> Thanks in advance,
> Sumukh
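The advice above can be automated along these lines. This is only a sketch: it uses parse() instead of source(), so the files are checked for syntax without being executed, and the function name `find_unparsable` and the demo package directory are invented for the example.

```r
# Try to parse every file in a package's R/ directory and collect the
# parser's error message for each file that fails.
# 'find_unparsable' is a hypothetical helper name.
find_unparsable <- function(pkg_dir) {
  files <- list.files(file.path(pkg_dir, "R"),
                      pattern = "\\.[RrSsq]$", full.names = TRUE)
  errors <- character(0)
  for (f in files) {
    msg <- tryCatch({
      parse(file = f)   # syntax check only, nothing is evaluated
      NULL
    }, error = function(e) conditionMessage(e))
    if (!is.null(msg)) errors[f] <- msg
  }
  errors
}

# Demo on a throwaway "package" with one good and one truncated file.
pkg_dir <- file.path(tempdir(), "pkg")
dir.create(file.path(pkg_dir, "R"), recursive = TRUE, showWarnings = FALSE)
writeLines("f <- function(x) x + 1", file.path(pkg_dir, "R", "good.R"))
writeLines("g <- function(x) {",     file.path(pkg_dir, "R", "bad.R"))

problems <- find_unparsable(pkg_dir)
print(problems)
```

An "unexpected end of input" at the last line of a file, as in the report, usually points to an unclosed brace, parenthesis or string somewhere earlier in that file.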
[Rd] bug in rank(), order(), is.unsorted() on character vector
Hi,

This looks OK:

> x <- c("_1_", "1_9", "2_9")
> rank(x)
[1] 1 2 3

But this does not:

> xa <- paste(x, "a", sep="")
> xa
[1] "_1_a" "1_9a" "2_9a"
> rank(xa)
[1] 2 1 3

Cheers,
H.

> sessionInfo()
R version 2.14.0 (2011-10-31)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_CA.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_CA.UTF-8        LC_COLLATE=en_CA.UTF-8
 [5] LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_CA.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_2.14.0

--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319
Re: [Rd] bug in rank(), order(), is.unsorted() on character vector
2011/12/7 Hervé Pagès :
> rank(xa)

See help(Comparison), specifically:

"Beware of making _any_ assumptions about the collation order"

followed by

"Collation of non-letters (spaces, punctuation signs, hyphens,
fractions and so on) is even more problematic."

Barry
Re: [Rd] bug in rank(), order(), is.unsorted() on character vector
@Barry: regardless of whether '_' comes before or after '1', it should
be consistent. Adding an 'a' shouldn't shift '_' from before '1' to
between '1' and '2'; that's clearly an error. The help files are not
stating anything about that. The only thing I can imagine is that '_'
gets ignored (in that case 19a would rank before 1a).

This said, I can't reproduce.

> x <- c("_1_", "1_9", "2_9")
> xa <- paste(x,'a',sep='')
> rank(x)
[1] 1 2 3
> rank(xa)
[1] 1 2 3

> sessionInfo()
R version 2.14.0 Patched (2006-00-00 r0)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252

attached base packages:
[1] grDevices datasets  splines   graphics  stats     tcltk     utils     methods   base

other attached packages:
[1] svSocket_0.9-51 TinnR_1.0.3     R2HTML_2.2      Hmisc_3.8-3     survival_2.36-9

loaded via a namespace (and not attached):
[1] cluster_1.14.1  grid_2.14.0     lattice_0.19-33 svMisc_0.9-63   tools_2.14.0

2011/12/7 Hervé Pagès :
> [...]
> But this does not:
>
>> xa <- paste(x, "a", sep="")
>> xa
> [1] "_1_a" "1_9a" "2_9a"
>> rank(xa)
> [1] 2 1 3
> [...]

--
Joris Meys
Statistical consultant

Ghent University
Faculty of Bioscience Engineering
Department of Mathematical Modelling, Statistics and Bio-Informatics

tel : +32 9 264 59 87
joris.m...@ugent.be
---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
Re: [Rd] bug in rank(), order(), is.unsorted() on character vector
On 07/12/11 15:48, Joris Meys wrote:
> @Barry : regardless of whether '_' comes before or after '1' , it
> should be consistent. Adding an 'a' shouldn't shift '_' from
> before '1' to between '1' and '2', that's clearly an error. The
> help files are not stating anything about that. The only thing I
> can imagine, is that '_' gets ignored (in that case 19a would rank
> before 1a).
>
> This said, I can't reproduce.

I can:

> x <- c("_1_", "1_9", "2_9")
> xa <- paste(x,'a',sep='')
> rank(x)
[1] 1 2 3
> rank(xa)
[1] 2 1 3
> sessionInfo()
R version 2.14.0 (2011-10-31)
Platform: i686-pc-linux-gnu (32-bit)

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

> version
               _
platform       i686-pc-linux-gnu
arch           i686
os             linux-gnu
system         i686, linux-gnu
status
major          2
minor          14.0
year           2011
month          10
day            31
svn rev        57496
language       R
version.string R version 2.14.0 (2011-10-31)

Interesting.

Rainer

--
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation
Biology, UCT), Dipl. Phys. (Germany)

Centre of Excellence for Invasion Biology
Stellenbosch University
South Africa

Tel : +33 - (0)9 53 10 27 44
Cell: +33 - (0)6 85 62 59 98
Fax : +33 - (0)9 58 10 27 44

Fax (D): +49 - (0)3 21 21 25 22 44

email: rai...@krugs.de

Skype: RMkrug
Re: [Rd] bug in rank(), order(), is.unsorted() on character vector
I'm not an expert on locales, but the locales of those who are getting
this behavior and of those who aren't appear to be different (in fact,
all three are slightly different). Isn't sorting order based on the
locale rather than on any internal R code anyway?

~G

On Wed, Dec 7, 2011 at 7:06 AM, Rainer M Krug wrote:
> On 07/12/11 15:48, Joris Meys wrote:
> > [...]
> > This said, I can't reproduce.
>
> I can:
>
> > rank(xa)
> [1] 2 1 3
> [...]
> locale:
> [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C
> [3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
> [...]

--
Gabriel Becker
Graduate Student
Statistics Department
University of California, Davis
[Rd] Possible bug in 'new()' for Reference Classes
Dear list,

I think I stumbled across a little bug with respect to the standard
initialization routine for Reference Classes. It seems that a field
'self' is treated as if its name were '.self' (which we know is
reserved for the self reference of the instantiated object itself) and
thus an error is thrown. If the field value is assigned in an explicit
call after the instantiation via 'new()', everything works just fine:

    setRefClass("ClassInfo",
        fields=list(
            self="character",
            super="character",
            sub="character"
        )
    )
    new("ClassInfo", self="B", super="A", sub="C")  # Error
    x <- new("ClassInfo", super="A", sub="C")
    x
    x$self <- "B"  # Works
    x

Best regards,
Janko
Re: [Rd] bug in rank(), order(), is.unsorted() on character vector
Do this first and try again.

R> Sys.setlocale("LC_COLLATE", "C")

On 12/7/11 3:41 AM, "Hervé Pagès" wrote:
> [...]
> But this does not:
>
>> xa <- paste(x, "a", sep="")
>> xa
> [1] "_1_a" "1_9a" "2_9a"
>> rank(xa)
> [1] 2 1 3
> [...]
> locale:
> [1] LC_CTYPE=en_CA.UTF-8 LC_NUMERIC=C
> [3] LC_TIME=en_CA.UTF-8 LC_COLLATE=en_CA.UTF-8
> [...]
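The suggestion above can be tried as follows. This is a small sketch that switches LC_COLLATE to "C" only temporarily and then restores the previous setting; the variable names are just for illustration. In the C locale strings are compared byte by byte, and in ASCII the digits sort before '_', so both vectors get the same ranks.

```r
x  <- c("_1_", "1_9", "2_9")
xa <- paste(x, "a", sep = "")

# Switch collation to the C locale for the comparison, then restore it.
old <- Sys.getlocale("LC_COLLATE")
Sys.setlocale("LC_COLLATE", "C")
r_x  <- rank(x)
r_xa <- rank(xa)
Sys.setlocale("LC_COLLATE", old)

print(r_x)   # 3 1 2
print(r_xa)  # 3 1 2

# Alternative that does not touch the locale (R >= 3.3.0): the radix
# method of order() always sorts character vectors in C-locale order.
o_xa <- order(xa, method = "radix")
print(o_xa)  # 2 3 1
```

Neither variant gives a "natural" ordering; the point is only that the result no longer depends on the locale in use.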
Re: [Rd] bug in rank(), order(), is.unsorted() on character vector
On Dec 7, 2011, at 15:48 , Joris Meys wrote:

> @Barry : regardless of whether '_' comes before or after '1' , it
> should be consistent. Adding an 'a' shouldn't shift '_' from before
> '1' to between '1' and '2', that's clearly an error. The help files
> are not stating anything about that. The only thing I can imagine, is
> that '_' gets ignored (in that case 19a would rank before 1a).

As far as I remember, that is exactly the case. In some locales, and
not even consistently across different OS versions of the "same"
locale, there are characters that are ignored for collation. With that
in mind, what we see is really not any stranger than "a" < "ab" but
"ac" > "abc".

R just uses what the OS supplies, so if you want to use words like
"inconsistent" or "error", please direct them at those who define the
locales. (And be prepared to realize that you may have kicked a
hornet's nest...)

> This said, I can't reproduce.
> [...]

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk Priv: pda...@gmail.com
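The "ignored characters" explanation can be checked with a rough model: drop the underscores and rank what is left in plain byte order. This is only an illustration of the idea, not how the OS collation routines are actually implemented; the helper names `strip` and `rank_c` are made up here.

```r
x  <- c("_1_", "1_9", "2_9")
xa <- paste(x, "a", sep = "")

# Model "'_' is ignored for collation" by removing it before comparing.
strip <- function(s) gsub("_", "", s, fixed = TRUE)

# Rank in C-locale byte order without changing the session locale:
# order(method = "radix") always uses C order, and order(order(.)) turns
# an ordering into ranks when there are no ties.
rank_c <- function(s) order(order(s, method = "radix"))

strip(x)            # "1"  "19"  "29"
rank_c(strip(x))    # 1 2 3
strip(xa)           # "1a" "19a" "29a"
rank_c(strip(xa))   # 2 1 3
```

With the underscores gone, "19a" sorts before "1a" because '9' precedes 'a', which reproduces the 2 1 3 reported for the en_CA and en_GB locales.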
Re: [Rd] bug in rank(), order(), is.unsorted() on character vector
2011/12/7 Joris Meys :
> @Barry : regardless of whether '_' comes before or after '1' , it
> should be consistent. Adding an 'a' shouldn't shift '_' from before
> '1' to between '1' and '2', that's clearly an error. The help files
> are not stating anything about that.

That's an assumption. The help pages are quite clear about making
assumptions.

The only way this could be a 'bug' is if you can show that the sort
order in R is different from the lexicographic sort order using the
collating sequence of the locale in use. But even my command line
'sort' agrees:

$ sort < f1.txt
_1_
1_9
2_9

now add the trailing a:

$ sort < f1.txt
1_9a
_1_a
2_9a

[ I had a thought maybe it was because _ is sometimes used to break
thousands in numeric formats, but I can't get any obvious consistency
out of that hypothesis ]

Barry
Re: [Rd] bug in rank(), order(), is.unsorted() on character vector
2011/12/7 Barry Rowlingson :
> 2011/12/7 Joris Meys :
>> @Barry : regardless of whether '_' comes before or after '1' , it
>> should be consistent. Adding an 'a' shouldn't shift '_' from before
>> '1' to between '1' and '2', that's clearly an error. The help files
>> are not stating anything about that.
>
> That's an assumption. The help pages are quite clear about making
> assumptions.

I used the word 'error' too quickly. Translate 'error' into 'unexpected
behaviour'. I also see now that assuming all characters are actually
used is an assumption one shouldn't make. But that's not what I
understood from the help text and the examples therein. Thanks for the
clarification.

I sincerely hope though that I can assume the sort order, using the
same locale, is always going to be the same. Otherwise order(x) starts
to look scarily close to sample(seq_along(x))...

Cheers
Joris

--
Joris Meys
Statistical consultant

Ghent University
Faculty of Bioscience Engineering
Department of Mathematical Modelling, Statistics and Bio-Informatics

tel : +32 9 264 59 87
joris.m...@ugent.be
---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
Re: [Rd] Possible bug in 'new()' for Reference Classes
Right, thanks for the catch. Actually, field names "s", "se", "sel"
would also produce the bug. Partial matching of argument names bites
again.

This should be fixed in r-devel and 2.14 patched, as of SVN rev. 57842.

Do try to follow the API in the documentation and use generator objects
for reference classes. It's simpler than using S4 new() and makes it
clear that the example is of a reference class.

John

On 12/7/11 7:36 AM, Janko Thyson wrote:
> Dear list,
>
> I think I stumbled across a little bug with respect to the standard
> initialization routine for Reference Classes. It seems that a field
> 'self' is treated as if it's name would be '.self' (which we know is
> reserved for the self reference of the instantiated object itself)
> and thus an error is thrown.
> [...]
> new("ClassInfo", self="B", super="A", sub="C")# Error
> x <- new("ClassInfo", super="A", sub="C")
> x
> x$self <- "B" # Works
> x
>
> Best regards,
> Janko
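For reference, the generator-object style recommended above looks roughly like this. It is a sketch that reuses the class and field names from the original report and assumes an R version that includes the fix mentioned in this message; on the unpatched 2.14.0 the call with `self =` would presumably fail in the same way as `new()` did.

```r
# setRefClass() returns a generator object; keep it and call its $new()
# method instead of using S4-style new("ClassInfo", ...).
ClassInfo <- setRefClass("ClassInfo",
  fields = list(
    self  = "character",
    super = "character",
    sub   = "character"
  )
)

# Fields are initialised from named arguments.
x <- ClassInfo$new(self = "B", super = "A", sub = "C")
x$self

# Fields can still be modified afterwards, as in the original workaround.
x$self <- "B2"
x$self
```

Keeping the generator in a variable also gives access to `ClassInfo$methods()` and `ClassInfo$fields()` for later inspection or extension.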
Re: [Rd] RcppArmadillo compilation error: R CMD SHLIB returns status 1
On Dec 6, 2011 8:30 AM, "Duncan Murdoch" wrote:
>
> On 05/12/2011 1:22 PM, Paul Viefers wrote:
>>
>> Dear all,
>>
>> running the example by D. Eddelbuettel
>> (http://dirk.eddelbuettel.com/blog/2011/04/23/) I get an error
>> message. Specifically, the R code I was taking from the above example
>> is
>>
>> ### BEGIN EXAMPLE ###
>>
>> suppressMessages(require(RcppArmadillo))
>> suppressMessages(require(Rcpp))
>> suppressMessages(require(inline))
>> code <- '
>>   arma::mat coeff = Rcpp::as<arma::mat>(a);
>>   arma::mat errors = Rcpp::as<arma::mat>(e);
>>   int m = errors.n_rows; int n = errors.n_cols;
>>   arma::mat simdata(m,n);
>>   simdata.row(0) = arma::zeros<arma::mat>(1,n);
>>   for (int row=1; row<m; row++) {
>>     simdata.row(row) = simdata.row(row-1)*trans(coeff)+errors.row(row);
>>   }
>>   return Rcpp::wrap(simdata);
>> '
>> ## create the compiled function
>> rcppSim <- cxxfunction(signature(a="numeric",e="numeric"),
>>                        code, plugin="RcppArmadillo")
>>
>> ### END OF EXAMPLE ###
>>
>> Executing this inside R returned the following:
>>
>> ERROR(s) during compilation: source code errors or compiler
>> configuration errors!
>>
>> Program source:
>>   1:
>>   2: // includes from the plugin
>>   3: #include <RcppArmadillo.h>
>>   4: #include <Rcpp.h>
>>   5:
>>   6:
>>   7: #ifndef BEGIN_RCPP
>>   8: #define BEGIN_RCPP
>>   9: #endif
>>  10:
>>  11: #ifndef END_RCPP
>>  12: #define END_RCPP
>>  13: #endif
>>  14:
>>  15: using namespace Rcpp;
>>  16:
>>  17:
>>  18: // user includes
>>  19:
>>  20:
>>  21: // declarations
>>  22: extern "C" {
>>  23: SEXP file33765791( SEXP a, SEXP e) ;
>>  24: }
>>  25:
>>  26: // definition
>>  27:
>>  28: SEXP file33765791( SEXP a, SEXP e ){
>>  29: BEGIN_RCPP
>>  30:
>>  31:   arma::mat coeff = Rcpp::as<arma::mat>(a);
>>  32:   arma::mat errors = Rcpp::as<arma::mat>(e);
>>  33:   int m = errors.n_rows; int n = errors.n_cols;
>>  34:   arma::mat simdata(m,n);
>>  35:   simdata.row(0) = arma::zeros<arma::mat>(1,n);
>>  36:   for (int row=1; row<m; row++) {
>>  37:     simdata.row(row) = simdata.row(row-1)*trans(coeff)+errors.row(row);
>>  38:   }
>>  39:   return Rcpp::wrap(simdata);
>>  40:
>>  41: END_RCPP
>>  42: }
>>  43:
>>  44:
>> Error in compileCode(f, code, language = language, verbose = verbose) :
>>   Compilation ERROR, function(s)/method(s) not created!
>> Executing command 'C:/PROGRA~1/R/R-214~1.0/bin/i386/R CMD SHLIB
>> file33765791.cpp 2> file33765791.cpp.err.txt' returned status 1
>>
>> I am working under R 2.14.0 and, as the pros among you might guess, I
>> am new to using the C++ interfaces within R. I think all I have to do
>> is to edit some settings on my Windows 7 machine here, but the error
>> message is too cryptic to me. Alas, I could also not find any thread
>> or help topic that deals with this online. I appreciate any direct
>> reply or reference where I can find a solution to this.
>> Please let me know in case I am leaving out some essential details
>> here.
>
> If you put the program source into a file (e.g. fn.cpp) and in a
> Windows cmd shell you run
>
> R CMD SHLIB fn.cpp
>
> what do you get? I would guess you've got a problem with your setup
> of the compiler or other tools, and this would likely show it.

I don't think that will work because you need the appropriate -I option
to get the headers from the RcppArmadillo package. It may be easier to
use the RcppArmadillo.package.skeleton function to create a package.
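If one does want to try R CMD SHLIB by hand, the -I options mentioned above can be assembled from within R. The sketch below rests on some assumptions: the helper `include_flags` is made up for this example, the file name fn.cpp is the one from Duncan's message, and passing the flags through the PKG_CPPFLAGS environment variable is one common way to hand them to R CMD SHLIB. Depending on the Rcpp version, linker flags (PKG_LIBS) may be needed as well.

```r
# Hypothetical helper: build "-I..." flags for the include directories of
# the given installed packages. Packages that are not installed, or that
# ship no include/ directory, are silently skipped.
include_flags <- function(pkgs) {
  dirs <- vapply(pkgs, function(p) system.file("include", package = p),
                 character(1))
  dirs <- dirs[nzchar(dirs)]
  if (length(dirs) == 0) return("")
  paste0("-I", shQuote(dirs), collapse = " ")
}

flags <- include_flags(c("Rcpp", "RcppArmadillo"))
print(flags)

# Assumed usage, not run here: export the flags and compile by hand so
# that the compiler's own error messages become visible.
# Sys.setenv(PKG_CPPFLAGS = flags)
# system("R CMD SHLIB fn.cpp")
```

If `flags` comes back empty, the packages are not installed where this R session can find them, which would itself explain a failed compilation.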