Re: [Rd] sweep sanity checking?
Petr Savicky kindly brought this thread to my attention as I'm afraid it
had passed me by. As one of the contributors to the earlier discussion on
adding warnings to sweep I would like to give my support to Petr's
proposed patch.

For the record I should say that Petr was right to point out that the use
of MARGIN in my examples did not make sense
https://stat.ethz.ch/pipermail/r-devel/2007-July/046487.html
so I have no quibble with that.

I think it is sensible, too, to use the dim attribute of STATS as the
basis of the test, when the dim attribute is present. This provides a way
to control the strength of the test in the case of sweeping out a vector,
as Petr describes in his message below.

I think that the proposed patch successfully brings together the
different views on what should be tested, which was the stumbling block
last time around
https://stat.ethz.ch/pipermail/r-help/2005-June/074037.html

Even if people set check.margin = FALSE for reasons of speed, this in
itself should be a useful check, since they will need to be confident
that the test is unnecessary.

Heather

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Petr Savicky
Sent: 08 August 2007 07:54
To: r-devel@r-project.org
Subject: Re: [Rd] sweep sanity checking?

Thanks to Martin Maechler for his comments, advice and for pointing out
the speed problem. Thanks also to Ben Bolker for tests of speed, which
confirm that for small arrays, a slow down by a factor of about 1.2 - 1.5
may occur. Now, I would like to present a new version of sweep, which is
simpler and has an option to avoid the test. This is expected to be used
in scripts, where the programmer is quite sure that the usage is correct
and speed is required.

The new version differs from the previous one in the following:

1. The option check.margin has a different meaning. It defaults to TRUE
   and it determines whether the test is performed or not.

2. Since check.margin has the meaning above, it cannot be used to select
   which test should be performed. This depends on the type of STATS.
   The suggested sweep function contains two tests:
   - a vector test by Heather Turner, which is used if STATS has no dim
     attribute and, hence, is a vector (STATS should not be anything
     other than a vector or an array)
   - an array test, used if STATS has a dim attribute.
   The vector test allows some kinds of recycling, while the array test
   does not. Hence, in the most common case, where x is a matrix and
   STATS is a vector, if the user would like to be warned when the length
   of the vector is not exactly the right one, the following call is
   suggested: sweep(x,MARGIN,as.array(STATS)). Otherwise, a warning will
   be generated only if length(STATS) does not divide the specified
   dimension of x, which is nrow(x) (MARGIN=1) or ncol(x) (MARGIN=2).

3. If STATS is an array, then the test is more restrictive than in the
   previous version. It is now required that after deleting dimensions
   with one level, the remaining dimensions coincide. The previous
   version additionally allowed the cases where dim(STATS) is a prefix of
   dim(x)[MARGIN], for example, if dim(STATS) = k1 and
   dim(x)[MARGIN] = c(k1,k2).

The code of the tests in the suggested sweep is based on the previous
suggestions
https://stat.ethz.ch/pipermail/r-help/2005-June/073989.html by Robin Hankin
https://stat.ethz.ch/pipermail/r-help/2005-June/074001.html by Heather Turner
https://stat.ethz.ch/pipermail/r-devel/2007-June/046217.html by Ben Bolker
with some further modifications.

The modification of sweep.Rd was prepared by Ben Bolker and me.

I would like to encourage everybody who would like to express an opinion
on the patch to do so now. In my opinion, the suggested new code has
stabilized in the sense that I will not modify it unless there is
negative feedback.

A patch against R-devel_2007-08-06 is attached. It contains tabs. If they
are corrupted by email transfer, use the link
http://www.cs.cas.cz/~savicky/R-devel/patch-sweep
which is an identical copy.

Petr Savicky.

--- R-devel_2007-08-06/src/library/base/R/sweep.R 2007-07-27 17:51:13.0 +0200
+++ R-devel_2007-08-06-sweep/src/library/base/R/sweep.R 2007-08-07 10:30:12.383672960 +0200
@@ -14,10 +14,29 @@
 # A copy of the GNU General Public License is available at
 # http://www.r-project.org/Licenses/

-sweep <- function(x, MARGIN, STATS, FUN = "-", ...)
+sweep <- function(x, MARGIN, STATS, FUN = "-", check.margin=TRUE, ...)
 {
     FUN <- match.fun(FUN)
     dims <- dim(x)
+    if (check.margin) {
+        dimmargin <- dims[MARGIN]
+        dimstats <- dim(STATS)
+        lstats <- length(STATS)
+        if (lstats > prod(dimmargin)) {
+            warning("length of STATS greater than the extent of dim(x)[MARGIN]")
+        } else if (is.null(d
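
The difference between the vector test and the array test described in point 2 above can be tried out with a short sketch. It assumes an R version whose sweep() accepts the check.margin argument and behaves as the message describes; the helper warns() is a hypothetical name introduced only for this illustration.

```r
# A 4 x 3 matrix; MARGIN = 1 sweeps along the 4 rows.
x <- matrix(1:12, nrow = 4)

# Hypothetical helper (illustration only): TRUE if evaluating `expr`
# signals a warning, FALSE otherwise.
warns <- function(expr) {
  tryCatch({ expr; FALSE }, warning = function(w) TRUE)
}

# Vector test: length 2 divides nrow(x) = 4, so recycling is accepted.
w_vec_ok <- warns(sweep(x, 1, c(10, 20)))

# Vector test: length 3 does not divide 4, so a warning is expected.
w_vec_bad <- warns(sweep(x, 1, c(10, 20, 30)))

# Array test: as.array() gives STATS a dim attribute, so its extent
# must match exactly; length 2 against 4 rows is expected to warn.
w_arr_bad <- warns(sweep(x, 1, as.array(c(10, 20))))

# check.margin = FALSE skips the test, as suggested for scripts where
# the usage is known to be correct and speed matters.
w_off <- warns(sweep(x, 1, c(10, 20, 30), check.margin = FALSE))
```

Wrapping STATS in as.array() is the way the message suggests asking for the stricter check without adding another argument to sweep().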
[Rd] paste() with NAs .. change worth persuing?
Consider this example code

   c1 <- letters[1:7]; c2 <- LETTERS[1:7]
   c1[2] <- c2[3:4] <- NA
   rbind(c1,c2)

   ##    [,1] [,2] [,3] [,4] [,5] [,6] [,7]
   ## c1 "a"  NA   "c"  "d"  "e"  "f"  "g"
   ## c2 "A"  "B"  NA   NA   "E"  "F"  "G"

   paste(c1,c2)

   ## -> [1] "a A"  "NA B" "c NA" "d NA" "e E"  "f F"  "g G"

where a more logical result would have entries 2:4 equal to
      NA
i.e., as.character(NA)
aka   NA_character_

Is this worth pursuing, or does anyone see why not?

Regards,
Martin

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
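
To make the two behaviours concrete, here is a minimal sketch that contrasts what paste() returns today with the NA-propagating result described above. The helper paste_na() is hypothetical and not part of base R; it only handles two vectors of equal length and is not a proposal for how paste() itself would be changed.

```r
c1 <- letters[1:7]; c2 <- LETTERS[1:7]
c1[2] <- c2[3:4] <- NA

# Current behaviour: NA is converted to the string "NA" before pasting.
current <- paste(c1, c2)

# Hypothetical helper (not part of base R): paste two equal-length
# vectors, then set every element to NA_character_ where either input
# was NA.
paste_na <- function(a, b, sep = " ") {
  out <- paste(a, b, sep = sep)
  out[is.na(a) | is.na(b)] <- NA_character_
  out
}

# Entries 2:4 become NA, the remaining entries are pasted as usual.
proposed <- paste_na(c1, c2)
```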
Re: [Rd] paste() with NAs .. change worth persuing?
On 8/22/2007 11:50 AM, Martin Maechler wrote:
> Consider this example code
>
>    c1 <- letters[1:7]; c2 <- LETTERS[1:7]
>    c1[2] <- c2[3:4] <- NA
>    rbind(c1,c2)
>
>    ##    [,1] [,2] [,3] [,4] [,5] [,6] [,7]
>    ## c1 "a"  NA   "c"  "d"  "e"  "f"  "g"
>    ## c2 "A"  "B"  NA   NA   "E"  "F"  "G"
>
>    paste(c1,c2)
>
>    ## -> [1] "a A"  "NA B" "c NA" "d NA" "e E"  "f F"  "g G"
>
> where a more logical result would have entries 2:4 equal to
>       NA
> i.e., as.character(NA)
> aka   NA_character_
>
> Is this worth pursuing, or does anyone see why not?

A fairly common use of paste is to put together reports for human
consumption. Currently we have

> p <- as.character(NA)
> paste("the value of p is", p)
[1] "the value of p is NA"

which looks reasonable. Would this become

> p <- as.character(NA)
> paste("the value of p is", p)
[1] NA

under your proposal? (In a quick search I was unable to find a real
example where this would happen, but it would worry me...)

Duncan Murdoch
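
For report-style code of the kind described above, one way to avoid depending on how paste() treats NA is to decide on the text for a missing value before pasting. This is only a sketch using base R; the variable names p_txt and msg are introduced here for illustration.

```r
p <- as.character(NA)

# Make the rendering of a missing value explicit before pasting, so the
# message does not depend on how paste() itself handles NA.
p_txt <- if (is.na(p)) "NA" else p

msg <- paste("the value of p is", p_txt)
```

Written this way, the report line stays readable whether or not NA propagation were ever adopted, at the cost of one extra line per value that may be missing.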
Re: [Rd] compiling R under cygwin
>> For various reasons,

> I think it is only courteous to mention some good reasons if you want
> to take up people's time.

Some of the reasons we would like a cygwin version aren't necessarily
good reasons. We have been using cygwin for some time, mostly to deal
with scripting in a combined windows/unix environment. We have a setup
which allows windows users to run many scripts in the same way as unix
users. These scripts are often python or shell scripts. We have R
installed on the unix machines, and the system administrators would like
to be able to have R on windows in the same environment. This setup also
means that the administrator can fairly easily maintain the version of
software used on all users' machines. Probably this could all be managed
and still use the native windows version of R, but the administrator is
familiar with cygwin and they could manage this software in the same way
they manage other packages.

We would like to be able to use linux machines on PCs but unfortunately
we have restrictions imposed on us that prevent this. This restriction
also goes as far as the use of virtual machines. My personal preference
would be to run linux on my work PC, and use a virtual machine to run
windows software, such as ArcGIS and Imagine, that is not available for
linux. This does not seem to be an option for us.

One thing I was interested in was knowing if there are others who also
would like a cygwin version. From the replies to my post, and from a
search of the mailing list archive, I think that there is little demand
for this. We would, however, be prepared to help in some way for the few
people who are interested.
Robert Denham
Environmental Statistician
Remote Sensing Centre
Telephone 07 3896 9899
www.nrw.qld.gov.au

Department of Natural Resources & Water
QScape Building, 80 Meiers Road, Indooroopilly Qld 4068

-----Original Message-----
From: Prof Brian Ripley [mailto:[EMAIL PROTECTED]
Sent: Tuesday, 21 August 2007 9:53 PM
To: Duncan Murdoch
Cc: Denham Robert; r-devel@r-project.org
Subject: Re: [Rd] compiling R under cygwin

Yes,

> What is the advantage of building this?

was my question too. If you want a Unix-like version of R on PC hardware
running Windows why not run a Unix-like OS under a virtual machine?

Quite a lot of the details are wrong: using FLIBS, BLAS_LIBS and LIBS as
intended will solve most of the problems. I would use --disable-nls
--disable-mbcs as you don't need them (and in particular you don't
benefit from MBCS support on Windows unless you are in a CJK locale).

Note that 2.5.1 is released and there is unlikely to be a 2.5.2, so any
changes would be made only to R-devel. If there is a convincing case to
tailor a build for Cygwin there we can probably do so rather easily, but
the need for ongoing support would be a worry.

(If platforms are not used and in particular not tested in the
alpha/beta testing phases then the ability to build on them crumbles
away. We seem to be down to regular testers on Linux, Windows, MacOS X,
Solaris and FreeBSD, with some help on AIX after a patch with none.)

On Tue, 21 Aug 2007, Duncan Murdoch wrote:

> Denham Robert wrote:
>> For various reasons,

I think it is only courteous to mention some good reasons if you want to
take up people's time.

>> it suits our workplace to have a cygwin version of R. I am pretty
>> sure that cygwin is still not a supported environment for R, but we
>> have managed to compile R-2.5.1 under cygwin without too many dramas.
>> Our procedure is described below. We still have a few problems
>> compiling libraries without manually changing files from .so to .dll,
>> but it seems ok.
>>
> I would expect other subtle problems as well, because Cygwin is not a
> normal Unix. I don't know whether any of these differences matter to
> R, but some things to look out for are:
>
> - you can't unlink a file while it is open
> - filenames are not case sensitive
> - file permissions have strange defaults (everything is executable)
> - I think the executable format still needs to be Windows format
> - There's no such thing as a ptty
> - You'll probably need X11 for graphics, and will lose support for
>   Windows metafile output (wmf)
>>
>> I was wondering whether this information is likely to be useful to
>> others, and if we should spend any time looking in to ways in which
>> the configure/build/install code could be modified to allow a
>> standard install.
>>
> What is the advantage of building this? I don't think we want to
> support platforms just for the sake of supporting more platforms, but
> if there's a real need for it, that would be different.
>
> Duncan Murdoch
>>
>> Notes on building R under cygwin:
>>
>> export FFLAGS=-O3
>> export CFLAGS=-O3
>> export CXXFLAGS=-O3
>> export OBJCFLAGS=-O3
>> export FCFLAGS=-O3
>> export LDFLAGS='-lblas -lg2c -lintl'
>>
>> export R_OSTYPE=unix
>>
>> ./configure --prefix=/opt/freeware/R/R-2.5.1 \
>> --with-tcl-config=/usr/lib/tclConfi
Re: [Rd] compiling R under cygwin
On Thu, 23 Aug 2007, Denham Robert wrote:

>>> For various reasons,
>> I think it is only courteous to mention some good reasons if you want
> to take up people's time.
>
> Some of the reasons we would like a cygwin version aren't necessarily
> good reasons. We have been using cygwin for sometime, mostly to deal
> with scripting in a combined windows/unix environment. We have a setup
> which allows windows users to run many scripts in the same way as unix
> users. These scripts are often python or shell scripts. We have R
> installed on the unix machines, and the system administrators would like
> to be able to have R on windows in the same environment. This set up
> also means that the administrator can fairly easily maintain the version
> of software used on all user's machines. Probably this could all be
> managed and still use the native windows version of R, but the
> administrator is familiar with cygwin and they could manage this
> software in the same way they manage other packages.

Yes, it could almost certainly be done with Rterm.exe.

The issue I came across was the so-called 'posix file paths' that Cygwin
uses. Most (but not all) Windows programs accept file paths with / as
the path separator, and most (but not all, e.g. tar) Cygwin programs
accept paths of the form c:/path/to/file. So provided you use that as
your format, interworking with Unix and Unix-like shells works fine. It
used to be the case that if you had just one drive C: then Cygwin
programs produced paths of the form /path/to/file that also worked on
Windows. Now they produce /cygdrive/c/path/to/file that works nowhere
else.

In general this is a minor nuisance, but I needed to be able to
cross-build R in an environment where I only have Cygwin-based
cross-compilers, and there the path issues bit me: I needed a version of
R that accepted and returned Cygwin-style paths. So I made the configure
changes necessary to build R under Cygwin, and had it running in 20 mins.
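
The two path styles mentioned above differ only in a fixed prefix, which the following sketch illustrates. The function cygdrive_to_drive() is a hypothetical helper written for this example, not something provided by R or Cygwin, and it only handles the simple /cygdrive/<letter>/ case; Cygwin's own cygpath utility is the more general tool.

```r
# Hypothetical helper: rewrite "/cygdrive/<drive>/rest" as
# "<drive>:/rest". Paths that do not start with /cygdrive/<letter>/
# are returned unchanged.
cygdrive_to_drive <- function(path) {
  sub("^/cygdrive/([A-Za-z])/", "\\1:/", path)
}

paths <- c("/cygdrive/c/path/to/file", "c:/path/to/file", "/usr/lib/R")

# The first path is rewritten to the c:/ form; the other two are left
# alone because they do not carry the /cygdrive/ prefix.
converted <- cygdrive_to_drive(paths)
```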
> We would like to be able to use linux machines on pc's but unfortunately
> we have restrictions imposed on us that prevent this. This restriction
> also goes as far as the use of virtual machines. My personal preference
> would be to run linux on my work pc, and use a virtual machine to run
> windows software, such as ArcGIS and Imagine, that are not available for
> linux. This does not seem to be an option for us.
>
> One thing I was interested in was knowing if there are others who also
> would like a cygwin version. From the replies to my post, and from a
> search of the mailing list archive, I think that there is little demand
> for this. We would, however, be prepared to help in some way for the
> few people who are interested.

As I said earlier, it builds out of the box in R-devel (with suitable
options documented in the R-admin manual). No guarantees that it will
continue to do so unless tested in the alpha/beta phase though. As no
other platform we use nowadays requires that shared objects/dynamic
libraries have all imports satisfied at build time, this is liable to
get broken.

But I would encourage people to use Rterm.exe if it can be made to do
what you need.

>
>
>
> Robert Denham
> Environmental Statistician
> Remote Sensing Centre
> Telephone 07 3896 9899
> www.nrw.qld.gov.au
>
> Department of Natural Resources & Water
> QScape Building, 80 Meiers Road, Indooroopilly Qld 4068
>
> -----Original Message-----
> From: Prof Brian Ripley [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, 21 August 2007 9:53 PM
> To: Duncan Murdoch
> Cc: Denham Robert; r-devel@r-project.org
> Subject: Re: [Rd] compiling R under cygwin
>
> Yes,
>
>> What is the advantage of building this?
>
> was my question too. If you want a Unix-like version of R on PC
> hardware running Windows why not run a Unix-like OS under a virtual
> machine?
>
> Quite a lot of the details are wrong: using FLIBS, BLAS_LIBS and LIBS as
> intended will solve most of the problems. I would use --disable-nls
> --disable-mbcs as you don't need them (and in particular you don't
> benefit from MBCS support on Windows unless you are in a CJK locale).
>
> Note that 2.5.1 is released and there is unlikely to be a 2.5.2, so any
> changes would be made only to R-devel. If there is a convincing case to
> tailor a build for Cygwin there we can probably do so rather easily, but
> the need for ongoing support would be a worry.
>
> (If platforms are not used and in particular not tested in the
> alpha/beta testing phases then the ability to build on them crumbles
> away. We seem to be down to regular testers on Linux, Windows, MacOS
> X, Solaris and FreeBSD, with some help on AIX after a patch with none.)
>
> On Tue, 21 Aug 2007, Duncan Murdoch wrote:
>
>> Denham Robert wrote:
>>> For various reasons,
>
> I think it is only courteous to mention some good reasons if you want to
> take up people's time.
>
>>> it suits our workplace to have a cygwin version of R. I am pretty
>>> sure that