Re: [Rd] Suggestion: mkString(NULL) should be NA
> Gabriel Becker
>     on Tue, 24 May 2016 10:30:48 -0700 writes:

> On Tue, May 24, 2016 at 9:30 AM, Jeroen Ooms wrote:
>> On Tue, May 24, 2016 at 5:59 PM, Gabriel Becker wrote:
>> > Shouldn't Rf_mkString(NULL) return (the C-level equivalent of)
>> > character() rather than the NA_character_?
>>
>> No. It should still be safe to assume that mkString() always returns a
>> character vector of exactly length one. Anything else could lead to
>> type errors.
>
> Well the thing is you're passing an invalid pointer, that doesn't point to
> a C string, to a constructor expecting a valid const char *. I'm fine with
> the contract being that mkString always returns a character vector of
> length one, but that doesn't necessarily mean that the function needs to
> accept NULL pointers. The contract as I understand it is that if you give
> it a C string, it will create a CHARSXP for that string. In this light,
> Bill's suggestion that it throw an error seems the most principled
> response. I would think you would need to at the very least emit a
> warning.

I agree with Jeroen that mkChar() and mkString() may be used in
contexts where they can end up with a NULL and hence should not
segfault... and hence am willing to pay the extra (very small) penalty
of checking for NULL.

>> > An empty string and NULL aren't the same.
>>
>> Exactly! So if you pass in an empty C string, you get an empty R
>> string, and if you pass in a null pointer you get NA.
>>
>> Rf_mkString(NULL) <--> NA
>> Rf_mkString("") <--> ""
>>
>> There is no ambiguity, and much better than segfaulting.

Better than segfaulting, yes, but I really agree with Bill (and
Gabe), also for Rf_mkChar(NULL):
I think both functions should give an error in such a case
rather than returning NA_character_

It is an accident of some kind if they got NULL, no?

--
Martin Maechler, ETH Zurich

> Well, better than segfaulting is not really relevant here. No one is
> arguing that it should segfault. The question is what behavior it should
> have when it doesn't segfault.
>
> It's true that a C empty string is not the same as NULL, but NULL isn't the
> same as NA either. Semantically, for your use-case (which I gather arose
> from interactions we had :) ) the NULL means there is no version, while NA
> indicates there is a version but we don't know what it is. Imagine an
> object class that represents a person's name (first, middle, last). Now take
> two people. One has no middle name (and we know that when creating the
> object) and another for whom we don't have any information about the middle
> name, only first and last were reported. I would expect the first one to
> have middle name either NULL or (in a data.frame context) "", while the
> second would have NA_character_. In this light, mkString should arguably
> generate "". I don't think the fact that there is another way to get "" is
> a particularly large problem.
>
> On the other hand, and in support of your position, it came up as Michael
> Lawrence and I were talking about this that asChar from utils.c will give
> you NA_STRING when you give it R_NilValue. That is a coercion though,
> whereas arguably mkString is not. That said, consistency would probably be
> good.
>
> ~G
>
> --
> Gabriel Becker, PhD
> Associate Scientist (Bioinformatics)
> Genentech Research

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
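The distinction drawn above between "no value", "value unknown" and "empty value" can be made explicit in C. The following is only a minimal sketch: the names field_state, name_field, classify and as_column_text are hypothetical and are not part of the R API. It assumes the caller knows what a NULL pointer means in its own context, since the pointer alone cannot say.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the three cases discussed in the thread. */
enum field_state {
    FIELD_ABSENT,   /* known not to exist (the "no middle name" case) */
    FIELD_UNKNOWN,  /* may exist but was not reported (the NA case)   */
    FIELD_PRESENT   /* a real value, possibly the empty string        */
};

struct name_field {
    enum field_state state;
    const char *value;   /* only meaningful when state == FIELD_PRESENT */
};

/* Classify a raw C pointer. The caller states what NULL means here;
   asking for FIELD_PRESENT on a NULL pointer is treated as unknown. */
static struct name_field classify(const char *raw,
                                  enum field_state meaning_of_null)
{
    struct name_field f;
    if (raw == NULL) {
        f.state = (meaning_of_null == FIELD_PRESENT) ? FIELD_UNKNOWN
                                                     : meaning_of_null;
        f.value = NULL;
    } else {
        f.state = FIELD_PRESENT;
        f.value = raw;
    }
    return f;
}

/* Text one might store in a data-frame-like column:
   "" for absent, "NA" for unknown, the value itself otherwise. */
static const char *as_column_text(struct name_field f)
{
    switch (f.state) {
    case FIELD_ABSENT:  return "";
    case FIELD_UNKNOWN: return "NA";
    default:            return f.value;
    }
}
```

With this shape, the choice between "" and NA is made once, at the call site that knows the data, rather than inside a general-purpose constructor.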
Re: [Rd] configure / make problems with R-devel
>     on Tue, 24 May 2016 15:15:17 -0700 writes:

> Thank you, Martin. I linked to your message in a comment here so maybe
> other people will know about that useful technique:

> http://singmann.org/installing-r-devel-on-linux/#comment-161

> However, when I try it, I get an error:

> $ make
> make[1]: Entering directory '/home/frederik/pkg-tmp/R-svn-build/m4'
> make[1]: Nothing to be done for 'R'.
> make[1]: Leaving directory '/home/frederik/pkg-tmp/R-svn-build/m4'
> make[1]: Entering directory '/home/frederik/pkg-tmp/R-svn-build/tools'
> make[1]: Nothing to be done for 'R'.
> make[1]: Leaving directory '/home/frederik/pkg-tmp/R-svn-build/tools'
> make[1]: Entering directory '/home/frederik/pkg-tmp/R-svn-build/doc'
> /usr/bin/install: cannot stat '../../R-svn/doc/NEWS': No such file or directory
> /usr/bin/install: cannot stat '../../R-svn/doc/NEWS.pdf': No such file or directory
> Makefile:164: recipe for target 'svnonly' failed
> make[1]: *** [svnonly] Error 1
> make[1]: Leaving directory '/home/frederik/pkg-tmp/R-svn-build/doc'
> Makefile:60: recipe for target 'R' failed
> make: *** [R] Error 1

This is strange: Did you accidentally delete the 'non-tarball'
file in your build directory which should have been created by
'configure' ?

I ask because your 'make' seems to be using the 'else' clause in
the 'svnonly' target in the R-svn-build/Makefile
but it should really use the first branch which does install
things in ./doc/ (such as NEWS or NEWS.pdf).

I have never used 'STRIP=true' -- maybe that did remove the
'non-tarball' file ?

Why not rather do it the way I told you, i.e., *with* recommended
packages, and no arguments to 'configure'
(if that does work, you may try variants.. I agree that
--without-recommended-packages should work as well, I just never
use that).

Martin

> I configured like this:

> $ cd ../R-svn-build/
> $ ../R-svn/configure --without-recommended-packages --prefix=$HOME/r-svn-test STRIP=true

> I guess I can try to debug it myself but thought I should report back
> to you. It works when I 'configure' and 'make' in the source
> directory.

> Cheers,

> Frederick

> On Tue, May 24, 2016 at 07:20:18PM +0200, Martin Maechler wrote:
>> > Keith O'Hara
>> >     on Tue, 24 May 2016 12:47:43 -0400 writes:
>>
>> > svn checkout https://svn.r-project.org/R/trunk/
>>
>> yes, indeed. thank you, Keith.
>>
>> and from then on only
>>
>> cd
>> svn up
>>
>> (which is short for 'svn update').
>>
>> Another hint: Then do *not* build in the source directory but in
>> what we called a "build directory"; i.e., something like
>> (from scratch; including the only-once needed "checkout") :
>>
>> svn checkout https://svn.r-project.org/R/trunk/ R
>> cd R
>> tools/rsync-recommended
>> mkdir ../build-R
>> cd ../build-R
>> ../R/configure
>> make
>> make check
>>
>> and I then never run 'make install', but rather use symbolic
>> link from
>> /build-R/bin/R to something like ~/bin/R-devel
>> i.e.,
>> cd ~/bin
>> ln -s /build-R/bin/R R-devel
>>
>> Martin
>>
>> >> On May 24, 2016, at 12:45 PM, frede...@ofb.net wrote:
>> >>
>> >> I agree with Martin's summary of the situation, and with the updated
>> >> NEWS entry.
>> >>
>> >> I'm not familiar with Subversion, can you tell me the command to use?
>> >>
>> >> (I tried "svn co https://svn.r-project.org/R/"; but it seems to be
>> >> downloading all branches)
>> >>
>> >> Frederick
>> >>
>> >> On Tue, May 24, 2016 at 04:30:11PM +0200, Martin Maechler wrote:
>> >>> peter dalgaard
>> >>>     on Tue, 24 May 2016 13:47:27 +0200 writes:
>> >>>
>> >>> >> I had a regression in config.site so the nightly build didn't. Retrying
>> >>> >> Looks like it will build, but the ctl-R, ctl-C bug is still present
>> >>> >> on OSX (w/Simon's libs). This _was_ fixed for a while, was it not?
>> >>>
>> >>> I thought it was never fixed, for readline versions 5.x (or all
>> >>> of readline_version < 6.3 ?) because the patch assumed features
>> >>> not available, e.g., for Frederik (who got compilation errors
>> >>> which I think you confirmed on pre-6 readline).
>> >>>
>> >>> I remember you having two different readlines installed on OSX
>> >>> but the standard Mac binary (from CRAN, i.e. Simon) would use
>> >>> the old readline version ?
>> >>>
>> >>> so that whole resetReadline() solution is now conditionalized inside
>> >>>
>> >>> #if defined(RL_READLINE_VERSION) && RL_READLINE_VERSION >= 0x0603
>> >>> ...
>> >>> ...
>> >>> #endif
>> >>>
>> >>> and hence the previous code (which is buggy) is us
Re: [Rd] Suggestion: mkString(NULL) should be NA
On Wed, May 25, 2016 at 12:31 PM, Martin Maechler wrote:
> Better than segfaulting, yes, but really agree with Bill (and
> Gabe), also for Rf_mkChar(NULL):
> I think both functions should give an error in such a case
> rather than returning NA_character_
>
> It is an accident of some kind if they got NULL, no?

Not necessarily. A char* of NULL can be a string which is not
initialized or simply unavailable due to configuration.

The example from my original email was in the curl package, which exposes
the version string of libz that was used to build libcurl:

    mkString(data->libz_version)

This worked on all platforms that I tested. However, a user found that
if libcurl was configured --without-libz (which is uncommon) the
libz_version string does not get set by libcurl and is always NULL. I
had not foreseen this and it would lead to a segfault.

I think making mkString() return NA for null strings leads to the most
robust behavior. Raising an exception seems a little harsh to me, as
there is no way the user would be able to recover from this, and there
might not be an actual problem at all.
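A caller-side check for an optional version string like the one described above might look like the following. This is a sketch under stated assumptions: struct version_data is a hypothetical stand-in that models only the libz_version field mentioned in the message, and describe_libz is an illustrative helper, not a function from curl or R.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the struct that carries the version strings;
   only the one field used here is modelled. */
struct version_data {
    const char *libz_version;  /* may be NULL, e.g. when built without libz */
};

/* Write a human-readable description into buf (at most n bytes, always
   NUL-terminated when n > 0).
   Returns 1 if a version string was available, 0 if it was not. */
static int describe_libz(const struct version_data *d, char *buf, size_t n)
{
    if (d == NULL || d->libz_version == NULL) {
        snprintf(buf, n, "libz: not available");
        return 0;
    }
    snprintf(buf, n, "libz: %s", d->libz_version);
    return 1;
}
```

The point of the return value is that the caller, which knows that a missing libz is a legitimate configuration, decides what the absence should become, before any string constructor sees the pointer.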
Re: [Rd] Suggestion: mkString(NULL) should be NA
On Wed, May 25, 2016 at 4:23 AM, Jeroen Ooms wrote:
> On Wed, May 25, 2016 at 12:31 PM, Martin Maechler wrote:
> > Better than segfaulting, yes, but really agree with Bill (and
> > Gabe), also for Rf_mkChar(NULL):
> > I think both functions should give an error in such a case
> > rather than returning NA_character_
> >
> > It is an accident of some kind if they got NULL, no?
>
> Not necessarily. A char* of NULL can be a string which is not
> initiated or simply unavailable due to configuration.
>
> The example from my original email was in curl package which exposes
> the version string of libz that was used to build libcurl:
>
>     mkString(data->libz_version)
>
> This worked on all platforms that I tested. However a user found that
> if libcurl was configured --without-libz (which is uncommon) the
> libz_version string does not get set by libcurl and is always NULL. I
> had not foreseen this and it would lead to a segfault.
>
> I think making mkString() return NA for null strings lead to the most
> robust behavior. Raising an exception seems a little harsh to me, as
> there is no way the user would be able to recover from this, and there
> might not be an actual problem at all.

Robust in the sense of no error being thrown, but perhaps only correct by
accident. NULL is not a valid C string --- should functions always return
NA on invalid input? As Gabe mentions, in the cited use case, it's not
clear whether the appropriate value is NA, "", or something else entirely.
Generalization seems risky at this point.
Re: [Rd] Suggestion: mkString(NULL) should be NA
On Wed, May 25, 2016 at 7:22 AM, Michael Lawrence wrote:
> On Wed, May 25, 2016 at 4:23 AM, Jeroen Ooms wrote:

I'm not disagreeing with what's been said in this thread, but I can't help
but recall that I brought up this exact issue probably 15 years ago and was
told (by Brian, I believe) "don't do that" (pass a null pointer), which was
perfectly fine. The real issue was not the behavior but that it was not
documented or consistent. I've lived by the mantra since that you can never
trust a pointer in R code. The user must always check for NULL.

I just wrote my own functions mkXXX_safe that wrap the internals and check
the pointer.

THK

http://www.keittlab.org/
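The wrapper pattern mentioned above can be sketched without the R headers. Everything here is an assumption made for illustration: make_string is a plain-C stand-in for a constructor that, as described in this thread, must not be handed NULL, and make_string_safe is a hypothetical wrapper in the spirit of the mkXXX_safe functions, not their actual source.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for a constructor such as mkString(): it copies a C string
   and must not be given NULL. */
static char *make_string(const char *s)
{
    size_t len = strlen(s);            /* undefined behaviour if s is NULL */
    char *copy = malloc(len + 1);
    if (copy != NULL)
        memcpy(copy, s, len + 1);      /* copies the terminating NUL too */
    return copy;
}

/* Hypothetical "_safe" wrapper: checks the pointer first and reports the
   problem through the return value instead of crashing.
   Returns 0 on success, -1 if s is NULL, -2 if allocation failed.
   On failure *out is set to NULL. */
static int make_string_safe(const char *s, char **out)
{
    *out = NULL;
    if (s == NULL)
        return -1;
    *out = make_string(s);
    return (*out != NULL) ? 0 : -2;
}
```

A wrapper written against the real API would presumably signal an R error or return a chosen default in the NULL branch; which of the two is appropriate is exactly what the thread is debating.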
Re: [Rd] Suggestion: mkString(NULL) should be NA
On Wed, 25 May 2016, Tim Keitt wrote:

> On Wed, May 25, 2016 at 7:22 AM, Michael Lawrence wrote:
>> On Wed, May 25, 2016 at 4:23 AM, Jeroen Ooms wrote:
>
> I'm not disagreeing with what's been said in this thread, but I can't help
> but recall that I brought up this exact issue probably 15 years ago and was
> told (by Brian, I believe) "don't do that" (pass a null pointer), which was
> perfectly fine. The real issue was not the behavior but that it was not
> documented or consistent. I've lived by the mantra since that you can never
> trust a pointer in R code. User must always check for NULL.

In _C_ code. This is true whether you are calling into the R C API or
any other C library: you as the C programmer need to make sure either
that passing NULL is OK or make sure you don't do that.

I wouldn't object to mkXXX checking for NULL and signaling an error
instead of segfaulting, but good C code calling mkXXX should still
typically do its own check and handle the situation in an appropriate
way.

Best,

luke

> I just wrote my own functions mkXXX_safe that wrap the internals and check
> the pointer.
>
> THK
>
> http://www.keittlab.org/

--
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:  319-335-3386
Department of Statistics and        Fax:    319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:  luke-tier...@uiowa.edu
Iowa City, IA 52242                 WWW:    http://www.stat.uiowa.edu
[Rd] odd warning unlinking symlink on Windows
While constructing some tests of symbolic link code in R, I got an odd
warning when trying to remove a symbolic link:

    file.create(tfile <- tempfile())
    #[1] TRUE
    file.symlink(tfile, tlink <- tempfile())
    #[1] TRUE
    unlink(tlink)
    #Warning message:
    #In unlink(tlink) :
    #  cannot delete reparse point 'C:\Users\wdunlap\AppData\Local\Temp\Rtmp0oB1gl\fileedc792515a3', reason 'There is a mismatch between the tag specified in the request and the tag present in the reparse point.
    #'
    file.exists(tlink)
    #[1] TRUE

I can remove the symbolic link, without any warnings, if it is in a
directory that I remove with unlink(recursive=TRUE):

    dir.create(tdir <- tempfile(fileext=".dir"))
    file.create(tfile <- file.path(tdir, "file"))
    #[1] TRUE
    file.symlink(tfile, tlink <- file.path(tdir, "symlinkToFile"))
    #[1] TRUE
    dir(tdir)
    #[1] "file" "symlinkToFile"
    print(unlink(tdir, recursive=TRUE))
    #[1] 0
    file.exists(tdir)
    #[1] FALSE
    file.exists(tlink)
    #[1] FALSE

(I didn't know symlinks were even possible on Windows, but they are in
Windows 7. Sys.readlink() does nothing useful on Windows.)

> sessionInfo()
R version 3.3.0 (2016-05-03)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] assertionTest_0.5

Bill Dunlap
TIBCO Software
wdunlap tibco.com
Re: [Rd] Suggestion: mkString(NULL) should be NA
http://www.keittlab.org/

On Wed, May 25, 2016 at 10:43 AM, wrote:

> On Wed, 25 May 2016, Tim Keitt wrote:
>
>> On Wed, May 25, 2016 at 7:22 AM, Michael Lawrence
>> <lawrence.mich...@gene.com> wrote:
>>> On Wed, May 25, 2016 at 4:23 AM, Jeroen Ooms wrote:
>>
>> I'm not disagreeing with what's been said in this thread, but I can't help
>> but recall that I brought up this exact issue probably 15 years ago and was
>> told (by Brian, I believe) "don't do that" (pass a null pointer), which was
>> perfectly fine. The real issue was not the behavior but that it was not
>> documented or consistent. I've lived by the mantra since that you can never
>> trust a pointer in R code. User must always check for NULL.
>
> In _C_ code. This is true whether you are calling into the R C API or
> any other C library: you as the C programmer need to make sure either
> that passing NULL is OK or make sure you don't do that.

I agree -- I meant it was "perfectly fine" to remind us we need to check
pointers. It's really a documentation issue.

THK

> I wouldn't object to mkXXX checking for NULL and signaling an error
> instead of segfaulting, but good C code calling mkXXX should still
> typically do its own check and handle the situation in an appropriate
> way.
>
> Best,
>
> luke
>
>> I just wrote my own functions mkXXX_safe that wrap the internals and check
>> the pointer.
>>
>> THK
>>
>> http://www.keittlab.org/
>
> --
> Luke Tierney
> Ralph E. Wareham Professor of Mathematical Sciences
> University of Iowa                  Phone:  319-335-3386
> Department of Statistics and        Fax:    319-335-3017
>    Actuarial Science
> 241 Schaeffer Hall                  email:  luke-tier...@uiowa.edu
> Iowa City, IA 52242                 WWW:    http://www.stat.uiowa.edu
Re: [Rd] configure / make problems with R-devel
Hi Martin,

Thanks for the Makefile clue. The file 'non-tarball' was present in
the source directory, but not in the build directory (should the
Makefile be checking for 'non-tarball' in the source directory
instead?). However, 'doc/FAQ' was present in the source directory, so
the first clause of '||' failed.

    svnonly:
        @if test ! -f "$(srcdir)/doc/FAQ" || test -f non-tarball ; then \
        ...

It works if I remove 'doc/FAQ'. Incidentally, this file is created by
'make' but not removed by 'make clean'.

So the problem is that I accidentally ran 'make' in the source
directory before taking your advice. I ran 'make clean' out of habit,
not even realizing that 'make' in the source directory would break
your separate-build-directory technique.

STRIP=true is highly counterintuitive, but I'm surprised you don't
recognise it - it just replaces the 'strip' command with 'true' so
that my installed binaries get to keep their debugging symbols. Maybe
there is a better way of doing it - I guess you use the binaries in
place so perhaps that is one solution. I'm not sure where I came up
with STRIP=true.

In case anyone skimmed, I think there may be two "action items" here:

1. in Makefile, change

    @if test ! -f "$(srcdir)/doc/FAQ" || test -f non-tarball ; then \

   to

    @if test ! -f "$(srcdir)/doc/FAQ" || test -f "$(srcdir)/non-tarball" ; then \

   (I haven't tested it)

2. Figure out how to remove doc/FAQ with 'make clean' (if it makes
   sense to do so)

Thanks,

Frederick

On Wed, May 25, 2016 at 12:46:53PM +0200, Martin Maechler wrote:
> >     on Tue, 24 May 2016 15:15:17 -0700 writes:
>
> > Thank you, Martin. I linked to your message in a comment here so maybe
> > other people will know about that useful technique:
>
> > http://singmann.org/installing-r-devel-on-linux/#comment-161
>
> > However, when I try it, I get an error:
>
> > $ make
> > make[1]: Entering directory '/home/frederik/pkg-tmp/R-svn-build/m4'
> > make[1]: Nothing to be done for 'R'.
Re: [Rd] Suggestion: mkString(NULL) should be NA
Hi,

I tend to agree with the objections expressed earlier. I would only
add that making the NULL pointer semantically equivalent to NA would
introduce a precedent that could lead to some confusion. For example
it would set the expectation that CHAR(Rf_mkChar(NULL)) is NULL, which
is not the case AFAIK. Or that low-level string manipulation utilities
that take a C-string as input (e.g. Rf_reEnc()) accept NULL and
propagate it. Of course these things can be modified to be consistent
with the new "NULL pointer == NA" paradigm but this might end up being
a bigger move than what it seems at first sight...

Cheers,
H.

On 05/25/2016 08:43 AM, luke-tier...@uiowa.edu wrote:
> On Wed, 25 May 2016, Tim Keitt wrote:
>
>> On Wed, May 25, 2016 at 7:22 AM, Michael Lawrence wrote:
>>> On Wed, May 25, 2016 at 4:23 AM, Jeroen Ooms wrote:
>>
>> I'm not disagreeing with what's been said in this thread, but I can't help
>> but recall that I brought up this exact issue probably 15 years ago and was
>> told (by Brian, I believe) "don't do that" (pass a null pointer), which was
>> perfectly fine. The real issue was not the behavior but that it was not
>> documented or consistent. I've lived by the mantra since that you can never
>> trust a pointer in R code. User must always check for NULL.
>
> In _C_ code. This is true whether you are calling into the R C API or
> any other C library: you as the C programmer need to make sure either
> that passing NULL is OK or make sure you don't do that.
>
> I wouldn't object to mkXXX checking for NULL and signaling an error
> instead of segfaulting, but good C code calling mkXXX should still
> typically do its own check and handle the situation in an appropriate
> way.
>
> Best,
>
> luke
>
>> I just wrote my own functions mkXXX_safe that wrap the internals and check
>> the pointer.
>>
>> THK
>>
>> http://www.keittlab.org/

--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319
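The round-trip concern raised above can be illustrated with a toy model. This is not R's actual CHARSXP layout: toy_str, TOY_NA, toy_make, toy_chars and toy_is_na are hypothetical names, and the only assumption borrowed from R is that the missing string is a distinct sentinel object whose characters read "NA". Under that assumption, a constructor that maps NULL to the sentinel does not give NULL back when the characters are read out again.

```c
#include <stddef.h>
#include <string.h>

/* Toy model of a string object; not R's real representation. */
struct toy_str {
    const char *chars;
};

/* A single sentinel object plays the role of the missing value. */
static const struct toy_str TOY_NA = { "NA" };

/* Constructor that maps NULL to the sentinel (the proposal in the thread).
   For non-NULL input the caller-provided slot is filled and returned. */
static const struct toy_str *toy_make(struct toy_str *slot, const char *s)
{
    if (s == NULL)
        return &TOY_NA;
    slot->chars = s;
    return slot;
}

/* Accessor in the role of CHAR(): always returns a valid C string. */
static const char *toy_chars(const struct toy_str *x)
{
    return x->chars;
}

/* Missingness is a matter of object identity, not of the text. */
static int toy_is_na(const struct toy_str *x)
{
    return x == &TOY_NA;
}
```

In this model toy_chars(toy_make(&slot, NULL)) yields the text "NA" rather than NULL, and the literal string "NA" is not missing, so code downstream cannot recover the original NULL from the characters alone.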