>>>>> Gabriel Becker <[email protected]>
>>>>> on Tue, 24 May 2016 10:30:48 -0700 writes:
> On Tue, May 24, 2016 at 9:30 AM, Jeroen Ooms <[email protected]>
> wrote:
>> On Tue, May 24, 2016 at 5:59 PM, Gabriel Becker <[email protected]>
>> wrote:
>> > Shouldn't Rf_mkString(NULL) return (the c-level equivalent of)
>> character()
>> > rather than the NA_character_?
>>
>> No. It should still be safe to assume that mkString() always returns a
>> character vector of exactly length one. Anything else could lead to
>> type errors.
>>
> Well the thing is you're passing an invalid pointer, that doesn't point to
> a C string, to a constructor expecting a valid const char *. I'm fine with
> the contract being that mkString always returns a character vector of
> length one, but that doesn't necessarily mean that the function needs to
> accept NULL pointers. The contract as I understand it is that if you give
> it a C string, it will create a CHARSXP for that string. In this light,
> Bill's suggestion that it throw an error seems the most principled
> response. I would think you would need to at the very least emit a
warning.
I agree with Jerooen that mkChar() and mkString() may be
used in contexts where they can end up with a NULL and hence
should not segfault... and hence am willing the extra (very
small) penalty of checking for NULL.
>>
>> > An empty string and NULL aren't the same.
>>
>> Exactly! So if you pass in an empty C string, you get an empty R
>> string, and if you pass in a null pointer you get NA.
>>
>> Rf_mkString(NULL) <--> NA
>> Rf_mkString("") <--> ""
>>
>> There is no ambiguity, and much better than segfaulting.
Better than segfaulting, yes, but really agree with Bill (and
Gabe), also for Rf_mkChar(NULL):
I think both functions should give an error in such a case
rather than returning NA_character_
It is an accident of some kind if they got NULL, no?
--
Martin Maechler,
ETH Zurich
> Well, better than segfaulting is not really relevant here. No one is
> arguing that it should segfault. The question is what behavior it should
> have when it doesn't segfault.
> It's true that a C empty string is not the same as NULL, but NULL isn't
the
> same as NA either. Semantically, for your use-case (which I gather arose
> from interactions we had :) ) the NULL means there is no version, while NA
> indicates there is a version but we don't know what it is. Imagine an
> object class that represents a persons name (first, middle, last). Now
take
> two people, One has no middle name (and we know that when creating the
> object) and another for whom we don't have any information about the
middle
> name, only first and last were reported. I would expect the first one to
> have middle name either NULL or (in a data.frame context) "", while the
> second would have NA_character_. In this light, mkString should arguably
> generate "". i don't think the fact that there is another way to get "" is
> a particularly large problem.
> On the other hand, and in support of your position it came up as Michael
> Lawrence and I were talking about this that asChar from utils.c will give
> you NA_STRING when you give it R_NilValue. That is a coercion though,
> whereas arguably mkString is not. That said, consistency would probably be
> good.
> ~G
> --
> Gabriel Becker, PhD
> Associate Scientist (Bioinformatics)
> Genentech Research
> [[alternative HTML version deleted]]
> ______________________________________________
> [email protected] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel