Hi Jon,

* On 2008-04-28 at 11:00 +0100 Jon Clayden wrote:
> A piece of my code that uses readBin() to read a certain file type is
> behaving strangely with R 2.7.0. This seems to be because of a failure
> to match() strings after using rawToChar() when the original was
> terminated with a "\0" character. Direct equality testing with ==
> still works as expected. I can reproduce this as follows:
>
> > x <- "foo"
> > y <- c(charToRaw("foo"),as.raw(0))
> > z <- rawToChar(y)
> > z==x
> [1] TRUE
> > z=="foo"
> [1] TRUE
> > z %in% c("foo","bar")
> [1] FALSE
> > z %in% c("foo","bar","foo\0")
> [1] FALSE
>
> But without the nul character it works fine:
>
> > zz <- rawToChar(charToRaw("foo"))
> > zz %in% c("foo","bar")
> [1] TRUE
>
> I don't see anything about this in the latest NEWS, but is this
> expected behaviour? Or is it, as I suspect, a bug? This seems to be
> new to R 2.7.0, as I said.

The short answer is that your example works in R-2.6 and in the current
R-devel. Whether the behavior in R-2.7 is a bug is perhaps in the eye of
the beholder.

Historically, R's internal string representation allowed for embedded
nul characters. This was particularly useful before the raw vector
type, RAWSXP, was introduced. Since the vast majority of R's internal
string processing functions use standard C semantics and truncate at
the first nul, there has always been some room for "interesting"
behavior. The change in R-2.7 was an attempt to start resolving these
inconsistencies.

Since then, the core team has agreed to remove the partial support for
embedded nul in character strings. Raw vectors can be used when
embedded nuls are needed, and having nul-terminated strings will make
the code more consistent and easier to maintain going forward.

Best Wishes,

+ seth

-- 
Seth Falcon | http://userprimary.net/user/

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel