Hello everyone,

I've run into a problem with using paste() inside apply() that I
can't seem to solve.

Briefly, if I'm using paste() to collapse the rows of a data frame, AND
the data frame contains strings with spaces, AND there are NA values in
subsequent columns, then the pasted result contains extra spaces that
are not in the data. This only happens with that particular combination
of data values and commands. I have a workaround (replacing NA with
"NA"), but this seems odd.

Thanks for any thoughts,
Sarah


R --vanilla
# R version 2.9.0 (2009-04-17)
# Fedora Core 10

> test1 <- data.frame(A = rep(1, 5), B = rep("a", 5), C = rep("a b", 5),
+                     D = rep(2, 5), stringsAsFactors=FALSE)
>
> # has an NA value in a column before the column containing strings with spaces
> test2 <- test1
> test2$B[4] <- NA
>
> # has an NA value in a column after the column containing strings with spaces
> test3 <- test1
> test3$D[4] <- NA

> str(test1)
'data.frame':   5 obs. of  4 variables:
 $ A: num  1 1 1 1 1
 $ B: chr  "a" "a" "a" "a" ...
 $ C: chr  "a b" "a b" "a b" "a b" ...
 $ D: num  2 2 2 2 2
> str(test2)
'data.frame':   5 obs. of  4 variables:
 $ A: num  1 1 1 1 1
 $ B: chr  "a" "a" "a" NA ...
 $ C: chr  "a b" "a b" "a b" "a b" ...
 $ D: num  2 2 2 2 2
> str(test3)
'data.frame':   5 obs. of  4 variables:
 $ A: num  1 1 1 1 1
 $ B: chr  "a" "a" "a" "a" ...
 $ C: chr  "a b" "a b" "a b" "a b" ...
 $ D: num  2 2 2 NA 2

> # works as expected
> apply(test1, 1, paste, collapse=",")
[1] "1,a,a b,2" "1,a,a b,2" "1,a,a b,2" "1,a,a b,2" "1,a,a b,2"

> # works as expected
> # does NOT add spaces to the column with the NA value
> apply(test2, 1, paste, collapse=",")
[1] "1,a,a b,2"  "1,a,a b,2"  "1,a,a b,2"  "1,NA,a b,2" "1,a,a b,2"

> # adds a leading space to the non-NA values in the column with the NA value
> # only if that column is after a column that contains strings with spaces
> apply(test3, 1, paste, collapse=",")
[1] "1,a,a b, 2" "1,a,a b, 2" "1,a,a b, 2" "1,a,a b,NA" "1,a,a b, 2"

> # pasting the columns together manually works as expected
> paste(test3$A, test3$B, test3$C, test3$D, sep=",")
[1] "1,a,a b,2"  "1,a,a b,2"  "1,a,a b,2"  "1,a,a b,NA" "1,a,a b,2"

> # pasting a single row works as expected
> paste(test3[3,], collapse=",")
[1] "1,a,a b,2"

## workaround
> test3[is.na(test3)] <- "NA"
> apply(test3, 1, paste, sep="", collapse=",")
[1] "1,a,a b,2"  "1,a,a b,2"  "1,a,a b,2"  "1,a,a b,NA" "1,a,a b,2"



-- 
Sarah Goslee
http://www.functionaldiversity.org
