Sorry, my attempt wasn't quite good enough. I didn't
consider the possibility of a 'negative' value in a
character/factor column. To fix that, see inline below.
On 2010-05-17 14:32, Peter Ehlers wrote:
On 2010-05-17 12:54, Henrique Dallazuanna wrote:
Try this:
newData <- sapply(numdat, function(x)
  lapply(strsplit(as.character(x), '-'),
         function(.x) mean(as.numeric(.x))))
There's a potential problem if numdat contains negative numbers.
It would be better to restrict the recoding to character or
factor columns.
cl <- sapply(numdat, class)
idx <- which(cl %in% c('character','factor'))
g <- function(x){
  sapply(strsplit(as.character(x),"-"),
         function(.x) mean(as.numeric(.x), na.rm=TRUE))
}
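A quick check of this first version of g() on a made-up vector shows the gap. The test values are invented for illustration and are not from the original data; the point is only that as.numeric("") is NA and na.rm=TRUE then drops it, so a leading minus sign is lost.

```r
# First version of g(), copied from above
g <- function(x) {
  sapply(strsplit(as.character(x), "-"),
         function(.x) mean(as.numeric(.x), na.rm = TRUE))
}

# Made-up test values: a range, a plain number, and a negative number
res_old <- g(c("1-2", "4", "-3"))
res_old
# "1-2" gives 1.5 and "4" gives 4 as intended, but "-3" gives 3:
# strsplit("-3", "-") is c("", "3"), as.numeric("") is NA, and
# na.rm = TRUE silently discards it.
```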
Replace function g() with
g <- function(x){
  sapply(strsplit(as.character(x),"-"),
         function(.x) ifelse(.x[1] == "",
                             -as.numeric(.x[2]),
                             mean(as.numeric(.x)))
  )
}
Since strsplit("-3", "-") produces c("", "3"), we recognize
any list component of the form c("", "a") as representing -a.
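As a small sanity check of that rule, here is a sketch that prints what strsplit() returns and then applies the revised g() to a made-up vector (the test values are assumptions chosen for illustration, not part of the original data):

```r
# What strsplit() returns for a range, a negative value, and a plain number
strsplit(c("1-2", "-3", "4"), "-")

# Revised g(), copied from above
g <- function(x) {
  sapply(strsplit(as.character(x), "-"),
         function(.x) ifelse(.x[1] == "",
                             -as.numeric(.x[2]),
                             mean(as.numeric(.x))))
}

# Made-up test values, including a missing value
res <- g(c("1-2", "-3", "4", NA))
res
```

Here the range becomes 1.5, "-3" keeps its sign, and NA stays NA. Note that a value such as "-3-5" splits into c("", "3", "5") and would come back as -3 under this rule, so ranges with a negative lower bound would need separate handling.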
-Peter Ehlers
newData <- numdat
for(i in idx) newData[,i] <- g(newData[,i])
newData
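For anyone who wants to run the whole thing in one go, here is a self-contained sketch that combines the example data from the original post with the revised g(). It assumes a reasonably recent R, where read.table() accepts a text= argument (used here instead of textConnection()); the variable names simply follow the thread.

```r
# Example data from the original post, read via read.table(text = ...)
myData <- read.table(text = "id, v1, v2, v3
a,1,2,3
b,1-2,,3-4
c,,3,4", header = TRUE, sep = ",")
numdat <- myData[, -1]

# Revised g(): ranges become their mean, a leading "-" marks a negative
g <- function(x) {
  sapply(strsplit(as.character(x), "-"),
         function(.x) ifelse(.x[1] == "",
                             -as.numeric(.x[2]),
                             mean(as.numeric(.x))))
}

# Recode only the character/factor columns; numeric ones are left alone
cl <- sapply(numdat, class)
idx <- which(cl %in% c("character", "factor"))
newData <- numdat
for (i in idx) newData[, i] <- g(newData[, i])
newData
```

The first and third columns contain dashed entries, so they are read as character (or factor in older versions of R) and get recoded; the second column is already numeric and is passed through untouched. Empty fields end up as NA.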
-Peter Ehlers
On Mon, May 17, 2010 at 3:29 PM, Juliet Hannah <juliet.han...@gmail.com> wrote:
I am recoding some data. Many values that should be 1.5 are recorded
as 1-2. Some example data and my solution are below. I am curious about
better approaches or any other suggestions. Thanks!
# example input data
myData<- read.table(textConnection("id, v1, v2, v3
a,1,2,3
b,1-2,,3-4
c,,3,4"),header=TRUE,sep=",")
closeAllConnections()
# the first column is IDs so remove that
numdat<- myData[,-1]
# function to change dashes: 1-2 to 1.5
myrecode <- function(mycol)
{
  newcol <- mycol
  newcol <- gsub("1-2","1.5",newcol)
  newcol <- gsub("2-3","2.5",newcol)
  newcol <- gsub("3-4","3.5",newcol)
  newcol <- as.numeric(newcol)
}
newData<- data.frame(do.call(cbind,lapply(numdat,myrecode)))
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help