I've got a simple data.frame with a factor variable called 'case', which
identifies a subject, and a date of an event ('obs'), each row representing one
observation. One case can have many (or few) observations over time in the data
set.
I've created a crude data.frame by way of a clunky but reproducible example.
My objective is simply to create a variable that captures a rank of the
occurrence of the events for each case in date order, 1 being the first up to n
being the nth. To this end I've used the 'ave' command as below.
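set.seed(66)
# every day from 2001-01-01 to 2011-12-31
d <- seq(as.Date("2001/01/01"), as.Date("2011/12/31"), "days")
# 200 random event dates
obs <- as.data.frame(sample(d, 200, replace = TRUE))
names(obs) <- "obs"
# 200 random case labels, A to H
case <- sample(LETTERS[1:8], 200, replace = TRUE)
case <- as.data.frame(case)
df <- cbind(case, obs)
# rank the dates within each case (this is the line that fails)
df$rank <- ave(df$obs, df$case, FUN = rank)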
This throws one of those "Error in as.Date.numeric(value) : 'origin' must be
supplied" errors.
I get why this is happening, that I have not explicitly set the date origin
when I set up the date variables, but my question is where do I do this? I've
tried variations of the above where I've used origin="1900-01-01" in various
lines in the above code, but I am still getting the error.
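For what it is worth, here is a minimal sketch of one possible way around the error. It assumes the failure comes from ave() trying to store the numeric ranks back into a vector of class Date, so it ranks the underlying day counts instead. The data frame 'toy' and its values are made up purely for illustration.

```r
# Small hypothetical data set with the same two columns as the example
toy <- data.frame(
  case = factor(c("A", "B", "A", "B", "A")),
  obs  = as.Date(c("2005-03-01", "2002-07-15", "2001-01-20",
                   "2009-11-30", "2003-06-10"))
)

# as.numeric() converts each Date to days since 1970-01-01, so ave() only
# ever handles plain numbers and never needs to turn ranks back into Dates
toy$rank <- ave(as.numeric(toy$obs), toy$case, FUN = rank)

# Show the result sorted by case and date
toy[order(toy$case, toy$obs), ]
```

Note that rank() gives tied dates an average rank by default; its ties.method argument changes that if whole-number ranks are needed.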
Also, by way of a supplementary question: in my actual application I am bringing
in a lot of data from .csv files which contain data originally generated by the
data owner in Excel, so does this mean that I always need to set the origin to
1st Jan 1900?
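A rough sketch of the two situations I can imagine for the Excel data, with made-up values: dates that arrive in the .csv as formatted text, and dates that arrive as Excel serial numbers. The origin "1899-12-30" is the one given in the as.Date help page for Excel's Windows 1900 date system; the "%d/%m/%Y" format is only an assumption about how the text might look.

```r
# Hypothetical values, as they might arrive from a .csv export
as_text   <- c("15/03/2005", "01/12/2010")  # dates stored as formatted text
as_serial <- c(38426, 40513)                # Excel serial numbers (Windows)

# Text dates need a format string, not an origin
d1 <- as.Date(as_text, format = "%d/%m/%Y")

# Serial numbers need an origin; 1899-12-30 matches Excel's 1900 date system
d2 <- as.Date(as_serial, origin = "1899-12-30")

# Both routes should describe the same two days
print(d1)
print(d2)
```

So the origin would only matter in the second case, where the column holds numbers rather than date strings.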
Any help gratefully received,
GavinR