I've got a simple data.frame with a factor variable called 'case', which identifies one subject, and the date of an event ('obs'), each row representing one observation. One case can have many (or few) observations over time in the data set.
I've created a crude data.frame by way of a clunky but reproducible example. My objective is simply to create a variable that captures the rank of each event within its case in date order, 1 being the first up to n being the nth. To this end I've used the 'ave' command as below.

set.seed(66)
d<-(seq(as.Date("2001/01/01"),as.Date("2011/12/31"),"days"))
obs<-(as.Date(sample(d,200,replace=TRUE)))
obs<-as.data.frame(obs)
case<-(case=(sample(LETTERS[1:8],200,replace=TRUE)))
case<-as.data.frame(case)
df<-cbind(case,obs)
df$rank<-ave(df$obs,df$case, FUN=rank)

This throws one of those

"Error in as.Date.numeric(value) : 'origin' must be supplied"

errors. I get why this is happening (I have not explicitly set the date origin when I set up the date variables), but my question is where do I do this? I've tried variations of the above where I've used origin="1900-01-01" in various lines of the code, but I am still getting the error.

Also, by way of a supplementary question: in my actual application I am bringing in a lot of data from .csv files which contain data originally generated by the data owner in Excel. Does this mean that I always need to set the origin at 1st Jan 1900?

Any help gratefully received,

GavinR

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
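For what it's worth, here is a minimal sketch of one possible workaround, not a definitive answer. As far as I can tell, ave() writes the numeric ranks back into a copy of the Date vector it was given, and that assignment is what calls as.Date() on plain numbers and asks for an origin, so setting an origin where the dates are created would not help. (Recent R versions appear to default the origin to 1970-01-01, in which case the error may not appear but the ranks would print as dates.) The sketch assumes it is acceptable to hand ave() the dates as day counts via as.numeric(). The Excel lines at the end assume the .csv holds Excel serial numbers from the Windows date system, where "1899-12-30" is the commonly used origin; if the file holds date strings instead, no origin is needed, only a format.

```r
# Same toy data as in the post, built in one step.
set.seed(66)
d <- seq(as.Date("2001/01/01"), as.Date("2011/12/31"), by = "days")
df <- data.frame(
  case = factor(sample(LETTERS[1:8], 200, replace = TRUE)),
  obs  = sample(d, 200, replace = TRUE)
)

# Rank the dates within each case. as.numeric() turns each Date into a
# day count, so ave() builds a numeric result instead of a Date one.
# rank() uses ties.method = "average" by default, so tied dates within
# a case would share a fractional rank.
df$rank <- ave(as.numeric(df$obs), df$case, FUN = rank)

# Inspect the first few rows in case and date order.
head(df[order(df$case, df$obs), ])

# Assumption: an Excel (Windows) serial date read from a .csv as a number.
excel_serial <- 40909
excel_date <- as.Date(excel_serial, origin = "1899-12-30")
```

If integer ranks are wanted even when two events for a case fall on the same day, something like FUN = function(x) rank(x, ties.method = "first") could be passed instead.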