Hi,

I have recorded online/offline timestamps per user that look like this:


username,online_time,offline_time
a,2011-11-01 16:16:56.692572+01,2011-11-01 21:06:16.388903+01
a,2011-11-01 21:07:14.204367+01,2011-11-01 21:34:21.47081+01
a,2011-11-01 21:38:09.501356+01,2011-11-01 21:53:45.272321+01

For each user I want to get a probability distribution over the day, i.e.
for each minute of a day I want the probability that the user is online.

I have come up with some helper functions that let me find the minute of
the day and the duration of the online session:


data <- read.table("availability.csv", header=TRUE, sep=",")

# Duration of each online session for one user, in minutes
diff_online <- function(username)
{
  rows <- which(data$username == username)
  # strptime ignores the trailing fractional seconds and the "+01" offset
  user_on  <- strptime(data$online_time[rows], format="%Y-%m-%d %H:%M:%S")
  user_off <- strptime(data$offline_time[rows], format="%Y-%m-%d %H:%M:%S")
  difftime(user_off, user_on, units="mins")
}



min.of.day <- function(dtstr) # minute of day, 0..1439
{
  dt <- strptime(dtstr, format="%Y-%m-%d %H:%M:%S")
  h <- as.integer(strftime(dt, "%H"))
  m <- as.integer(strftime(dt, "%M"))
  h*60 + m  # seconds are not needed at minute resolution
}

But there I am stuck. I thought of creating a factor of the minutes a user
is online and using that to calculate a density, and had a couple of other
ideas. But I strongly feel that there is a more straightforward solution
available in R.
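
To make the idea concrete, here is a rough sketch of the kind of thing I have
in mind. It is only an illustration under assumptions that may not hold for
the real data: sessions never cross midnight, days without any session are
not counted as observed days, and the "+01" offset is simply ignored. The
names parse_ts, minute_of_day and online_prob are made up for this sketch.

```r
# Parse timestamps; strptime ignores the trailing fraction and "+01" offset.
parse_ts <- function(x) {
  as.POSIXct(strptime(as.character(x), format = "%Y-%m-%d %H:%M:%S", tz = "UTC"))
}

# Minute of the day, 0..1439
minute_of_day <- function(t) {
  lt <- as.POSIXlt(t, tz = "UTC")
  lt$hour * 60L + lt$min
}

# Fraction of observed days on which the user was online in each minute.
# ASSUMPTION: no session crosses midnight.
online_prob <- function(online_time, offline_time) {
  on   <- parse_ts(online_time)
  off  <- parse_ts(offline_time)
  days <- format(on, "%Y-%m-%d")
  day_levels <- unique(days)
  # one row per observed day, one column per minute of the day
  is_online <- matrix(FALSE, nrow = length(day_levels), ncol = 1440)
  for (i in seq_along(on)) {
    mins <- seq(minute_of_day(on[i]), minute_of_day(off[i]))
    is_online[match(days[i], day_levels), mins + 1L] <- TRUE
  }
  colMeans(is_online)
}

# The three sample rows from above
csv <- "username,online_time,offline_time
a,2011-11-01 16:16:56.692572+01,2011-11-01 21:06:16.388903+01
a,2011-11-01 21:07:14.204367+01,2011-11-01 21:34:21.47081+01
a,2011-11-01 21:38:09.501356+01,2011-11-01 21:53:45.272321+01"
data <- read.csv(text = csv, stringsAsFactors = FALSE)
a <- data[data$username == "a", ]
p <- online_prob(a$online_time, a$offline_time)
```

Here p[k + 1] is the estimate for minute k of the day, so something like
plot(0:1439, p, type = "s") should show the daily profile. With only one day
of data every value is 0 or 1, of course.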

Thanks for any help,
wr


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
