On 04/15/2014 06:24 AM, umair durrani wrote:
Hi, I have a big data frame with millions of rows and more than 20 columns. Let
me first describe the data to make the question clearer. The original data
frame consists of the locations, velocities and accelerations of 2169 vehicles
during a 15-minute period. Each vehicle has a unique Vehicle.ID, the ID of the
time frame in which it was observed (Frame.ID), the velocity of the vehicle in
that frame (svel), the acceleration of the vehicle in that frame (sacc) and
the class of the vehicle (vehicle.class: 1 = motorcycle, 2 = car, 3 = truck).
These variables were recorded every 0.1 seconds, i.e. each frame is 0.1 seconds
long. Here are the first 6 rows:
dput(head(df))
structure(list(Vehicle.ID = c(2L, 2L, 2L, 2L, 2L, 2L), Frame.ID = 133:138,
    Vehicle.class = c(2L, 2L, 2L, 2L, 2L, 2L), Lane = c(2L, 2L, 2L, 2L, 2L, 2L),
    svel = c(37.29, 37.11, 36.96, 36.83, 36.73, 36.64),
    sacc = c(0.07, 0.11, 0.15, 0.19, 0.22, 0.25)),
    .Names = c("Vehicle.ID", "Frame.ID", "Vehicle.class", "Lane", "svel", "sacc"),
    row.names = 7750:7755, class = "data.frame")
At some points during the 15-minute recording period, vehicles come to a
complete stop, i.e. svel==0. This continues for some frames and then the
vehicles gain speed again. For the purpose of reproducibility I am creating an
example data set as follows:
x<- data.frame(Vehicle.ID = c(rep(10,5), rep(20,5), rep(30,5), rep(40,5), rep(50,5)),
    vehicle.class = c(rep(2,10), rep(3,10), rep(1,5)),
    svel = rep(c(1,0,0,0,3),5),
    sacc = rep(c(0.3,0.001,0.001,0.002,0.5),5))
As described above, some vehicles stop and have zero velocity for some time,
then accelerate to get back up to speed. I want to find the acceleration, sacc,
that they apply when moving off from a standstill. This means that I should be
able to look at the last row in which svel==0 and the FIRST row AFTER it. In
the example data, the car (vehicle.class==2) with Vehicle.ID==10 had a
velocity, svel, equal to 1 in the first row. It then stopped for 3 frames
(3 consecutive rows) and accelerated to a velocity, svel, equal to 3. I want
the acceleration, sacc, it applied in those 2 frames (rows 4 and 5 for vehicle
10, which come out to be 0.002 and 0.500). This means that for the example
data, the following should be the output by vehicle.class:
output<- data.frame(Vehicle.ID = c(10,10,20,20,30,30,40,40,50,50),
    vehicle.class = c(2,2,2,2,3,3,3,3,1,1),
    xf = rep(c('l','f'),5),
    sacc = rep(c(0.002,0.500),5))
xf identifies the rows: l is the last row in which svel==0 and f is the first
row after that. I have tried using plyr and a for loop to split by
vehicle.class but am not sure how to extract the sacc. Please note that xf
should be part of the output; it is not in the given data. The original data
frame df has 2169 vehicles. Some stopped and some did not, so not all vehicles
had svel==0. The vehicles that did stop did not do so at the same time. Also,
the number of rows in which svel==0 differs from vehicle to vehicle.
Thanks,
Umair Durrani
Master's candidate
Civil and Environmental Engineering
University of Windsor
Hi Umair,
This may be a bit slow, but I think it will do what you want:
initacc<-function(x) {
 # one row of NAs as a placeholder until the first start is found
 xout<-matrix(rep(NA,4),nrow=1)
 for(drow in 2:dim(x)[1]) {
  # stopped in the previous row, moving in this one
  if(x[drow-1,"svel"] == 0 && x[drow,"svel"] > 0) {
   if(!is.na(xout[1,1])) {
    xout<-rbind(xout,c(x[drow-1,"Vehicle.ID"],
     x[drow-1,"vehicle.class"],0,x[drow-1,"sacc"]))
   }
   else {
    xout[1,]<-c(x[drow-1,"Vehicle.ID"],
     x[drow-1,"vehicle.class"],0,x[drow-1,"sacc"])
   }
   xout<-rbind(xout,c(x[drow,"Vehicle.ID"],
    x[drow,"vehicle.class"],1,x[drow,"sacc"]))
  }
 }
 xout<-as.data.frame(xout)
 names(xout)<-
  c("Vehicle.ID","vehicle.class","xf","sacc")
 # xf: 0 = last stopped row ("l"), 1 = first moving row ("f")
 xout$xf<-ifelse(xout$xf,"f","l")
 return(xout)
}
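
If the loop above turns out to be too slow on millions of rows, the same
comparison of each row with the one before it can be done on whole columns at
once. What follows is only a sketch, not a tested solution for the real data.
The name initacc_vec is made up for illustration, and it assumes that the rows
are sorted by Vehicle.ID and then by frame, and that the columns are named as
in the example data. It also adds one check that the loop does not make: both
rows of a pair must belong to the same vehicle.

```r
# Example data from the original post (5 vehicles, 5 frames each).
x <- data.frame(
  Vehicle.ID    = rep(c(10, 20, 30, 40, 50), each = 5),
  vehicle.class = c(rep(2, 10), rep(3, 10), rep(1, 5)),
  svel          = rep(c(1, 0, 0, 0, 3), 5),
  sacc          = rep(c(0.3, 0.001, 0.001, 0.002, 0.5), 5)
)

# initacc_vec is a hypothetical name. Assumes rows are ordered by
# Vehicle.ID and then by frame.
initacc_vec <- function(x) {
  n    <- nrow(x)
  prev <- seq_len(max(n - 1, 0))  # candidate "l" rows
  nxt  <- prev + 1                # candidate "f" rows
  # A start from standstill: stopped in one row, moving in the next,
  # and both rows belong to the same vehicle.
  hit <- x$svel[prev] == 0 & x$svel[nxt] > 0 &
    x$Vehicle.ID[prev] == x$Vehicle.ID[nxt]
  l <- which(hit)                 # which() also drops any NA comparisons
  f <- l + 1
  # Interleave the row indices as l1, f1, l2, f2, ...
  idx <- as.vector(rbind(l, f))
  data.frame(
    Vehicle.ID    = x$Vehicle.ID[idx],
    vehicle.class = x$vehicle.class[idx],
    xf            = rep(c("l", "f"), length(l)),
    sacc          = x$sacc[idx],
    stringsAsFactors = FALSE
  )
}

res <- initacc_vec(x)
print(res)
```

On the example data this gives ten rows, an l row and an f row for each of the
five vehicles, with sacc alternating between 0.002 and 0.5. A vehicle that
stops more than once would contribute one l/f pair per stop.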
Jim
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.