Thanks. I finally got around to implementing it, and it works. But I think the steps that produce master_reduced can be compressed into

master_reduced = merge(master,control)

> master
  clientId date value
1        1 1001 10001
2        2 1002 10002
3        3 1003 10003
4        4 1004 10004
5        2 1005 10005
> control
  clientId mindate maxdate control.params
1        2     100    1005              1
2        3    1005    1005              2
> merge(master,control)
  clientId date value mindate maxdate control.params
1        2 1002 10002     100    1005              1
2        2 1005 10005     100    1005              1
3        3 1003 10003    1005    1005              2

with the added advantage that clientId doesn't occur twice. Is this just coincidence, or can I use this technique reliably for merges of this sort?

For comparison, the cbind version gives:

> master_reduced
  clientId date value clientId mindate maxdate control.params
2        2 1002 10002        2     100    1005              1
3        3 1003 10003        3    1005    1005              2
5        2 1005 10005        2     100    1005              1

On Jan 21, 5:20 am, "Moritz Grenke" <r-l...@360mix.de> wrote:
> #dummy data:
> master=as.data.frame(list(clientId=c(1:4,2), date=1001:1005,
> value=10001:10005))
> control=as.data.frame(list(clientId=c(2,3), mindate=c(100,1005),
> maxdate=c(1005,1005), control.params=c(1,2)))
>
> #reducing master df:
> #generating "TRUE FALSE index":
> idIndex=master$clientId %in% control$clientId
>
> #choose only those lines where index==TRUE
> master_reduced=master[idIndex,]
> master_reduced
>
> #merging dfs:
> mergingIndex= match(master_reduced$clientId, control$clientId)
> master_reduced=cbind(master_reduced, control[mergingIndex,])
> master_reduced
>
> #finally choose those lines where date is in range
> dateIndex=master_reduced$date>master_reduced$mindate &
> master_reduced$date<master_reduced$maxdate
> finalDF=master_reduced[dateIndex,]
> finalDF
>
> Hope this helps
> Moritz
> _________________________
> Moritz Grenke http://www.360mix.de
>
> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
> Behalf Of analys...@hotmail.com
> Sent: Friday, 21 January 2011 03:02
> To: r-h...@r-project.org
> Subject: [R] data and parameters
>
> (1) I have a master data frame that reads
>
> ClientID |date |value
>
> (2) I also have a control data frame that reads
>
> Client ID| Min date| Max date| control parameters
>
> The control data set may not have all client IDs.
>
> I want to use the control data frame on the master data frame to
> remove client IDs that don't exist in the control data set and for
> those that do, remove dates outside the required range.
>
> (3) We can either put the control parameters on all rows corresponding
> to a client ID or look it up from the control data frame
>
> (4) The basic function call looks like
>
> do.something(df,control parameters)
>
> where df is the subset of the master data set that corresponds to a
> single client with unwanted dates removed and the control parameters
> pertain to that client.
>
> Any help would be appreciated.
>
> ______________________________________________
> r-h...@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
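
On the merge question: by default merge() joins on the column names the two data frames have in common and keeps only rows that have a match in both (all = FALSE), so the single clientId column is the documented behaviour rather than a coincidence. It does depend on clientId being the only shared column name. A minimal sketch that names the key explicitly, using the dummy data from the quoted message (the by = "clientId" argument and the subset() call are additions for illustration, not part of the original code):

```r
# Dummy data, same values as in the quoted message.
master <- data.frame(clientId = c(1:4, 2), date = 1001:1005,
                     value = 10001:10005)
control <- data.frame(clientId = c(2, 3), mindate = c(100, 1005),
                      maxdate = c(1005, 1005), control.params = c(1, 2))

# Name the join column explicitly so the result does not change if the two
# data frames later happen to share another column name.
master_reduced <- merge(master, control, by = "clientId")

# Keep rows whose date lies strictly between mindate and maxdate, as in the
# quoted code; use >= and <= instead if the bounds should count as in range.
finalDF <- subset(master_reduced, date > mindate & date < maxdate)
finalDF
```

With this data only the row for clientId 2 with date 1002 survives the strict date filter.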