Hi

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of Sam Steingold
> Sent: Thursday, September 06, 2012 3:43 PM
> To: David Winsemius
> Cc: r-help@r-project.org
> Subject: Re: [R] merge a list of data frames
>
> > * David Winsemius <qjvafrz...@pbzpnfg.arg> [2012-09-05 21:02:16 -0700]:
> >
> > On Sep 5, 2012, at 8:51 PM, Sam Steingold wrote:
> >
> >> I have a list of data frames:
> >>
> >>> str(data)
> >> List of 4
> >>  $ :'data.frame': 700773 obs. of 3 variables:
> >>   ..$ V1: chr [1:700773] "200130446465779" "200070050127778" "200030633708779" "200010587002779" ...
> >>   ..$ V2: int [1:700773] 0 0 0 0 0 0 0 0 0 0 ...
> >>   ..$ V3: num [1:700773] 1 1 1 1 1 ...
> >>  $ :'data.frame': 700773 obs. of 3 variables:
> >>   ..$ V1: chr [1:700773] "200130446465779" "200070050127778" "200030633708779" "200010587002779" ...
> >>   ..$ V2: int [1:700773] 0 0 0 0 0 0 0 0 0 0 ...
> >>   ..$ V3: num [1:700773] 1 1 1 1 1 ...
> >>  $ :'data.frame': 700773 obs. of 3 variables:
> >>   ..$ V1: chr [1:700773] "200130446465779" "200070050127778" "200030633708779" "200010587002779" ...
> >>   ..$ V2: int [1:700773] 0 0 0 0 0 0 0 0 0 0 ...
> >>   ..$ V3: num [1:700773] 1 1 1 1 1 ...
> >>  $ :'data.frame': 700773 obs. of 3 variables:
> >>   ..$ V1: chr [1:700773] "200160325893778" "200130647544079" "200130446465779" "200120186959078" ...
> >>   ..$ V2: int [1:700773] 0 0 0 0 0 0 0 0 0 0 ...
> >>   ..$ V3: num [1:700773] 1 1 1 1 1 1 1 1 1 1 ...
> >>
> >> I want to merge them.
> >
> > Why? What are you expecting?
>
> these are the results of applying a model to the test data.
> the first column is the ID
> the second column is the actual value
> the third column is the model score
>
> after I will merge the frames, I will
> 1. check that all the V2 columns are identical and drop all but one (I
> guess I could just merge on c("V1","V2") instead, right?)

colSums(apply(do.call(cbind,lapply(data, "[", "V2")),1,diff)!=0)

should give you 0 for every row if there is no difference (this compares row by row, so the rows must be in the same order in every data frame).

> 2. compute the sum (or the mean, whatever is easier) of all the V3
> columns

sapply(lapply(data, "[[", "V3"), sum)
sapply(lapply(data, "[[", "V3"), mean)

should give you the sum or the mean of V3 for each data frame. Sorting them is straightforward.

The most tedious part of my response was preparing toy data. So please be kind to us next time and provide a sample of your data in an appropriate way, e.g.

dput(lapply(data, head))

Regards
Petr

> 3. sort by the sum/mean of the V3 columns and evaluate the combined
> model using the lift quality metric
> (http://dl.acm.org/citation.cfm?id=380995.381018)
>
> I have many more score files (not just 4), so it is not practical for
> me to rename the column to something unique.
>
> --
> Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
> http://www.childpsy.net/ http://www.memritv.org http://truepeace.org
> http://jihadwatch.org http://mideasttruth.com http://americancensorship.org
> To be popular with ladies one has to be smart, handsome & rich. Or to be a cat.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
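
For what it's worth, the three steps in the quoted question could be sketched on toy data roughly as follows. This is only a sketch under assumptions, not a tested recipe for the real files: the toy list `data`, the helper `make_frame` and the names `aligned` and `combined` are invented for illustration; every ID is assumed to occur exactly once in each frame; and step 2 is read as a per-ID mean of V3 across the frames (not one number per frame). The rows are ordered by V1 first, because the str() output in the question shows the fourth frame starting with different IDs than the other three.

```r
# Toy stand-in for the real list: three small score frames with the same
# IDs (V1) and actual values (V2), but shuffled rows and different scores (V3).
set.seed(1)
ids    <- c("a", "b", "c", "d", "e")
actual <- c(0L, 1L, 0L, 0L, 1L)
make_frame <- function() {          # hypothetical helper, toy data only
  idx <- sample(length(ids))
  data.frame(V1 = ids[idx], V2 = actual[idx], V3 = runif(length(ids)),
             stringsAsFactors = FALSE)
}
data <- replicate(3, make_frame(), simplify = FALSE)

# Put every frame into the same row order (assumes unique IDs per frame).
aligned <- lapply(data, function(d) d[order(d$V1), ])

# The ID columns must now agree, otherwise row-wise comparison is meaningless.
stopifnot(all(sapply(aligned, function(d) identical(d$V1, aligned[[1]]$V1))))

# 1. Check that V2 is identical across all frames.
v2 <- do.call(cbind, lapply(aligned, "[[", "V2"))
stopifnot(all(v2 == v2[, 1]))

# 2. Per-ID mean of the scores across frames (use rowSums() for the sum).
v3 <- do.call(cbind, lapply(aligned, "[[", "V3"))
combined <- data.frame(V1 = aligned[[1]]$V1,
                       V2 = aligned[[1]]$V2,
                       score = rowMeans(v3),
                       stringsAsFactors = FALSE)

# 3. Sort by the combined score, highest first.
combined <- combined[order(combined$score, decreasing = TRUE), ]
print(combined)
```

Since the V3 columns are bound into a plain matrix, nothing has to be renamed however many score files there are. The lift evaluation itself is not attempted here.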