[R] Problem with "merge" command duplicating values

Archana Dayalu Wed, 22 Jul 2009 12:46:15 -0700

Hello,
I am attempting to merge 8 different data sets into a "grand merge" data
set; all their variable names are common except for the gas measured.
However, when I compared quick summary statistics of the merged data with
those of the unmerged data, it turned out that the merged set contains
thousands of duplicated values, and I have no idea why. I have not had this
problem with merge in the past. Any thoughts?


To illustrate:

Given the following objects (data frames), each with 1 unique and 10 common
variables:
h2_flasks
co2c13_flasks
co2o18_flasks
ch4_flasks
co2_flasks
co_flasks
n2o_flasks
co2c14_flasks

# Merge objects into one data frame ("grand merge"):
> obj.list <- ls(pattern = 'flasks')
> grand.merge <- merge(get(obj.list[1]), get(obj.list[2]), all = TRUE)
> for (ss in 3:length(obj.list)) {
      grand.merge <- merge(grand.merge, get(obj.list[ss]), all = TRUE)
  }
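
For reference, the same pairwise merge can be written as a fold over a list of data frames. This is only a sketch on toy data: the data frames h2_toy, ch4_toy and co_toy and the key columns site and flask are made up for illustration and are not the real flask data. With no "by" argument, merge() joins on every column name the two data frames share.

```r
# Toy stand-ins for the *_flasks data frames (hypothetical columns and values).
h2_toy  <- data.frame(site = c("A", "B"), flask = c(1, 2), H2  = c(500, 510))
ch4_toy <- data.frame(site = c("A", "B"), flask = c(1, 2), CH4 = c(1800, 1810))
co_toy  <- data.frame(site = c("A", "C"), flask = c(1, 3), CO  = c(90, 95))

toy.list <- list(h2_toy, ch4_toy, co_toy)

# Fold merge() over the list, keeping unmatched rows from both sides each time.
toy.merge <- Reduce(function(x, y) merge(x, y, all = TRUE), toy.list)

# One row per distinct (site, flask) key; gases not measured are NA.
print(toy.merge)
```

Because the toy keys are unique within each data frame, the result has one row per distinct key and no value is repeated.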

#CH4 data extracted from grand merge
> length(na.omit(grand.merge$CH4))
[1] 29027

#Unmerged CH4 data only (from object ch4_flasks)
> length(na.omit(ch4_flasks$CH4))
[1] 23739

So more than 5,000 CH4 values (29027 - 23739 = 5288) are mysteriously "added"
to the grand merge file. This "duplicated value" problem occurs for all gas
variables in the grand merged data, to varying degrees. (For example, H2 had
only 2 extra values added.)
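
One possible cause, which nothing in the output above confirms, is that some combinations of the common variables occur more than once within a data set. When a key matches several rows on both sides, merge() returns every pairing, so values get repeated. The sketch below reproduces that effect on toy data; the names dup_ch4, dup_co, site and flask are made up for illustration.

```r
# Hypothetical toy data: the key (site = "A", flask = 1) appears twice in each frame.
dup_ch4 <- data.frame(site = c("A", "A", "B"), flask = c(1, 1, 2),
                      CH4 = c(1800, 1805, 1810))
dup_co  <- data.frame(site = c("A", "A", "B"), flask = c(1, 1, 2),
                      CO = c(90, 92, 95))

# merge() joins on the shared columns (site, flask) and pairs every match.
dup.merge <- merge(dup_ch4, dup_co, all = TRUE)

n.before <- length(na.omit(dup_ch4$CH4))    # CH4 values before merging
n.after  <- length(na.omit(dup.merge$CH4))  # CH4 values after merging

# Count rows whose key already occurred earlier in the same frame.
key.cols <- c("site", "flask")
n.dup.keys <- sum(duplicated(dup_ch4[key.cols]))

print(c(before = n.before, after = n.after, repeated.keys = n.dup.keys))
```

Running the same duplicated() check on the common columns of each real data frame would show whether repeated keys are present there; a count of zero in every data frame would rule this explanation out.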

Thanks very much for any input.
Archana


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
