I am new to R and am having difficulty merging 2 data sets, both of which have
the same 30 variables and about 2,000 records. I've looked high and low: Paul
Johnson's help page, some of the 5,420 (yikes!) results for "merge" on the
R list archives, and Google searches. I'm getting nowhere, so I thought I'd
ask.
When I try to merge by the ID variable:
> newdata <- merge(olddata_a, olddata_b, by = “ID”)
... I get the following error:
Error: unexpected input in "newdata <- merge(olddata_a, olddata_b, by = “"
Is this error symptomatic of anything in particular? When I searched, I didn't
find any examples of this error associated with a merge.
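
One possibility worth ruling out, assuming the quotation marks around ID were typed or pasted as typographic ("curly") quotes rather than plain ASCII ones: R's parser only accepts straight quotes as string delimiters. The sketch below only parses the two versions of the call as text (it does not run merge, so olddata_a and olddata_b need not exist); curly, straight, and the two result names are made up for illustration.

```r
# Curly quotes are written as \u escapes so this example itself is plain ASCII.
curly    <- "merge(olddata_a, olddata_b, by = \u201cID\u201d)"
straight <- "merge(olddata_a, olddata_b, by = \"ID\")"

# parse() only checks syntax; on failure we keep the error message instead.
curly_result    <- tryCatch(parse(text = curly),
                            error = function(e) conditionMessage(e))
straight_result <- tryCatch(parse(text = straight),
                            error = function(e) conditionMessage(e))

is.character(curly_result)      # TRUE means the curly-quote version failed to parse
is.expression(straight_result)  # TRUE means the straight-quote version parsed
```

If that is the cause, retyping the quotes directly at the R prompt (instead of pasting from a word processor or e-mail client) should be enough.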
I've also tried:
> newdata <- merge(olddata_a, olddata_b, all = TRUE)
... which doesn't give me an error, but the "merge" just stacks the 2
datasets on top of one another. (I have manually checked to make sure that
there are common ID numbers in both datasets and there are.) I assume that I'm
getting this stacked data because by not specifying a by variable, R is trying
to match on all variables in the datasets and there are no exact matches across
all the variables?
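
A minimal sketch of that suspicion on made-up data (the data frames a and b and their values are invented for illustration and are not the real datasets). When by is omitted, merge() uses every column name the two data frames share as the key, so rows only combine when all of those columns agree.

```r
# Toy data, invented for illustration: same columns, overlapping IDs.
a <- data.frame(ID = c(1, 2, 3), score = c(10, 20, 30))
b <- data.frame(ID = c(2, 3, 4), score = c(21, 30, 40))

# No `by`: the key is every shared column (ID and score), so only rows that
# agree on both are combined; the rest are kept separately by all = TRUE.
stacked <- merge(a, b, all = TRUE)
nrow(stacked)

# Key on ID only: one row per ID, and the shared non-key column is split
# into suffixed copies (score.x from a, score.y from b).
by_id <- merge(a, b, by = "ID", all = TRUE)
nrow(by_id)
names(by_id)
```

Comparing nrow(stacked) with nrow(by_id) on the real data should show whether the all-columns key is what produces the stacked result.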
I've also tried to strip any white space that may be causing problems in the
file. (Not sure if this is a good or bad idea.)
> newdata <- read.table("olddata.csv", header = TRUE, strip.white = TRUE,
+                       sep = ",")
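
To check whether stray white space in the key actually matters, a rough sketch along these lines may help. It assumes the IDs are read as character strings; the toy data frames a and b are hypothetical, and trimws() requires R 3.2.0 or later.

```r
# Toy IDs with stray spaces, invented for illustration.
a <- data.frame(ID = c("101 ", "102", "103"), stringsAsFactors = FALSE)
b <- data.frame(ID = c("101", " 102", "104"), stringsAsFactors = FALSE)

# Number of IDs the two data frames share before any cleaning.
before <- length(intersect(a$ID, b$ID))

# Strip leading and trailing white space from the key column in both.
a$ID <- trimws(a$ID)
b$ID <- trimws(b$ID)

# Number of shared IDs after cleaning.
after <- length(intersect(a$ID, b$ID))

c(before = before, after = after)
```

If the two counts are the same on the real data, white space in ID is probably not the problem.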
I'd be really grateful for any help I could get.
Thank you.
Bill
--
William F. Mabe, PhD
Director of Research and Evaluation
Faculty Fellow
John J. Heldrich Center for Workforce Development
30 Livingston Ave., Office# 210
New Brunswick, NJ 08901
p: (732)932-4100 x6210
f: (732)932-3454