Hi:

Here's one approach to the problem you posed (I don't know if it is what you
want for the problem you intend to use it on...)

df1 <- read.table(textConnection("
id          sex        age         area
01         male      adult       LP
01         male      adult       LP
01         male      adult       LP
02       female    subadult   LP
02       female    subadult   LP
02       female    subadult   LP
02       female    subadult   LP
03       male      subadult    MR
03       male      subadult    MR
03       male      subadult    MR
03       male      subadult    MR"), stringsAsFactors = FALSE, header = TRUE)
closeAllConnections()
# replace id with a character variable (read.table drops the leading 0)
df1$id <- rep(paste('0', 1:3, sep = ''), c(3, 4, 4))

df2 <- data.frame(id = paste('0', 1:6, sep = ''), stringsAsFactors = FALSE)

# pick out rows whose id matches one in df2$id
df <- df1[df1$id %in% df2$id, ]
# keep only the first row for each id
df[!duplicated(df$id), ]

  id    sex      age area
1 01   male    adult   LP
4 02 female subadult   LP
8 03   male subadult   MR
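
If the goal is to attach these attributes to the second data frame, as in
the original question, one possible follow-up is to merge the de-duplicated
rows back onto df2. This is only a sketch: it rebuilds the toy data from
above, it assumes sex, age and area are constant within each id, and the
names attrs and out are made up for the illustration. all.x = TRUE keeps
every id from df2, filling NA where df1 has no match.

```r
# Toy data mirroring the example above (ids stored as character)
df1 <- data.frame(
  id   = rep(c("01", "02", "03"), times = c(3, 4, 4)),
  sex  = rep(c("male", "female", "male"), times = c(3, 4, 4)),
  age  = rep(c("adult", "subadult", "subadult"), times = c(3, 4, 4)),
  area = rep(c("LP", "LP", "MR"), times = c(3, 4, 4)),
  stringsAsFactors = FALSE
)
df2 <- data.frame(id = sprintf("%02d", 1:6), stringsAsFactors = FALSE)

# Keep the first row per id, restricted to the columns of interest
# (assumes the attributes do not vary within an id)
attrs <- df1[!duplicated(df1$id), c("id", "sex", "age", "area")]

# Left join: one row per id in df2, NA for ids absent from df1
out <- merge(df2, attrs, by = "id", all.x = TRUE)
out
```

Because attrs has at most one row per id, the merge cannot multiply rows,
so out has exactly as many rows as df2.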

I arranged the two data frames so that id was a character vector, in order
to support the leading 0.
I'm assuming your id variable is character - whichever class it is, make
sure it's consistent in both data frames. You can use str() to check the
types of your columns in each data frame.
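
A small sketch of that check, using made-up stand-ins for your AB and t2
(the values are invented; only the column name id comes from your post).
It assumes id is a factor in one data frame and character in the other,
and converts the factor with as.character(), which returns the labels
rather than the underlying integer codes.

```r
# Hypothetical stand-ins for the real data frames
AB <- data.frame(id  = factor(c("01", "01", "02")),
                 sex = c("male", "male", "female"),
                 stringsAsFactors = FALSE)
t2 <- data.frame(id = c("01", "02", "03"), stringsAsFactors = FALSE)

# Inspect the type of id in each data frame
str(AB$id)
str(t2$id)

# Make the two id columns the same class before matching or merging
AB$id <- as.character(AB$id)
same_class <- identical(class(AB$id), class(t2$id))
same_class
```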

HTH,
Dennis

On Wed, Feb 9, 2011 at 6:09 AM, Nathaniel <nathanielr...@hotmail.com> wrote:

>
> Hi R users,
>
> I am trying to extract some attributes (age, sex, area) from dataframe "AB"
> that has 101,269 observations of 28 variables to dataframe "t2" that has 47
> observations of 6 variables.  They share a column called "id", which is a
> factor with 47 levels.  I want to end up with a dataframe that has 47
> observations of 9 variables (the original 6 variables of t2, plus age, sex,
> and area). The issue I am having is that AB has multiple entries for each
> id, and so I can't use merge because there is more than one match, so all
> possible matches contribute one row each--i.e., this code gives me dataframe
> "t3" of 101,269 observations of 33 variables:
>
> >t3<-merge(t2,AB,by="id",all=FALSE)
>
> Dataframe AB (24 variables omitted from example dataframe):
>
> id          sex        age         area
> 01         male      adult       LP
> 01         male      adult       LP
> 01         male      adult       LP
> ...
> 02       female    subadult   LP
> 02       female    subadult   LP
> 02       female    subadult   LP
> 02       female    subadult   LP
> ...
> 03       male      subadult    MR
> 03       male      subadult    MR
> 03       male      subadult    MR
> 03       male      subadult    MR
> ...
>
> Dataframe t2 (5 variables omitted from example dataframe):
>
> id
> 01
> 02
> 03
> 04
> 05
> 06
> ....
>
> This is the structure I want for dataframe t3 (5 variables omitted from
> example dataframe):
>
> id          sex        age         area
> 01         male      adult       LP
> 02       female    subadult   LP
> 03       male      subadult   MR
> ...
>
> Hopefully this all makes sense and someone knows a solution.  Thanks in
> advance for taking a look at my problem and helping out (I hope!).
>
> Nathaniel
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Need-help-merging-two-dataframes-tp3297313p3297313.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
