Re: [R] Data permutation

Gabor Grothendieck Mon, 13 Dec 2010 09:19:49 -0800

On Mon, Dec 13, 2010 at 11:37 AM, matteop <mpr...@iese.edu> wrote:
>
> Hello R User,
>
> I am new in R and trying to migrate from SAS. I have to convert a table that
> look like this
>
> YEAR    FIRM            ID_NAME         VALUE
> 1994    Microsoft       John Doe                5
> 1994    Microsoft       Mark Smith              3
> 1994    Microsoft       David Ring              2
>
> In this:
> YEAR    FIRM            ID1             vALUE   ID2             VALUE
> 1994    Microsoft       John Doe        5               Mark Smith            
>   3
> 1994    Microsoft       John Doe        5               David Ring            
>   2
> 1994    Microsoft       Mark Smith      3               David Ring            
>   2
>
> I have to do it for all the possible pair combination of ID_Name linked to
> the same firm for any given year in my sample.
> Do you have any suggestion?
>


Here are a few possibilities:

1. merge/subset

subset(merge(DF, DF, by = 1:2), as.character(ID_NAME.x) <
as.character(ID_NAME.y))

2. sqldf with default names

library(sqldf)
sqldf("select * from DF a join DF b using(YEAR, FIRM)
  where a.ID_NAME < b.ID_NAME", method = "raw")

Its important that you use method = "raw" to override the automatic
class assignment heuristic which in this case tries to assign factors
to the ID_NAME columns but gets the factor levels wrong.  If you use
method = "raw" it should work ok here.

3. sqldf with new names

This also works and does not need method = "raw":

sqldf("select YEAR, FIRM,
   a.ID_NAME ID_NAME1, a.VALUE VALUE1,
   b.ID_NAME ID_NAME2, b.VALUE VALUE2
   from DF a join DF b using(YEAR, FIRM)
   where a.ID_NAME < b.ID_NAME")

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Data permutation

Reply via email to