> Color me puzzled. Can you express the run more clearly in Boolean logic?

It's a bit tedious to explain in Boolean logic.
Suppose the data is subsetted according to the two distinct 'clm' values (i.e. one set consisting only of "General" and the other only of "Life"):

General.dat

    id  loc  clm
1   A1  B1   General
2   A2  B2   General
3   A3  B3   General
4   A4  B4   General
5   A5  B5   General
6   A3  B1   General
7   A3  B3   General
8   A3  B3   General
9   A4  B4   General

Life.dat

    id  loc  clm
1   A2  B2   Life
2   A3  B3   Life
3   A4  B4   Life
4   A5  B5   Life

Basically, the records in General.dat and Life.dat with the same color (matching pairs) are put into one data set. The remaining records form the other data set. (Although rows 7 and 8 in General.dat also match row 2 in Life.dat, they are duplicates of row 3 in General.dat and require further attention. Similarly, row 9 in General.dat is a duplicate of row 4.)

> If someone has five policies: 3 Life and 2 General ... is he in or out?

He is in, with 1 Life policy, as long as the policies are identical (i.e. same 'id' and 'loc' values).

> Applying the alternate strategy to that data set I get:
> out <- tapply( dat$clm, dat$uid, paste ,collapse=",")
>
> > out
>                          A1.B1                   A2.B2                   A3.B1
>                      "General"          "General,Life"               "General"
>                          A3.B3                   A4.B4                   A5.B5
> "General,Life,General,General"  "General,Life,General"          "General,Life"
>
> Please explain why you want A3.B3.

A3.B3 (2 records) and A4.B4 (1 record) are required in order to examine the duplicated records that match between 'General' and 'Life' (identical 'id' and 'loc').

> --
> David.
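
For reference, the split described above could be sketched in base R roughly as follows. This is only a sketch under stated assumptions, not code from the thread: the data frames `general` and `life` are typed in from the listings, `uid` is built as `id.loc` to mirror the `dat$uid` in the quoted `tapply` call, and keeping the flagged duplicates in a separate third data frame (`dups`) is an assumption about what "further attention" means.

```r
# Data as listed in the post; the object names 'general' and 'life' are assumptions
general <- data.frame(
  id  = c("A1", "A2", "A3", "A4", "A5", "A3", "A3", "A3", "A4"),
  loc = c("B1", "B2", "B3", "B4", "B5", "B1", "B3", "B3", "B4"),
  clm = "General",
  stringsAsFactors = FALSE
)
life <- data.frame(
  id  = c("A2", "A3", "A4", "A5"),
  loc = c("B2", "B3", "B4", "B5"),
  clm = "Life",
  stringsAsFactors = FALSE
)

# Key for "identical policy": same 'id' and same 'loc'
general$uid <- paste(general$id, general$loc, sep = ".")
life$uid    <- paste(life$id, life$loc, sep = ".")

# TRUE where a General record has a partner in Life
has_partner <- general$uid %in% life$uid
# TRUE for the second and later occurrences of a key within General
is_dup <- duplicated(general$uid)

# Matched set: first General record per key plus its Life partner
matched <- rbind(general[has_partner & !is_dup, ],
                 life[life$uid %in% general$uid, ])

# Remaining set: records with no partner in the other file
remaining <- rbind(general[!has_partner, ],
                   life[!(life$uid %in% general$uid), ])

# Duplicated General records that also match a Life record
# (kept apart here for inspection; this separation is an assumption)
dups <- general[has_partner & is_dup, ]

table(dups$uid)
```

With the data above, `dups` holds rows 7, 8 and 9 of General.dat, i.e. two A3.B3 records and one A4.B4 record, which are the ones singled out in the reply.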