On Jun 1, 2011, at 7:00 PM, kristina p wrote:

Dear R Team,

I am a new R user and I am currently trying to subset my data under a
special condition. I have gone through several pages of the subsetting
section here on the forum, but I was not able to find an answer.

My data is as follows:

ID   NAME       MS   Pol. Party
1    John       x    F
2    Mary       s    S
3    Katie      x    O
4    Sarah      p    L
5    Martin     x    O
6    Angelika   x    F
7    Smith      x    O
....

Assume this is in a data frame, 'pol', and that you have corrected the invalid column name "Pol. Party" so that it is Pol_Party. The ave function is particularly useful when you need a vector that "lines up alongside" the other columns:

pol[ave(seq_along(pol$ID), pol$Pol_Party, FUN=length) >= 3 , ]
  ID   NAME MS Pol_Party
3  3  Katie  x         O
5  5 Martin  x         O
7  7  Smith  x         O

(The use of seq_along ensures that you will get duplicates of ID that are in any qualifying parties.)
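
To see what ave is doing, it may help to look at the intermediate vector by itself. The following is only a sketch: the data frame is rebuilt by hand from the seven rows shown in the question (the rows elided by "...." are not included), and the names n_in_party and big_parties are made up for illustration.

```r
# Hypothetical reconstruction of 'pol' from the seven rows shown above
pol <- data.frame(
  ID        = 1:7,
  NAME      = c("John", "Mary", "Katie", "Sarah", "Martin", "Angelika", "Smith"),
  MS        = c("x", "s", "x", "p", "x", "x", "x"),
  Pol_Party = c("F", "S", "O", "L", "O", "F", "O"),
  stringsAsFactors = FALSE
)

# For every row, the number of rows that share that row's party.
# The result has one element per row, so it lines up with the columns of 'pol'.
n_in_party <- ave(seq_along(pol$ID), pol$Pol_Party, FUN = length)
print(n_in_party)

# Keep only the rows whose party has at least three members
big_parties <- pol[n_in_party >= 3, ]
print(big_parties)
```

With these seven rows the counts are 2 for F, 1 for S, 3 for O and 1 for L, so only the three "O" rows pass the >= 3 test.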

Another way to generate the values would be to table()-ulate and pick out the names of qualifying Parties:

> tabl.party <- table(pol$Pol_Party)
> pol[ pol$Pol_Party %in% names(tabl.party)[tabl.party >= 3], ]
  ID   NAME MS Pol_Party
3  3  Katie  x         O
5  5 Martin  x         O
7  7  Smith  x         O


I am interested in only those observations where there are at least three members of one political party. That is, I need to throw out all cases in the
example above, except for members of party "O".

Both methods use logical indexing with the "[.data.frame" function.
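
As a rough check that the two approaches select the same rows, one can build each logical vector separately and compare them. Again this is a sketch on a hand-built copy of the seven rows shown; by_ave and by_table are illustrative names, not anything from the original post.

```r
# Hypothetical reconstruction of 'pol' from the seven rows shown above
pol <- data.frame(
  ID        = 1:7,
  NAME      = c("John", "Mary", "Katie", "Sarah", "Martin", "Angelika", "Smith"),
  MS        = c("x", "s", "x", "p", "x", "x", "x"),
  Pol_Party = c("F", "S", "O", "L", "O", "F", "O"),
  stringsAsFactors = FALSE
)

# Method 1: per-row group size from ave(), compared against the threshold
by_ave <- ave(seq_along(pol$ID), pol$Pol_Party, FUN = length) >= 3

# Method 2: tabulate the parties, then test membership in the qualifying names
tabl.party <- table(pol$Pol_Party)
by_table <- pol$Pol_Party %in% names(tabl.party)[tabl.party >= 3]

# Both are plain logical vectors with one element per row of 'pol'
print(by_ave)
print(identical(by_ave, by_table))

# Either vector can be used as the row index
print(pol[by_table, ])
```

A TRUE in position i keeps row i and a FALSE drops it, which is all that "logical indexing" means here.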


Would really appreciate your help.

--
David Winsemius, MD
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
