Re: [R] by-group processing

Jorge Ivan Velez Wed, 06 May 2009 17:20:23 -0700

Dear Max,
Calling your data set "d" instead of "data", here is one way:


# First order the data by ID
d <- with(d, d[order(ID),] )

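# Then use tapply to get the position of the maximum N within each ID;
# cumsum turns those positions into row numbers of the sorted data
# (N is a running count, so each maximum is the last row of its ID)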
d[cumsum(with(d, tapply(N, ID, which.max))),]
#       ID Type N
# 7  45900    I 7
# 24 46550    I 7
# 10 49270    E 3
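
If N did not always peak on the last row of each ID, the cumsum() of the which.max() positions would no longer be valid row numbers. Here is a minimal sketch of an alternative that does not depend on row order, using ave() from base R. The data frame is rebuilt from your post only so that the snippet runs on its own, and "res" is just an illustrative name:

```r
# Rebuild the example data from the original post (row names as posted)
d <- data.frame(
  ID   = rep(c(45900, 49270, 46550), times = c(7, 3, 7)),
  Type = c("A", "B", "C", "D", "E", "F", "I",
           "A", "B", "E",
           "A", "B", "C", "D", "E", "F", "I"),
  N    = c(1:7, 1:3, 1:7),
  row.names = c(1:10, 18:24)
)

# ave() returns, for every row, the maximum N within that row's ID;
# keep the rows whose N equals that per-ID maximum
res <- d[d$N == ave(d$N, d$ID, FUN = max), ]
res
```

This keeps rows 7, 10 and 24 in their original order, with no sorting step. If several rows of one ID tie for the maximum, all of them are returned.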

HTH,

Jorge


On Wed, May 6, 2009 at 6:09 PM, Max Webber <[email protected]> wrote:

> Given a dataframe like
>
>  > data
>        ID Type N
>  1  45900    A 1
>  2  45900    B 2
>  3  45900    C 3
>  4  45900    D 4
>  5  45900    E 5
>  6  45900    F 6
>  7  45900    I 7
>  8  49270    A 1
>  9  49270    B 2
>  10 49270    E 3
>  18 46550    A 1
>  19 46550    B 2
>  20 46550    C 3
>  21 46550    D 4
>  22 46550    E 5
>  23 46550    F 6
>  24 46550    I 7
>  >
>
> containing an identifier (ID), a variable type code (Type), and
> a running count of the number of records per ID (N), how can I
> return a dataframe of only those records with the maximum value
> of N for each ID? For instance,
>
>  > data
>        ID Type N
>  7  45900    I 7
>  10 49270    E 3
>  24 46550    I 7
>
> I know that I can use
>
>   > tapply ( data $ N , data $ ID , max )
>   45900 46550 49270
>       7     7     3
>   >
>
> to get the values of the maximum N for each ID, but how is it
> that I can find the index of these values to subsequently use to
> subscript data?
>
>
> --
> maxine-webber
>
> ______________________________________________
> [email protected] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


