Richard -
Is this what you're looking for?
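# simulated data: 1000 people spread at random over companies A-H
sdata = data.frame(company=sample(LETTERS[1:8],1000,replace=TRUE),
                   person=1:1000,
                   salary=rnorm(1000))
splitsdata = split(sdata,sdata$company)
# head() keeps every row when a company has fewer than 5 people;
# indexing with [1:5,] would pad such a company with NA rows
res = do.call(rbind,sapply(splitsdata,simplify=FALSE,
      function(x)head(x[order(x$salary,decreasing=TRUE),],5)))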
row.names(res) = NULL
res
  company person   salary
1       A    560 2.721923
2       A    538 2.456439
3       A    594 2.093376
4       A    947 1.960166
5       A    334 1.544756
6       B    671 2.484698
7       B    533 2.328799
. . .
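A variation on the same idea, offered only as a sketch: rank the salaries within each company with ave() and keep ranks 1 to 5, which avoids the split/rbind step. The small data frame, the seed, and the name rnk are made up for illustration; company C is deliberately given only two people to show the fewer-than-5 case.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
sdata <- data.frame(company = rep(c("A", "B", "C"), times = c(8, 6, 2)),
                    person  = 1:16,
                    salary  = rnorm(16))

# Rank salaries within each company, highest first (rank 1 = best paid).
# ties.method = "first" guarantees distinct ranks even if salaries tie.
rnk <- ave(-sdata$salary, sdata$company,
           FUN = function(x) rank(x, ties.method = "first"))

# Keep ranks 1..5; a company with fewer than 5 people keeps all its rows.
res <- sdata[rnk <= 5, ]

# Sort by company, then by descending salary, and drop the old row names.
res <- res[order(res$company, -res$salary), ]
row.names(res) <- NULL
res
```

Here res should hold five rows each for A and B and both rows for C, with no NA padding.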
- Phil Spector
Statistical Computing Facility
Department of Statistics
UC Berkeley
spec...@stat.berkeley.edu
On Thu, 16 Sep 2010, Tan, Richard wrote:
Hi Richard
Thanks for the suggestion, but I want the top 5 salaries for each company, not
for the whole list. I don't see how your way would work for that?
Thanks,
Richard
From: RICHARD M. HEIBERGER [mailto:r...@temple.edu]
Sent: Thursday, September 16, 2010 11:53 AM
To: Tan, Richard
Cc: r-help@r-project.org
Subject: Re: [R] get top n rows group by a column from a dataframe
tmp <- data.frame(matrix(rnorm(30), 10, 3,
                         dimnames=list(letters[1:10],
                                       c("company", "person",
                                         "salary"))))
tmp
      company     person      salary
a -1.04590176 -0.7841855  1.07150503
b -1.06643101  0.6545647  0.43920454
c  0.72894531 -1.3812867  0.41313659
d -0.39265263 -0.3871271  0.69404325
e  0.54028124  0.7124772  0.66630904
f -1.46931714 -0.3823353  0.03069797
g -0.33283666 -0.6351862  0.37920017
h -0.79977129  0.2605315  0.92373900
i  0.80614119  0.3727227 -1.16560563
j  0.03165012  0.4690400 -0.81966285
order(tmp$person, decreasing=TRUE)[1:min(5, length(tmp$person))]
[1] 5 2 10 9 8
tmp[order(tmp$person, decreasing=TRUE)[1:min(5, length(tmp$person))],]
      company    person     salary
e  0.54028124 0.7124772  0.6663090
b -1.06643101 0.6545647  0.4392045
j  0.03165012 0.4690400 -0.8196628
i  0.80614119 0.3727227 -1.1656056
h -0.79977129 0.2605315  0.9237390
You can easily write a function for that.
top <- function(DF, varname, howmany) {}
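The stub above is left empty in the message. One way it might be filled in, and then applied per company, is sketched below. The function body, the helper top_by(), and the data frame pay are assumptions made for illustration, not code from the original post.

```r
# Possible body for the top() stub (an assumption, not the poster's code):
# return the `howmany` rows of DF with the largest values in column
# `varname`, or every row if DF has fewer than `howmany`.
top <- function(DF, varname, howmany) {
  ord <- order(DF[[varname]], decreasing = TRUE)
  DF[head(ord, howmany), , drop = FALSE]
}

# Hypothetical helper: apply top() separately within each level of a
# grouping column and stack the pieces back into one data frame.
top_by <- function(DF, groupname, varname, howmany) {
  pieces <- lapply(split(DF, DF[[groupname]]), top,
                   varname = varname, howmany = howmany)
  res <- do.call(rbind, pieces)
  row.names(res) <- NULL
  res
}

# Made-up data: company "B" has only two people.
pay <- data.frame(company = c("A", "A", "A", "B", "B"),
                  person  = 1:5,
                  salary  = c(10, 30, 20, 5, 7))

top_by(pay, "company", "salary", 2)  # two best paid per company
top_by(pay, "company", "salary", 5)  # fewer than 5 per company: all rows
```

Using head(ord, howmany) rather than ord[1:howmany] means a group smaller than howmany is returned whole instead of being padded with NA rows.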
On Thu, Sep 16, 2010 at 11:39 AM, Tan, Richard <r...@panagora.com> wrote:
Hi, is there an R function like SQL's TOP keyword?
I have a dataframe that has 3 columns: company, person, salary
How do I get the 5 highest-paid people for each company, and if I have
fewer than 5 people for a company, just return all of them?
Thanks,
Richard
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.