Dear all,
Thanks to Jim and Mark for suggesting that I include reproducible
code. Please note that the enclosed file needs to go into the
home folder, or else the path for reading the CSV file must be changed. I
hope no encoding issues emerge when reading it.
And the code:
library(Hmisc)  # need the cut2 function to mark the quantile a given line belongs to
a <- read.csv(file = "~/example.csv", colClasses = c("Date", "numeric"))  # beware of the path
dim(a)  # should give "[1] 5076 2"
# should give the sum by year, but on the quantiles for the whole population
aggregate(a$value, list(Date = a[, "Date"], Quantile = cut2(a$value, g = 10)), sum)
# gives the error mentioned below
aggregate(a$value, list(Date = a[, "Date"], Quantile = tapply(a$value, a$Date, cut2, g = 10)), sum)
Once again, many thanks for any help
Ivan
On 21 Oct 2008, at 02:40, jim holtman wrote:
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
You need to at least post a subset of your data so that we can
understand the data structures that you are using. 'dput' will create
an easily readable format for posting your data (much easier than if
you post the listing of a table). Usually the cause is some 'type mismatch',
which means we really need the data to run the script against.
On Mon, Oct 20, 2008 at 6:38 PM, Ivan Alves <[EMAIL PROTECTED]> wrote:
Dear all,
I would like to aggregate a data frame (consisting of 2 columns - one
for the bins, say factors, and one for the values) along bins and
quantiles within the bins.
I have tried
aggregate(data.frame$values, list(bin = data.frame$bin, Quantile = cut2(data.frame$values, g = 10)), sum)
but then the quantiles apply to the population as a whole and not the
individual bins. Upon this realisation I have tried
aggregate(data.frame$values, list(bin = data.frame$bin, Quantile = tapply(data.frame$values, data.frame$bin, cut2, g = 10)), sum)
which gives the following error:
Error in sort.list(unique.default(x), na.last = TRUE) :
'x' must be atomic for 'sort.list'
Have you called 'sort' on a list?
Clearly I am doing something wrong, but I cannot figure out what. I
believe the error stems from either (a) the output of tapply being a
list whose length equals the number of bins, rather than a vector of
the same length as the values, or (b) aggregate somehow not
liking that the second list (the quantiles within the bins) is not
sorted nicely.
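A small illustration of point (a), on made-up data rather than the real file. The helper halves() is a hypothetical stand-in for cut2(x, g = 2), written with base R's cut() and quantile() so the snippet runs without Hmisc; the toy data frame df and its numbers are assumptions for the example only.

```r
# Toy data: 3 bins with 4 values each (made-up numbers).
df <- data.frame(bin = rep(c("a", "b", "c"), each = 4),
                 values = c(1, 2, 3, 4, 10, 20, 30, 40, 5, 6, 7, 8))

# Hypothetical stand-in for cut2(x, g = 2): split x at its own median.
halves <- function(x) {
  cut(x, breaks = quantile(x, c(0, 0.5, 1)),
      include.lowest = TRUE, labels = c("low", "high"))
}

# FUN returns a whole factor per bin, so tapply cannot simplify the result.
per_bin <- tapply(df$values, df$bin, halves)

is.list(per_bin)  # the result is of mode list, which is not atomic
length(per_bin)   # one element per bin ...
nrow(df)          # ... not one element per row of the data
```

This suggests the grouping variable handed to aggregate has the wrong shape (one entry per bin instead of one per row), which would be consistent with the 'x' must be atomic complaint.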
1. Do you have a reference for doing the summation on both bins and
quantiles within the bins?
2. If not, can you give me some guidance as to what I am doing wrong
and how I can solve the sort/list issue?
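One possible way to get quantile groups within each bin is base R's ave(), which applies a function per group but returns a vector with one entry per row, so it can be passed to aggregate directly. This is only a sketch on made-up data: the helper quantile_group() is hypothetical and uses cut() with quantile() breaks in place of Hmisc's cut2, so the group boundaries may differ slightly from what cut2 would give.

```r
# Toy data: 2 bins with 4 values each (made-up numbers).
df <- data.frame(bin = rep(c("a", "b"), each = 4),
                 values = c(1, 2, 3, 4, 10, 20, 30, 40))

# Hypothetical helper: integer quantile-group index (1..g) of each element of x.
# unique() guards against repeated break points when x has many ties.
quantile_group <- function(x, g = 10) {
  breaks <- quantile(x, probs = seq(0, 1, length.out = g + 1))
  as.integer(cut(x, breaks = unique(breaks), include.lowest = TRUE))
}

# ave() computes the groups within each bin and returns one value per row.
df$q <- ave(df$values, df$bin, FUN = function(x) quantile_group(x, g = 2))

# Sum of values by bin and by quantile group within the bin.
res <- aggregate(df$values, list(bin = df$bin, Quantile = df$q), sum)
res
```

With g = 10 instead of g = 2, the same call would give deciles within each bin, provided each bin has enough distinct values.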
Any help would be greatly appreciated
Kind regards,
Ivan Alves
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem that you are trying to solve?