You could also try:
library(plyr)
newdf <- function(.data, ...) {
  eval(substitute(data.frame(...)), .data, parent.frame())
}
x1 <- ddply(mtcars, .(cyl, gear), newdf, mgp = t(quantile(mpg)), hp = t(quantile(hp)))
#(found in one of the google group discussions)
#or
library(data.table)
dt1 <- data.
Hi,
Try:
do.call(data.frame,c(x,check.names=FALSE))
A.K.
Hello,
I'm using the aggregate function in R 3.0.2. If I run the instruction
x<-aggregate(cbind(mpg,hp)~cyl+gear,data=mtcars,quantile) I get the
following data.frame as a result:
cyl
gear
mpg.0%
mpg.25%
mpg.50%
mpg.75%
Daniel,
You can see better what is going on if you look at
as.list(x)
There you can see that cyl and gear are vectors but mpg and hp are matrices.
You can rearrange them using the do.call() function
x2 <- do.call(cbind, x)
dim(x2)
Jean
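To make the point about matrix columns concrete, here is a minimal sketch using only base R and the built-in mtcars data. The object names x_flat and x_mat are just illustrative; the two do.call() forms are the ones suggested in this thread.

```r
# Five quantiles of mpg and hp for each cyl/gear combination.
x <- aggregate(cbind(mpg, hp) ~ cyl + gear, data = mtcars, FUN = quantile)

# Only four columns are reported: mpg and hp are each a 5-column matrix.
dim(x)
sapply(x, is.matrix)

# Flatten into an ordinary data frame, keeping names such as "mpg.0%".
x_flat <- do.call(data.frame, c(x, check.names = FALSE))
dim(x_flat)
names(x_flat)

# Or bind everything into a single numeric matrix.
x_mat <- do.call(cbind, x)
dim(x_mat)
```

The data.frame form keeps one column per quantile with a readable name; the cbind form returns a matrix, so it only makes sense when all columns are numeric, as they are here.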
On Fri, Nov 1, 2013 at 7:08 AM, Daniel Fernandes wrote:
Hello,
If you run the example in ?bag you can type
data(BloodBrain)
ctreeBag$aggregate
at an R prompt to see an example aggregate function. Note that it does
_not_ have the parentheses.
Hope this helps,
Rui Barradas
On 14-04-2013 11:31, Nicolás Sánchez wrote:
Good morning all.
I am doi
Hi,
Try this:
str(PriceList)
'data.frame': 161 obs. of 3 variables:
$ Price : num 0 8.18 8.27 10.42 10.5 ...
$ Size : int 664640 440 407 180 690 851 190 480 720 74 ...
$ bandNum:List of 161
PriceList1<-within(PriceList,{bandNum<-as.numeric(bandNum)})
str(PriceList1)
'data
On Oct 17, 2012, at 1:45 PM, jcrosbie wrote:
The aggregate function for some reason will not work for me.
The error I'm getting is:
"Error in sort.list(y) : 'x' must be atomic for 'sort.list'
Have you called 'sort' on a list?"
You have managed to create a slightly pathological dataframe:
>
If you read the error message carefully and look at the data you
included with dput() (for which I thank you!), you'll see that bandNum
is a list, not a vector, just as the error message told you.
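A small sketch of how to spot and repair such a column. The data below are made up for illustration (only the column names follow the str() output quoted in this thread), and the sketch assumes every list element holds exactly one number.

```r
# Hypothetical data: bandNum is stored as a list column instead of a vector.
PriceList <- data.frame(Price = c(0, 8.18, 8.27, 10.42),
                        Size  = c(664640L, 440L, 407L, 180L))
PriceList$bandNum <- list(1, 1, 2, 2)

# Check which columns are lists; bandNum shows up as TRUE.
sapply(PriceList, is.list)

# Convert the list column to a plain numeric vector.
PriceList$bandNum <- unlist(PriceList$bandNum)

# Now the grouping variable is atomic and aggregate() can use it.
res <- aggregate(Size ~ bandNum, data = PriceList, FUN = sum)
res
```

If some list elements are NULL or have more than one value, unlist() changes the length and the assignment fails, so it is worth checking lengths(PriceList$bandNum) first.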
I'm not sure how you created or imported your data frame, but
something appears to have not worked th
On Apr 26, 2012, at 9:29 AM, Neil Davis wrote:
Hi,
I have a data.frame which contains timeseries from several different
locations, which I want to compare against each other for example
calculating RMSE, or normalized mean bias of each location against
the others. An example of this is t
On Dec 21, 2011, at 18:22 , Mary Kindall wrote:
> Hi Jim
>
> Thanks for reply but this is not working. I think I am missing something
> over here.
Yes, the data.table() bit. It's not going to work with data frames.
>
> 1> x <- cbind(c(1,2,2,2,3,4), c('a','b', 'c','d','e','f'))
> 1> colnames(x
Hi Jim
Thanks for reply but this is not working. I think I am missing something
over here.
1> x <- cbind(c(1,2,2,2,3,4), c('a','b', 'c','d','e','f'))
1> colnames(x) = c('param', 'case1')
1> x = as.data.frame(x)
1> x
param case1
1 1 a
2 2 b
3 2 c
4 2 d
5 3
You were using the wrong syntax; it should be:
x[
, list(case1 = paste(case1, collapse = ','))
, by = param
]
Notice that you do not use the "x$" on the names within the data.table
statement.
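For reference, a self-contained sketch of that call on the example data from this thread, assuming the data.table package is installed. The base R aggregate() line is an addition for comparison and works on a plain data frame; the names df, dt_res and base_res are illustrative only.

```r
library(data.table)

df <- data.frame(param = c(1, 2, 2, 2, 3, 4),
                 case1 = c("a", "b", "c", "d", "e", "f"),
                 stringsAsFactors = FALSE)

# data.table: column names are used directly inside [ ], without "x$".
x <- data.table(df)
dt_res <- x[, list(case1 = paste(case1, collapse = ",")), by = param]
dt_res

# Base R alternative on the plain data frame; collapse is passed on to paste().
base_res <- aggregate(case1 ~ param, data = df, FUN = paste, collapse = ",")
base_res
```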
On Wed, Dec 21, 2011 at 12:22 PM, Mary Kindall wrote:
> Hi Jim
>
> Thanks for reply but t
On Dec 21, 2011, at 11:31 AM, jim holtman wrote:
Here is an example using 'data.table':
x <- read.table(text = "param case1
+ 1 a
+ 2 b
+ 2 c
+ 2 d
+ 3 e
+ 4 f", header = TRUE, as.is = TRUE)
And the a
Here is an example using 'data.table':
> x <- read.table(text = "param case1
+ 1 a
+ 2 b
+ 2 c
+ 2 d
+ 3 e
+ 4 f", header = TRUE, as.is = TRUE)
> require(data.table)
> x <- data.table(x)
> x[
+ , list( ca
On Oct 5, 2011, at 7:45 PM, Eva Powers wrote:
I have 2 dataframes. "mydata" contains numerical data. "mybys" contains
information on the "group" each row of the data is in. I wish to aggregate
each column in mydata using the corresponding column in mybys.
corresponding?
Please see th
Hi:
It's a little tricky to read in a data frame 'by hand' without making
NA a default missing value; you've got to trick it a bit. I'm doing
this inefficiently, but if you have the two 'real' data sets stored in
separate files, read.table() is the way to go since it provides an
option for definin
Hadley,
That's fine; please do. I'm happy to explain it offline where the
documentation or comments in the
code aren't sufficient. It's GPL code so you can take it and improve it, or
depend on it.
Whatever works for you. As long as (of course) you don't stand on its
shoulders and then
restric
> From: had...@rice.edu
> Date: Mon, 7 Feb 2011 11:00:59 -0600
> To: mdo...@mdowle.plus.com
> CC: r-h...@stat.math.ethz.ch
> Subject: Re: [R] aggregate function - na.action
>
> > Does FAQ 1.8 answer that ok ?
> >
> Does FAQ 1.8 answer that ok ?
> "Ok, I'm starting to see what data.table is about, but why didn't you
> enhance data.frame in R? Why does it have to be a new package?"
> http://datatable.r-forge.r-project.org/datatable-faq.pdf
Kind of. I think there are two sets of features data.table provi
Hi Hadley,
Does FAQ 1.8 answer that ok ?
"Ok, I'm starting to see what data.table is about, but why didn't you
enhance data.frame in R? Why does it have to be a new package?"
http://datatable.r-forge.r-project.org/datatable-faq.pdf
Matthew
"Hadley Wickham" wrote in message
news:AANLkT
On Mon, Feb 7, 2011 at 5:54 AM, Matthew Dowle wrote:
> Looking at the timings by each stage may help :
>
>> system.time(dt <- data.table(dat))
> user system elapsed
> 1.20 0.28 1.48
>> system.time(setkey(dt, x1, x2, x3, x4, x5, x6, x7, x8)) # sort by the
>> 8 columns (one-off)
>
Looking at the timings by each stage may help :
> system.time(dt <- data.table(dat))
user system elapsed
1.20 0.28 1.48
> system.time(setkey(dt, x1, x2, x3, x4, x5, x6, x7, x8)) # sort by the
> 8 columns (one-off)
user system elapsed
4.72 0.94 5.67
> system.time(
On Feb 6, 2011, at 7:41 PM, Hadley Wickham wrote:
There's definitely something amiss with aggregate() here since similar
functions from other packages can reproduce your 'control' sum. I expect
ddply() will have some timing issues because of all the subgrouping in your
data frame, but data
> There's definitely something amiss with aggregate() here since similar
> functions from other packages can reproduce your 'control' sum. I expect
> ddply() will have some timing issues because of all the subgrouping in your
> data frame, but data.table did very well and the summaryBy() function i
Hi:
There's definitely something amiss with aggregate() here since similar
functions from other packages can reproduce your 'control' sum. I expect
ddply() will have some timing issues because of all the subgrouping in your
data frame, but data.table did very well and the summaryBy() function in t
Try 'data.table' package. It took 3 seconds to aggregate the 500K
levels: Is this what you were after?
> # note the characters are converted to factors that 'data.table' likes
> dat=data.frame(
+ x1=sample(c(NA,'m','f'), 2e6, replace=TRUE),
+ x2=sample(c(NA, 1:10), 2e6, replace=TRU
By the way, thanks for sending that formula, it's quite thoughtful of you to
send an answer with an actual working line of code!
When I experimented with ddply earlier last week I couldn't figure out the
syntax for a single line aggregation, so it's good to have this example. I
will likely use it
Try to use formula notation and use na.action=na.pass
It is all described in the help(aggregate)
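A tiny sketch of what na.action changes in the formula interface. The data frame is made up for illustration and the result names (res_default, res_pass, res_pass_rm) are arbitrary. The default na.action is na.omit, which drops any row containing an NA before FUN is called.

```r
dat <- data.frame(g = c("a", "a", "b", NA),
                  y = c(1, NA, 3, 4),
                  stringsAsFactors = FALSE)

# Default (na.omit): rows 2 and 4 are removed before summing.
res_default <- aggregate(y ~ g, data = dat, FUN = sum)

# na.pass keeps the rows, so sum() sees the NA in group "a" and returns NA.
res_pass <- aggregate(y ~ g, data = dat, FUN = sum, na.action = na.pass)

# na.pass plus na.rm = TRUE (passed on to sum) handles the NA inside FUN.
res_pass_rm <- aggregate(y ~ g, data = dat, FUN = sum,
                         na.rm = TRUE, na.action = na.pass)

res_default
res_pass
res_pass_rm
```

In all three results the row whose grouping value is NA is missing: aggregate() omits rows with missing values in the grouping variables regardless of na.action. To keep them as a group of their own, one option is to recode the NA as an explicit level first, for example with addNA() on a factor.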
On Sun, 06/02/2011 at 14:54 -0600, Gene Leynes wrote:
> On Fri, Feb 4, 2011 at 6:54 PM, Ista Zahn wrote:
>
> > >
> > > However, I don't think you've told us what you're actually trying to
> > > accompl
On Fri, Feb 4, 2011 at 6:54 PM, Ista Zahn wrote:
> >
> > However, I don't think you've told us what you're actually trying to
> > accomplish...
> >
>
I'm trying to aggregate the y value of a big data set which has several x's
and a y.
I'm using an abstracted example for many reasons. Partially,
Ista,
Thank you again.
I had figured that out... and was crafting another message when you replied.
The NAs do come through on the variable that is being aggregated,
However, they do not come through on the categorical variable(s).
The aggregate function must be converting the data frame variabl
Just to be clear:
This works:
> set.seed(100)
> dat=data.frame(
+ x1=sample(c(NA,'m','f'), 100, replace=TRUE),
+ x2=sample(c(NA, 1:10), 100, replace=TRUE),
+ x3=sample(c(NA,letters[1:5]), 100, replace=TRUE),
+ x4=sample(c(NA,T,F), 100, replace=TRUE),
+ y=sam
oops. For clarity, that should have been
sum(ddply(dat, .(x1,x2,x3,x4), function(x){data.frame(y.sum=sum(x$y,
na.rm=TRUE))})$y.sum)
-Ista
On Fri, Feb 4, 2011 at 7:52 PM, Ista Zahn wrote:
> Hi again,
>
> On Fri, Feb 4, 2011 at 7:18 PM, Gene Leynes wrote:
>> Ista,
>>
>> Thank you again.
>>
>> I
Hi again,
On Fri, Feb 4, 2011 at 7:18 PM, Gene Leynes wrote:
> Ista,
>
> Thank you again.
>
> I had figured that out... and was crafting another message when you replied.
>
> The NAs do come through on the variable that is being aggregated,
> However, they do not come through on the categorical va
Hi,
On Fri, Feb 4, 2011 at 6:33 PM, Gene Leynes wrote:
> Thank you both for the thoughtful (and funny) replies.
>
> I agree with both of you that sum is the one picking up aggregate. Although
> I didn't mention it, I did realize that in the first place.
> Also, thank you Phil for pointing out th
Thank you both for the thoughtful (and funny) replies.
I agree with both of you that sum is the one picking up aggregate. Although
I didn't mention it, I did realize that in the first place.
Also, thank you Phil for pointing out that aggregate only accepts a formula
value in more recent versions!
Sorry, I didn't see Phil's reply, which is better than mine anyway.
-Ista
On Fri, Feb 4, 2011 at 5:16 PM, Ista Zahn wrote:
> Hi,
>
> Please see ?na.action
>
> (just kidding!)
>
> So it seems to me the problem is that you are passing na.rm to the sum
> function. So there is no missing data for th
Hi,
Please see ?na.action
(just kidding!)
So it seems to me the problem is that you are passing na.rm to the sum
function. So there is no missing data for the na.action argument to
operate on!
Compare
sum(aggregate(y~x1+x2+x3+x4, data=dat, sum, na.action=na.fail)$y)
sum(aggregate(y~x1+x2+x3+x4
Gene -
Let me try to address your concerns one at a time:
The formula interface to aggregate was introduced
pretty recently (I think R-2.11.1, but I might be wrong),
so when you try to use it in R-2.10.1 it won't work.
Now let's take a close look at the help page for aggregate.
The
Try this:
> aggregate(list(Max = df$value), df['id'], max)
id Max
1 11 2.610491
2 22 3.796836
3 33 6.562515
or if using value rather than Max is ok then just:
> aggregate(df['value'], df['id'], max)
id value
1 11 2.610491
2 22 3.796836
3 33 6.562515
On Thu, Feb 11, 2010 at 12:18 PM
Thanks a mil, will try that.
-Original Message-
From: Petr PIKAL [mailto:petr.pi...@precheza.cz]
Sent: 23 April 2009 12:18
To: Bronagh Grimes
Cc: r-help@r-project.org
Subject: Odp: [R] Aggregate Function
Try to set scipen in options.
?options
e.g.
options(scipen=12)
Regards
Petr
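A quick sketch of what scipen does, using format() on a single made-up value so the effect is easy to see. The value 12 is the one suggested above; the variable names are illustrative.

```r
# Remember the current setting and start from the default penalty of 0.
old <- options(scipen = 0)

sci <- format(1e10)    # scientific notation is shorter, so it is used
sci

# A positive scipen penalises scientific notation in favour of fixed.
options(scipen = 12)
fixed <- format(1e10)  # all digits are now written out
fixed

# Restore whatever was set before.
options(old)
```

The same setting affects how print() shows numeric columns, which is why it helps when aggregated sums come out in exponent form.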
r-hel
That may have something to do with that you have "empty" groups. In your
example, ALL Hour=0 have Y2=NA. The following example may illustrate the
point. The first 2 aggregate commands perform the function on data that
contain NAs. However, the NAs are not perfectly collinear with any level by
which
Everything was read in the same way, and str(junk1) confirms that they are
the same structure. This is very strange.
## original data:
> str(junk1)
'data.frame': 96 obs. of 3 variables:
$ Hour: int 0 3 5 0 3 5 0 3 5 0 ...
$ Drug: Factor w/ 2 levels "D","P": 2 2 2 1 1 1 2 2 2 1 ...
$ Aldo: