Hi
Besides the tapply and aggregate suggestions, split with an *apply function could be used:
sapply(with(df, split(z, y)), mean)
Cheers
Petr
> -----Original Message-----
> From: R-help On Behalf Of Luigi Marongiu
> Sent: Wednesday, November 17, 2021 2:21 PM
> To: r-help
> Subject: [R] vectorization of loops
Have a look at the base functions tapply and aggregate.
For example see:
- https://cran.r-project.org/doc/manuals/r-release/R-intro.html#The-function-tapply_0028_0029-and-ragged-arrays,
- https://online.stat.psu.edu/stat484/lesson/9/9.2,
- or ?tapply and ?aggregate.
Also your current code se
If I follow what you are trying to do, you want the mean of z for each value of
y.
tapply(df$z, df$y, mean)
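
As a minimal sketch of the suggestions in this thread (tapply, aggregate, and split plus sapply), the following rebuilds a data frame like the one in the question and computes the per-group means of z four ways. The seed and the names `by_loop`, `by_tapply`, `by_split`, and `by_aggr` are illustrative assumptions, not part of the original posts.

```r
set.seed(123)  # assumed seed, only for reproducibility
df <- data.frame(x = rep(1:3, each = 5),
                 y = rep(letters[1:5], 3),
                 z = rnorm(15),
                 stringsAsFactors = FALSE)

# Explicit loop over the groups, with a preallocated, named result
groups <- unique(df$y)
by_loop <- numeric(length(groups))
names(by_loop) <- groups
for (g in groups) {
  by_loop[g] <- mean(df$z[df$y == g])
}

by_tapply <- tapply(df$z, df$y, mean)                # named 1-d array
by_split  <- sapply(split(df$z, df$y), mean)         # named numeric vector
by_aggr   <- aggregate(z ~ y, data = df, FUN = mean) # data frame with columns y and z
```

All four hold the same five means; they differ only in the shape of the result, which is usually what decides between them.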
> On Nov 17, 2021, at 8:20 AM, Luigi Marongiu wrote:
>
> Hello,
> I have a dataframe with 3 variables. I want to loop through it to get
> the mean value of the variable `z`, as follows:
Hello,
I have a dataframe with 3 variables. I want to loop through it to get
the mean value of the variable `z`, as follows:
```
df = data.frame(x = c(rep(1,5), rep(2,5), rep(3,5)),
                y = rep(letters[1:5],3),
                z = rnorm(15),
                stringsAsFactors = FALSE)
m = vector()
for (i in unique(df$y)) {
  s = df[df$y
```
I think you answered your own question. For loops are not a boogeyman... poor
memory management is.
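
To illustrate the memory-management point, here is a small sketch comparing a loop that grows its result, a loop that writes into a preallocated vector, and the fully vectorized form. The size `n` and the squaring operation are arbitrary choices made for the example.

```r
n <- 1e4

# Growing the result one element at a time can force repeated reallocation
grown <- c()
for (i in seq_len(n)) {
  grown[i] <- i^2
}

# Allocating the full vector first keeps the loop but avoids the regrowth
prealloc <- numeric(n)
for (i in seq_len(n)) {
  prealloc[i] <- i^2
}

# When the operation is elementwise, no explicit loop is needed
vectorized <- seq_len(n)^2
```

All three produce the same vector; how much the versions differ in speed depends on the R version and on `n`, so it is worth timing them with system.time() on real data.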
Algorithms that are sensitive to evaluation sequence are often not very
re-usable, and certainly not parallelizable. If you have a specific algorithm
in mind, there may be some advice we can give
You are mistaken. apply() is *not* vectorized. It is a disguised loop.
For true vectorization at the C level, the answer must be no, as the
whole point is to treat the argument as a whole object and hide the
iterative details.
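
A small sketch of the distinction, using row sums as the example: apply() calls an R function once per row, whereas rowSums() hands the whole matrix to compiled code in one call. The matrix `m` and the seed are made up for illustration.

```r
set.seed(1)
m <- matrix(rnorm(28), 7, 4)

via_apply   <- apply(m, 1, sum)  # R-level loop: sum() is called once per row
via_rowSums <- rowSums(m)        # one call; the iteration happens in compiled code
```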
However, as you indicated, you can always manually randomize the
index
nBuyMat <- data.frame(matrix(rnorm(28), 7, 4))
nBuyMat
nBuy <- nrow(nBuyMat)
sample(1:nBuy, nBuy, replace=FALSE)
sample(1:nBuy)
sample(nBuy)
?sample
apply(nBuyMat[sample(1:nBuy,nBuy, replace=FALSE),], 1, function(x) sum(x))
apply(nBuyMat[sample(nBuy),], 1, function(x) sum(x))
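
If the results are needed back in the original row order after processing the rows in random order, the permutation can be inverted with order(). This is a sketch under the assumption that each row is processed independently; it uses a plain matrix as a stand-in for the data frame above, and `perm`, `shuffled`, and `restored` are illustrative names.

```r
set.seed(2)
nBuyMat <- matrix(rnorm(28), 7, 4)   # matrix stand-in for the data frame above
nBuy <- nrow(nBuyMat)

perm     <- sample(nBuy)             # random permutation of 1..nBuy
shuffled <- rowSums(nBuyMat[perm, ]) # element k belongs to original row perm[k]
restored <- shuffled[order(perm)]    # order(perm) is the inverse permutation
```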
The defaults for sa
Is there a way to use vectorization where the elements are evaluated in a
random order?
For instance, if the code is to be run on each row of a matrix with nBuy rows,
the following will do the job
for (b in sample(1:nBuy,nBuy, replace=FALSE)){
}
but
apply(nBuyMat, 1, function(x))
will be run
Please don't post in HTML... you may not recognize it, but the receiving
end does not necessarily (and in this case did not) look like the sending
end, and the cleanup can impede answers you are hoping to get.
In many cases, loops can be vectorized. However, near as I can tell this
is an exam
Great, many thanks for your help Jeff.
Apologies for the HTML format, I'll be more careful next time.
Arnaud
On 08/12/2014 08:25, Jeff Newmiller wrote:
Please don't post in HTML... you may not recognize it, but the
receiving end does not necessarily (and in this case did not) look
like the send
Hello
I use R to run a simple model of rainfall interception by vegetation:
rainfall falls on vegetation, some is retained by the vegetation (part of
which can evaporate), the rest falls on the ground (quite crude but very
similar to those used in SWAT or MikeSHE, for the hydrologists among you).
I
Oh wow, I guess I get it!
Thank you. It is pretty tricky but I saw that it works very fast.
On Wed, Jan 29, 2014 at 9:31 PM, Duncan Murdoch wrote:
> On 14-01-29 6:41 AM, Bill wrote:
>
>> Hi. I saw this example and I cannot begin to figure out how it works. Can
>> anyone give me an idea on this?
On 14-01-29 6:41 AM, Bill wrote:
Hi. I saw this example and I cannot begin to figure out how it works. Can
anyone give me an idea on this?
n = 9e6
df = data.frame(values = rnorm(n),
ID = rep(LETTERS[1:3], each = n/3),
stringsAsFactors = FALSE)
head(df)
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> project.org] On Behalf Of Bill
> Sent: Wednesday, January 29, 2014 12:41 PM
> To: r-help@r-project.org
> Subject: [R] vectorization
>
> Hi. I saw this example and I cannot begin to figure out how it works.
> Can anyone
Hi. I saw this example and I cannot begin to figure out how it works. Can
anyone give me an idea on this?
n = 9e6
df = data.frame(values = rnorm(n),
ID = rep(LETTERS[1:3], each = n/3),
stringsAsFactors = FALSE)
> head(df)
values ID
1 -0.7355823 A
2 -0.4729925
On Dec 27, 2012, at 12:38 PM, Sam Steingold wrote:
I have the following code:
--8<---cut here---start->8---
d <- rep(10,10)
for (i in 1:100) {
a <- sample.int(length(d), size = 2)
if (d[a[1]] >= 1) {
d[a[1]] <- d[a[1]] - 1
d[a[2]] <- d[a[2]] + 1
}
}
In current versions of R the apply functions do not gain much (if any) in
speed over a well written for loop (the for loops are much more efficient
than they used to be).
Using global variables could actually slow things down a little for what
you are doing, if you use `<<-` then it has to search
You can use environments. Have a look at this discussion.
http://stackoverflow.com/questions/7439110/what-is-the-difference-between-parent-frame-and-parent-env-in-r-how-do-they
On 27 December 2012 21:38, Sam Steingold wrote:
> I have the following code:
>
> --8<---cut here--
At Thu, 27 Dec 2012 15:38:08 -0500,
Sam Steingold wrote:
> so,
> 1. is there a way for a function to modify a global variable?
Use <<- instead of <-.
> 2. how would you vectorize this loop?
This is hard. Your function has a feedback loop: an iteration depends
on the previous iteration's result.
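
One way to sidestep the global-variable question is to pass `d` in and return the updated copy instead of assigning with `<<-`. The sketch below wraps the loop from the question in a function; the name `redistribute`, its arguments, and the seed are illustrative assumptions. The loop itself stays sequential, because each step depends on the state left by the previous one.

```r
# Hypothetical helper: move one unit from a random cell to another, n_steps times
redistribute <- function(d, n_steps) {
  for (i in seq_len(n_steps)) {
    a <- sample.int(length(d), size = 2)
    if (d[a[1]] >= 1) {          # only move a unit if the source cell has one
      d[a[1]] <- d[a[1]] - 1
      d[a[2]] <- d[a[2]] + 1
    }
  }
  d                              # return the modified copy to the caller
}

set.seed(1)
d <- redistribute(rep(10, 10), 100)
```

Because every move takes one unit from one cell and adds it to another, the total of `d` is unchanged by the loop.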
I have the following code:
--8<---cut here---start->8---
d <- rep(10,10)
for (i in 1:100) {
a <- sample.int(length(d), size = 2)
if (d[a[1]] >= 1) {
d[a[1]] <- d[a[1]] - 1
d[a[2]] <- d[a[2]] + 1
}
}
--8<---cut here---end
From: William Dunlap
To: Guillaume2883 ; "r-help@r-project.org"
Cc:
Sent: Friday, August 10, 2012 8:02 PM
Subject: Re: [R] vectorization condition counting
Your sum(tag_id==tag_id[i])==1, meaning tag_id[i] is the only entry with its
value, may be vectorized by the sneaky idiom
!(duplicate
> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
> Behalf
> Of Guillaume2883
> Sent: Friday, August 10, 2012 3:47 PM
> To: r-help@r-project.org
> Subject: [R] vectorization condition counting
>
> Hi all,
>
> I am wo
Hi all,
I am working on a really big dataset and I would like to vectorize a
condition in an if statement inside a loop to improve speed.
The original loop with the condition is currently written as follows:
if(sum(as.integer(tags$tag_id==tags$tag_id[i]))==1&tags$lgth[i]<300){
tags$stage[i]<-"J"
}
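
A common way to test "this value occurs exactly once" without a loop is to combine duplicated() scanned from both ends of the vector. The sketch below applies that to made-up data shaped like the `tags` data frame in the question; the values are invented for illustration, and this is not necessarily the exact idiom proposed in the thread.

```r
tags <- data.frame(tag_id = c(1, 2, 2, 3, 4, 4, 5),
                   lgth   = c(100, 250, 400, 350, 120, 90, 299),
                   stage  = NA_character_,
                   stringsAsFactors = FALSE)

# TRUE where tag_id appears exactly once in the whole column
is_unique <- !(duplicated(tags$tag_id) | duplicated(tags$tag_id, fromLast = TRUE))

# Elementwise & combines the two conditions for all rows at once
tags$stage[is_unique & tags$lgth < 300] <- "J"
```

Here rows 1 and 7 receive "J": tag_id 3 is also unique, but its length is not below 300.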
On Jul 2, 2012, at 5:16 PM, dlv04c wrote:
The code is in the original post, but here it is again:
No code here or in original posting to rhelp. You are under the
delusion that Nabble is R-help. It is not.
--
View this message in context:
http://r.789695.n4.nabble.com/vectorization-with
The code is in the original post, but here it is again:
Thanks,
Dan
--
View this message in context:
http://r.789695.n4.nabble.com/vectorization-with-subset-tp4635156p4635208.html
Sent from the R help mailing list archive at Nabble.com.
__
R-help@
On Jul 2, 2012, at 12:15 PM, dlv04c wrote:
Hello,
I have a data frame (68,000 rows) of scores (V4) for a series of [genomic]
coordinate ranges (V2 to V3).
I also have a data frame (1.2 million rows) of single [genomic] coordinates.
For each genomic coordinate (in coord), I would l
Hello,
I have a data frame (68,000 rows) of scores (V4) for a series of [genomic]
coordinate ranges (V2 to V3).
I also have a data frame (1.2 million rows) of single [genomic] coordinates.
For each genomic coordinate (in coord), I would like to determine the
average of all scores whose ge
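
The question is cut off here, so the following is only a sketch under the assumption that the goal is, for each coordinate, the mean of the scores (V4) of all ranges [V2, V3] that contain it. The data frame `ranges`, the vector `coord`, and their values are made up for illustration.

```r
ranges <- data.frame(V2 = c(1, 5, 8),    # range start
                     V3 = c(6, 10, 12),  # range end
                     V4 = c(2, 4, 6))    # score
coord <- c(3, 5.5, 9, 20)

# For each coordinate, one vectorized comparison against all ranges
avg <- vapply(coord, function(p) {
  hit <- ranges$V2 <= p & p <= ranges$V3
  if (any(hit)) mean(ranges$V4[hit]) else NA_real_  # NA when no range contains p
}, numeric(1))
```

This still loops over the coordinates, but the comparison against every range is vectorized inside each step, which avoids building a coordinates-by-ranges matrix in memory.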
Inline below
On Sun, Dec 4, 2011 at 10:29 AM, Costas Vorlow wrote:
> Dear Bert,
>
> You are right (obviously).
>
> Apologies for any inconvenience caused. I thought my problem was simplistic
> with a very obvious answer which eluded me.
>
> As per your justified questions :
>
> 2: Answer is "all
Dear Bert,
You are right (obviously).
Apologies for any inconvenience caused. I thought my problem was
simplistic with a very obvious answer which eluded me.
As per your justified questions :
2: Answer is "all",
hence:
3. would include overlapping sets (I guess) but this does not matter fo
Costas: (and thanks for giving us your name)
which(x == 1)
gives you the indices where x is 1 (up to floating point equality --
you did not specify whether your x values are integers or calculated
as floating point, and that certainly makes a difference). You can
then use simple indexing to get t
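
A minimal sketch of the which() and indexing step, using the eight example pairs from the question. The names `idx` and `y_at_1` are illustrative, and what should be done with the selected values afterwards is not shown because the question is truncated.

```r
x <- c(1, 2, 0, 0, 1, 2, 1, 0)
y <- c(30, -40, 50, 25, -5, -10, 5, 40)

idx    <- which(x == 1)  # positions where x is 1
y_at_1 <- y[idx]         # the corresponding y values
```

If x were computed as floating point rather than stored as integers, a tolerance-based comparison such as `abs(x - 1) < 1e-8` would be safer than `==`.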
On Sun, Dec 4, 2011 at 10:18 AM, Costas Vorlow wrote:
> Hello,
>
> I am having problems vectorizing the following (instead of using a
> for/next/while loop):
>
> I have 2 sequences such as:
>
> x, y
> 1, 30
> 2, -40
> 0, 50
> 0, 25
> 1, -5
> 2, -10
> 1, 5
> 0, 40
>
> etc etc
>
> The first sequence (x) ta
Thanks Uwe.
What happens if these are zoo (or time series) sequences/dataframes?
I think your solution would apply as well, no?
Thanks again & best wishes,
Costas
2011/12/4 Uwe Ligges
>
>
> On 04.12.2011 16:18, Costas Vorlow wrote:
>
>> Hello,
>>
>> I am having problems vectorizing the follo
On 04.12.2011 16:18, Costas Vorlow wrote:
Hello,
I am having problems vectorizing the following (instead of using a
for/next/while loop):
I have 2 sequences such as:
x, y
1, 30
2, -40
0, 50
0, 25
1, -5
2, -10
1, 5
0, 40
etc etc
The first sequence (x) takes integer numbers only: 0, 1, 2
The sequen
Hello,
I am having problems vectorizing the following (instead of using a
for/next/while loop):
I have 2 sequences such as:
x, y
1, 30
2, -40
0, 50
0, 25
1, -5
2, -10
1, 5
0, 40
etc etc
The first sequence (x) takes integer numbers only: 0, 1, 2
The sequence y can be anything...
I want to be able to
On Sun, Jan 23, 2011 at 07:29:16PM -0800, eric wrote:
>
> Is there a way to vectorize this loop or a smarter way to do it ?
>
> y
> [1] 0.003990746 -0.037664639 0.005397999 0.010415496 0.003500676
> [6] 0.001691775 0.008170774 0.011961998 -0.016879531 0.007284486
> [11] -0.015083581 -0.
Is there a way to vectorize this loop or a smarter way to do it ?
y
[1] 0.003990746 -0.037664639 0.005397999 0.010415496 0.003500676
[6] 0.001691775 0.008170774 0.011961998 -0.016879531 0.007284486
[11] -0.015083581 -0.006645958 -0.013153103 0.028148639 -0.005724317
[16] -0.027408025
On Fri, Jan 21, 2011 at 10:10 PM, Mingo wrote:
> Hello, I am new to R (coming from Perl) and have what is, at least at this
> point, a philosophical question and a request for comment on some basic
> code. As I understand it - R emphasizes, or at least supports, the
> functional programming model.
Hello, I am new to R (coming from Perl) and have what is, at least at this
point, a philosophical question and a request for comment on some basic
code. As I understand it - R emphasizes, or at least supports, the
functional programming model. I've come across some code that was markedly
absent in
Dear Carlos,
thanks for your support. Patrick Burns gave me a hint, which is in the
end very similar to your proposal. Now the script is roughly 25 times
faster.
Here is the code (I also implemented a preallocated vector that does not
grow in size, 'summ.dist<-rep(0,val.x.c.n)'):
KEN.STO<-function(val.n
Dear Patrick,
thanks for the very helpful response. I can now calculate 25 times
faster.
I use the 'k' from the outer-most loop only indirectly. It gives a
maximal number of repetitions of the whole script until following
command applies
'if(length(val.x.c)>=val.x.c.n)break'.
The reason w
Hello,
I believe that your bottleneck lies at this piece of code:
sum<-c();
for(j in 1:length(val)){
sum[j]<-euc[rownames(start.b)[i],val[j]]
}
In order to speed up your code, there are two alternatives:
1) Try to reorder the euc matrix so that the sum vector corresponds to
(part of) a
You are definitely in Circle 2 of the R Inferno.
Growing objects is suboptimal, although your
objects are small so this probably isn't taking
too much time.
There is no need for the inner-most loop:
sum.dist[i] <- min(euc[rownames(start.b)[i],val] )
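
To show why the inner loop is unnecessary, the sketch below builds a small stand-in for the `euc` distance matrix and fills `sum.dist` with one indexing call per row. The contents of `euc`, `start.b`, and `val` here are hypothetical; only the indexing pattern is taken from the thread.

```r
set.seed(7)
ids <- letters[1:5]
euc <- matrix(runif(25), 5, 5, dimnames = list(ids, ids))  # stand-in distance matrix
start.b <- euc[1:2, , drop = FALSE]  # hypothetical: only its row names are used
val <- c("c", "d", "e")              # hypothetical candidate column names

sum.dist <- numeric(nrow(start.b))   # preallocated instead of grown
for (i in seq_len(nrow(start.b))) {
  # a character vector as the column index pulls all needed entries at once
  sum.dist[i] <- min(euc[rownames(start.b)[i], val])
}
```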
Maybe I'm blind, but I don't see where 'k' c
Dear R-programmer,
I wrote an adapted implementation of the Kennard-Stone algorithm for
sample selection of multivariate data (R 2.7.1 under MacBook Pro,
Processor 2.2 GHz Intel Core 2 Duo, Memory 2 GB 667 MHZ DDR2 SDRAM).
I used for the heart of the script three embedded loops. This makes it
Frank said:
> > This piece of code works, but it is very slow. We were wondering if it's at
> > all possible to somehow vectorize this function. Any help would be greatly
> > appreciated.
Richie said:
> You can save a substantial time by calling as.matrix before the loop
Patrick said:
> On
> I've sent this question 2 days ago and got response from Sarah. Thanks for
> that. But unfortunately, it did not really solve our problem. The main issue
> is that we want to use our own (manipulated) covariance matrix in the
> calculation of the mahalanobis distance. Does anyone know how to v
One thing that would speed it up is if you
inverted 'covmat' once and then used
'inverted=TRUE' in the call to 'mahalanobis'.
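
A sketch of that suggestion, assuming the goal is the full n x n matrix of pairwise distances under a user-supplied covariance matrix. The data `x` and the matrix `covmat` are made up for illustration; note that stats::mahalanobis() returns squared distances, so the square root is taken at the end.

```r
set.seed(1)
x <- matrix(rnorm(15), 5, 3)  # 5 people, 3 variables (made-up data)
covmat <- diag(3) * 2         # hypothetical custom covariance matrix
icov <- solve(covmat)         # invert once, outside the loop

# Column i holds the squared distances from every row to row i
d2 <- sapply(seq_len(nrow(x)), function(i) {
  mahalanobis(x, center = x[i, ], cov = icov, inverted = TRUE)
})
d <- sqrt(d2)                 # n x n matrix with zeros on the diagonal
```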
Patrick Burns
[EMAIL PROTECTED]
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and "A Guide for the Unwilling S User")
Frank Hedler wrote:
Dear all,
I'v
Dear all,
I sent this question 2 days ago and got a response from Sarah. Thanks for
that. But unfortunately, it did not really solve our problem. The main issue
is that we want to use our own (manipulated) covariance matrix in the
calculation of the mahalanobis distance. Does anyone know how to ve
Dear all, we just realized something. Sarah's distance function - indeed -
calculates mahalanobis distance very well. However, it uses the
observed variance-covariance matrix by default.
What we actually need (sorry for not stating it clearly in to be able to
specify which variance-covariance matrix
Hi Frank,
If the way distance() calculates the Mahalanobis distance meets your
needs other than the covariance specification, you can tweak that
_very_ easily. If you use fix(distance) at the command line, you can
edit the source.
change the first line to:
function (x, method = "euclidean", icov)
distance() from the ecodist package will calculate Mahalanobis distances.
Sarah
--
Sarah Goslee
http://www.functionaldiversity.org
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http:
Dear all,
We have a data frame x with n people as rows and k variables as columns.
Now, for each person (i.e., each row) we want to calculate a distance
between him/her and EACH other person in x. In other words, we want to
create a n x n matrix with distances (with zeros in the diagonal).
Howeve
pital-initial.capital)^2 +
(1-P)*(-initial.capital)^2, we can compute E(D).
Regards,
Moshe.
--- On Fri, 15/8/08, jose romero <[EMAIL PROTECTED]> wrote:
> From: jose romero <[EMAIL PROTECTED]>
> Subject: [R] Vectorization of duration of the game in the gambler ruin's
> pro
Hey fellas:
In the context of the gambler's ruin problem, the following R code obtains the
mean duration of the game, in turns:
# total.capital is a constant, an arbitrary positive integer
# initial.capital is a constant, an arbitrary positive integer between, and not
including
# 0 and total.ca
"Sergey Goriatchev" <[EMAIL PROTECTED]> wrote in
news:[EMAIL PROTECTED]:
> I have the code for the bivariate Gaussian copula. It is written
> with for-loops, it works, but I wonder if there is a way to
> vectorize the function.
> I don't see how outer() can be used in this case, but maybe one can
I have the code for the bivariate Gaussian copula. It is written with
for-loops, it works, but I wonder if there is a way to vectorize the
function.
I don't see how outer() can be used in this case, but maybe one can
use mapply() or Vectorize() in some way? Could anyone help me, please?
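
The poster's own code is not shown here, so the following is an independent sketch of the standard bivariate Gaussian copula density written with vectorized arithmetic only, which lets outer() evaluate it on a grid without mapply() or Vectorize(). The function name `gauss_copula_density` and the grid values are assumptions made for the example.

```r
# Bivariate Gaussian copula density; u, v in (0, 1), rho in (-1, 1)
gauss_copula_density <- function(u, v, rho) {
  a <- qnorm(u)
  b <- qnorm(v)
  exp((2 * rho * a * b - rho^2 * (a^2 + b^2)) / (2 * (1 - rho^2))) /
    sqrt(1 - rho^2)
}

u <- c(0.1, 0.3, 0.5, 0.7, 0.9)
v <- c(0.2, 0.4, 0.6, 0.8)

# outer() passes two equal-length vectors to the function, so it must be
# vectorized in u and v; rho is forwarded as a fixed extra argument
dens <- outer(u, v, gauss_copula_density, rho = 0.5)
```

Since every operation inside the function is elementwise, the same function also works directly on two vectors of equal length.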
## Density
Let x be the input vector and cx be the cumulative running sum of it.
Then seq_along(cx) - match(cx, cx) gives increasing sequences
starting at 0 and for those after the leading zeros we start them
at 1 by adding cummax(x).
x <- c(0, 1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 0) # input
cx <- cumsum(x)
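
Putting the description above into code, and checking it against an explicit loop: the loop is a reconstruction of the stated objective (leading zeros stay 0, every 1 in x gives 1, and a later 0 increments the previous y by one), and the names `y_vec` and `y_loop` are illustrative.

```r
x  <- c(0, 1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 0)  # input
cx <- cumsum(x)

# Position within each run of equal cx, shifted to start at 1 after the first 1
y_vec <- seq_along(cx) - match(cx, cx) + cummax(x)

# Explicit loop for comparison
y_loop <- numeric(length(x))
y_loop[1] <- x[1]
for (i in seq_along(x)[-1]) {
  if (x[i] == 1) {
    y_loop[i] <- 1                 # a 1 in x restarts the count
  } else if (y_loop[i - 1] == 0) {
    y_loop[i] <- 0                 # still in the leading zeros
  } else {
    y_loop[i] <- y_loop[i - 1] + 1 # a later 0 increments the count
  }
}
```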
Hi,
I cannot find a 'vectorized' solution to this 'for loop' kind of problem.
Do you see a vectorized, fast-running solution?
Objective:
Take the value of X at each timepoint and calculate the corresponding value
of Y. Leading 0's and all 1's for X are assigned to Y; otherwise Y is
incremented b