>> > install.packages("profr")
>> Warning message:
>> package 'profr' is not available
>
> I selected a different mirror in place of the Iowa one and it
> worked. Odd, I just assumed all the same packages are available
> on all mirrors.
The Iowa mirror is rather out of date as the guy who was loo...
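A minimal sketch of that workaround, for anyone else who lands here with a
stale mirror (the repository URL below is just the main CRAN site, used as
an example):

## install from an explicitly chosen repository rather than the
## default (possibly stale) mirror
install.packages("profr", repos = "http://cran.r-project.org")

## or re-pick the mirror interactively for the whole session
chooseCRANmirror()
install.packages("profr")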
On Fri, 6 Jun 2008, Daniel Folkinshteyn wrote:
install.packages("profr")
library(profr)
p <- profr(fcn_create_nonissuing_match_by_quarterssinceissue(...))
plot(p)
That should at least help you see where the slow bits are.
Hadley
Esmail Bonakdarian wrote:

Hi,

I tried this suggestion as I am curious about bottlenecks in my own
R code ...

> install.packages("profr")
Warning message:
package 'profr' is not available

hadley wickham wrote:
> Why not try profiling? The profr package provides an alternative
> display that I find more helpful than the default tools:
>
> install.packages("profr")
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Daniel
Folkinshteyn
Sent: Friday, June 06, 2008 4:35 PM
To: hadley wickham
Cc: r-help@r-project.org; Patrick Burns
Subject: Re: [R] Improving data processing efficiency

so profiling reveals that '[.data.frame' and '[[.data.frame' and '[' are
the biggest timesuckers...

i suppose ...
on 06/06/2008 06:55 PM hadley wickham said the following:
> Why not try profiling? The profr package provides an alternative
> display that I find more helpful than the default tools:
>
> install.packages("profr")
> library(profr)
> p <- profr(fcn_create_nonissuing_match_by_quarterssinceissue(...))
> plot(p)
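If '[.data.frame' dominates a profile like this, one generic workaround (a
sketch, not code from this thread) is to pull the needed columns out of the
data frame once and index plain vectors inside the loop, since every
df[i, ] call dispatches to the comparatively slow '[.data.frame' method:

df <- data.frame(id = 1:100000, x = rnorm(100000))

## slow: method dispatch to '[.data.frame' on every iteration
total <- 0
for (i in 1:nrow(df)) total <- total + df[i, "x"]

## faster: hoist the column out once, then index an atomic vector
x <- df$x
total <- 0
for (i in seq_along(x)) total <- total + x[i]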
thanks for the suggestions! I'll play with this over the weekend and see
what comes out. :)
on 06/06/2008 06:48 PM Don MacQueen said the following:
> In a case like this, if you can possibly work with matrices instead of
> data frames, you might get significant speedup.
> (More accurately, I have had situations where I obtained speed up by
> working with matrices instead of dataframes.)
In a case like this, if you can possibly work with matrices instead
of data frames, you might get significant speedup.
(More accurately, I have had situations where I obtained speed up by
working with matrices instead of dataframes.)
Even if you have to code character columns as numeric, it can ...
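Something like the following illustrates the kind of difference Don
describes (sizes are invented; timings vary by machine):

m  <- matrix(rnorm(1e6), ncol = 10)
df <- as.data.frame(m)

system.time(for (i in 1:10000) m[i, ])   # matrix row extraction
system.time(for (i in 1:10000) df[i, ])  # data frame row extraction: far slower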
Hmm... ok... so i ran the code twice - once with a preallocated result,
assigning rows to it, and once with a nrow=0 result, rbinding rows to
it, for the first 20 quarters. There was no speedup. In fact, running
with a preallocated result matrix was slower than rbinding to the matrix:
for preallocated ...
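For reference, a self-contained sketch of the two strategies being compared
(sizes invented). Because rbind() copies the whole accumulated result on
every call, the preallocated loop normally pulls ahead as n grows, which
makes the result reported above surprising:

n <- 2000; k <- 10

## grow by rbind: copies the entire result each iteration
system.time({
  res <- matrix(nrow = 0, ncol = k)
  for (i in 1:n) res <- rbind(res, rnorm(k))
})

## preallocate once, assign into rows
system.time({
  res <- matrix(NA_real_, nrow = n, ncol = k)
  for (i in 1:n) res[i, ] <- rnorm(k)
})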
Cool, I do have an upper bound, so I'll try it and see how much of a
speed boost it gives me. Thanks for the suggestion!

on 06/06/2008 02:03 PM Patrick Burns said the following:
> That is going to be situation dependent, but if you
> have a reasonable upper bound, then that will be
> much easier and not far from optimal.
That is going to be situation dependent, but if you
have a reasonable upper bound, then that will be
much easier and not far from optimal.

If you pick the possibly too small route, then increasing
the size in largish chunks is much better than adding
a row at a time.

Pat

Daniel Folkinshteyn wrote: ...
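A sketch of the grow-in-chunks idea Pat describes (chunk size, column
count, and the row-producing loop are all placeholders):

chunk <- 1000
res <- matrix(NA_real_, nrow = chunk, ncol = 5)
n <- 0
for (j in 1:3500) {            # stands in for the real row-producing loop
  newrow <- rnorm(5)           # placeholder for the computed row
  n <- n + 1
  if (n > nrow(res))           # out of room: extend by a whole chunk
    res <- rbind(res, matrix(NA_real_, nrow = chunk, ncol = ncol(res)))
  res[n, ] <- newrow
}
res <- res[1:n, , drop = FALSE]  # drop the unused tail

This way the expensive rbind() happens once per thousand rows instead of
once per row.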
Ok, sorry about the zip, then. :) Thanks for taking the trouble to clue
me in as to the best posting procedure!
well, here's a dput-ed version of the small data subset you can use for
testing. below that, an updated version of the function, with extra
explanatory comments, and producing an ext...
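For readers unfamiliar with the dput() approach used here: it prints
plain-text R code that reconstructs an object, so a small sample can travel
in the body of a message (iris is just a stand-in object):

small <- head(iris, 3)
dput(small)
## prints: structure(list(Sepal.Length = c(5.1, 4.9, 4.7), ... )
## recipients paste that output into their session to recreate 'small'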
just in case, uploaded it to the server, you can get the zip file i
mentioned here:
http://astro.temple.edu/~dfolkins/helplistfiles.zip
on 06/06/2008 01:25 PM Daniel Folkinshteyn said the following:
> i thought since the function code (which i provided in full) was pretty
> short, it would be reasonably easy to just read the code and see what
> it's doing.
I think the posting guide may not be clear enough and have suggested that
it be clarified. Hopefully this better communicates what is required and why
in a shorter amount of space:
https://stat.ethz.ch/pipermail/r-devel/2008-June/049891.html
On Fri, Jun 6, 2008 at 1:25 PM, Daniel Folkinshteyn <[EMAIL PROTECTED]> wrote:
thanks for the tip! i'll try that and see how big of a difference that
makes... if i am not sure what exactly the size will be, am i better off
making it larger, and then later stripping off the blank rows, or making
it smaller, and appending the missing rows?
on 06/06/2008 11:44 AM Patrick Burns said the following: ...
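Of the two routes asked about here, the allocate-big-then-trim one might
look like this (the upper bound, row count, and row values are invented):

upper <- 5000                              # assumed upper bound on rows
res <- matrix(NA_real_, nrow = upper, ncol = 3)
used <- 0
for (i in 1:3200) {                        # however many rows actually occur
  used <- used + 1
  res[used, ] <- c(i, i / 2, i^2)          # placeholder row values
}
res <- res[seq_len(used), , drop = FALSE]  # strip the unused blank rows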
i thought since the function code (which i provided in full) was pretty
short, it would be reasonably easy to just read the code and see what
it's doing.
but ok, so... i am attaching a zip file with a small sample of the data
set (tab delimited) and the function code (posting ...
That is the last line of every message to r-help.

On Fri, Jun 6, 2008 at 12:05 PM, Gabor Grothendieck
<[EMAIL PROTECTED]> wrote:
> It's summarized in the last line to r-help. Note: reproducible and
> minimal.
>
> On Fri, Jun 6, 2008 at 12:03 PM, Daniel Folkinshteyn <[EMAIL PROTECTED]>
> wrote:
>> ...
It's summarized in the last line to r-help. Note: reproducible and
minimal.

On Fri, Jun 6, 2008 at 12:03 PM, Daniel Folkinshteyn <[EMAIL PROTECTED]> wrote:
> i did! what did i miss?
>
> on 06/06/2008 11:45 AM Gabor Grothendieck said the following:
>> Try reading the posting guide before posting.
i did! what did i miss?
on 06/06/2008 11:45 AM Gabor Grothendieck said the following:
> Try reading the posting guide before posting.
>
> On Fri, Jun 6, 2008 at 11:12 AM, Daniel Folkinshteyn <[EMAIL PROTECTED]> wrote:
>> Anybody have any thoughts on this? Please? :)
>>
>> on 06/05/2008 02:09 PM Daniel Folkinshteyn said the following: ...
Try reading the posting guide before posting.
On Fri, Jun 6, 2008 at 11:12 AM, Daniel Folkinshteyn <[EMAIL PROTECTED]> wrote:
> Anybody have any thoughts on this? Please? :)
>
> on 06/05/2008 02:09 PM Daniel Folkinshteyn said the following:
>>
>> Hi everyone!
>>
>> I have a question about data processing efficiency. ...
One thing that is likely to speed the code significantly
is if you create 'result' to be its final size and then
subscript into it. Something like:
result[i, ] <- bestpeer
(though I'm not sure if 'i' is the proper index).
Patrick Burns
[EMAIL PROTECTED]
+44 (0)20 8525 0696
http://www.burns-s...
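Pat's snippet, fleshed out into a runnable skeleton (the sizes and the
rnorm() stand-in for the real peer-matching step are invented):

n_iter <- 100
result <- matrix(NA_real_, nrow = n_iter, ncol = 4)  # final size up front
for (i in seq_len(n_iter)) {
  bestpeer <- rnorm(4)       # placeholder for the real matching computation
  result[i, ] <- bestpeer    # subscript into the preallocated result
}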
Anybody have any thoughts on this? Please? :)
on 06/05/2008 02:09 PM Daniel Folkinshteyn said the following:
> Hi everyone!
>
> I have a question about data processing efficiency.
>
> My data are as follows: I have a data set on quarterly institutional
> ownership of equities; some of them have had recent IPOs, some have not
> (I have a binary flag set). ...
Thanks, I'll take a look at Rprof... but I think what i'm missing is
facility with R idiom to get around the looping, and no amount of
profiling will help me with that :)
also, full working code is provided in my original post (see toward the
bottom).
on 06/05/2008 03:43 PM bartjoosen said the following: ...
Maybe you should provide minimal, working code with data, so that we all
can give it a try.

In the meantime: take a look at the Rprof function to see where your code
can be improved.

Good luck

Bart

Daniel Folkinshteyn-2 wrote:
>
> Hi everyone!
>
> I have a question about data processing efficiency. ...
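A minimal Rprof workflow, for anyone following along (the file name and the
profiled computation are placeholders):

Rprof("prof.out")                  # start the sampling profiler
x <- replicate(100, {              # stands in for the real computation
  d <- as.data.frame(matrix(rnorm(1e5), nrow = 1000))
  sum(d[, 1])
})
Rprof(NULL)                        # stop profiling
summaryRprof("prof.out")$by.self   # time spent in each function, sorted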
Hi everyone!
I have a question about data processing efficiency.
My data are as follows: I have a data set on quarterly institutional
ownership of equities; some of them have had recent IPOs, some have not
(I have a binary flag set). The total dataset size is 700k+ rows.
My goal is this: For ...