Have you tried running merged_cut_col$pickts through something that is
less complex? Perhaps:
table(merged_cut_col$pickts)
... to see if there are problems with the "inner" functions? Also I
think the as.numeric might be superfluous, since Dates are really just
numbers with some attitude, er, attributes.
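
A minimal sketch of that kind of step-by-step check, assuming a small made-up data frame (`demo` and its values are hypothetical; the real merged_cut_col is not shown in the thread). It runs the column through table() alone, then through as.Date() alone, and finally strips the class to show the day count underneath a Date.

```r
# Hypothetical stand-in for merged_cut_col; the real data is not in the thread.
demo <- data.frame(
  cpid   = c(101, 101, 102, 103, 103, 103),
  pickts = c("2009-09-01", "2009-09-03", "2009-09-02",
             "2009-09-01", "2009-09-01", "2009-09-05")
)

# Step 1: summarise the column on its own, with no grouping involved.
pick_counts <- table(demo$pickts)
print(pick_counts)

# Step 2: run the conversion on its own, before ave() gets involved.
pick_dates <- as.Date(demo$pickts)
str(pick_dates)

# A Date is a count of days since 1970-01-01 plus a class attribute.
day_numbers <- unclass(pick_dates)
print(day_numbers)
```

If either step is slow or fails on the full data, the problem lies in the input column rather than in ave().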
--
David.
On Sep 11, 2009, at 4:36 PM, Jason Baucom wrote:
My apologies for bringing up an old topic, but I'm still having some
problems!
I got this code to work, and it was running perfectly fine. I tried
it with a larger data set and it crashed my machine, slowly chewing
up memory until it could not allocate any more for the process. The
following line killed me:
merged_cut_col$pickseq<-
with(merged_cut_col,ave(as.numeric(as.Date(pickts)),cpid,FUN=seq))
So, I thought I'd try it another way, using the transformBy in the
doBy package:
merged_cut_col<-
transformBy(~cpid,data=merged_cut_col,pickseqREDO=seq(cpid))
This too ran for hours until eventually running out of memory. I've
tried it on a beefier machine and I run into the same problem.
Is there an alternative to these methods that would be less
memory/time intensive? This is a fairly simple routine I'm trying, just
generating sequence numbers based on simple criteria. I'm surprised
it's bringing my computer to its knees. I'm running about 1M rows
now, but doing other operations such as merges or adding new
columns/rows seems fine.
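
One thing worth checking, although the thread does not confirm it as the cause: seq() given a single number n returns 1:n, not a length-one sequence. A cpid group with only one row would then produce a vector with one element per day since 1970 (roughly 14,500 for a 2009 date), or one element per unit of the cpid value in the transformBy call. The sketch below uses seq_along() instead, on made-up data (`demo` and its values are hypothetical).

```r
# seq() on a length-one numeric treats the value as an upper bound.
one_row_group <- as.numeric(as.Date("2009-09-11"))
print(length(seq(one_row_group)))        # one element per day since 1970-01-01
print(length(seq_along(one_row_group)))  # always the length of the input

# Hypothetical data: cpid 102 has a single row.
demo <- data.frame(
  cpid   = c(101, 101, 102, 103, 103, 103),
  pickts = as.Date(c("2009-09-01", "2009-09-03", "2009-09-02",
                     "2009-09-01", "2009-09-04", "2009-09-05"))
)

# seq_along() numbers the rows within each cpid regardless of group size.
demo$pickseq <- with(demo, ave(as.numeric(pickts), cpid, FUN = seq_along))
print(demo)
```

Whether this accounts for the memory growth depends on how many single-row groups the real data contains, which the thread does not say.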
-----Original Message-----
From: David Winsemius [mailto:dwinsem...@comcast.net]
Sent: Thursday, August 27, 2009 12:48 PM
To: Jason Baucom
Cc: Henrique Dallazuanna; r-help@r-project.org; Steven Few
Subject: Re: [R] generating multiple sequences in subsets of data
On Aug 27, 2009, at 11:58 AM, Jason Baucom wrote:
I got this to work. Thanks for the insight! row7 is what I need.
checkLimit <-function(x) x<3
stuff$row6<-checkLimit(stuff$row1)
You don't actually need those intermediate steps:
stuff$row7 <- with(stuff, ave(row1, row2, row1 < 3, FUN = seq))
stuff
   row1 row2 row7
1     0    1    1
2     1    1    2
3     2    1    3
4     3    1    1
5     4    1    2
6     5    1    3
7     1    2    1
8     2    2    2
9     3    2    1
10    4    2    2
The expression row1 < 3 gets turned into a logical vector that ave()
is perfectly happy with.
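
A minimal sketch of why that works, using the sample data from the original post: ave() combines all of its grouping arguments into a single factor with interaction(), so a logical vector is just one more grouping variable. The names `below_cutoff`, `groups`, `via_groups` and `via_ave` are illustrative only, and seq_along() is used here because it is safe for groups of any size.

```r
row1 <- c(0, 1, 2, 3, 4, 5, 1, 2, 3, 4)
row2 <- c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2)

# The comparison yields a logical vector, one value per row.
below_cutoff <- row1 < 3
print(below_cutoff)

# Each distinct (row2, below_cutoff) pair forms one group.
groups <- interaction(row2, below_cutoff, drop = TRUE)
print(table(groups))

# Numbering within the combined factor matches passing both grouping
# vectors to ave() directly.
via_groups <- ave(row1, groups, FUN = seq_along)
via_ave    <- ave(row1, row2, row1 < 3, FUN = seq_along)
print(via_ave)
```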
--
David Winsemius
stuff$row7 <- with(stuff, ave(row1, row2, row6, FUN = seq_along))
stuff
   row1 row2 row3 row4 row5  row6 row7
1     0    1    1    1    1  TRUE    1
2     1    1    2    2    2  TRUE    2
3     2    1    3    3    3  TRUE    3
4     3    1    4    1    4 FALSE    1
5     4    1    5    1    5 FALSE    2
6     5    1    6    1    6 FALSE    3
7     1    2    1    1    1  TRUE    1
8     2    2    2    2    2  TRUE    2
9     3    2    3    1    3 FALSE    1
10    4    2    4    1    4 FALSE    2
Jason
________________________________
From: Henrique Dallazuanna [mailto:www...@gmail.com]
Sent: Thursday, August 27, 2009 11:02 AM
To: Jason Baucom
Cc: r-help@r-project.org; Steven Few
Subject: Re: [R] generating multiple sequences in subsets of data
Try this:
stuff$row3 <- with(stuff, ave(row1, row2, FUN = seq))
I don't understand the fourth column.
On Thu, Aug 27, 2009 at 11:55 AM, Jason Baucom
<jason.bau...@ateb.com> wrote:
I'm running into a problem I can't seem to find a solution for. I'm
attempting to add sequences into an existing data set based on subsets
of the data. I've done this using a for loop with a small subset of
data, but attempting the same process using real data (200k rows) is
taking way too long.
Here is some sample data and my ultimate goal
row1<-c(0,1,2,3,4,5,1,2,3,4)
row2<-c(1,1,1,1,1,1,2,2,2,2)
stuff<-data.frame(row1=row1,row2=row2)
stuff
   row1 row2
1     0    1
2     1    1
3     2    1
4     3    1
5     4    1
6     5    1
7     1    2
8     2    2
9     3    2
10    4    2
I need to derive 2 columns. I need a sequence for each unique row2,
and then I need a sequence that restarts based on a cutoff value for
row1 and unique row2. The following table is what it -should- look
like using a cutoff of 3 for row4:
   row1 row2 row3 row4
1     0    1    1    1
2     1    1    2    2
3     2    1    3    3
4     3    1    4    1
5     4    1    5    2
6     5    1    6    3
7     1    2    1    1
8     2    2    2    2
9     3    2    3    1
10    4    2    4    2
I need something like row3<-sequence(nrow(unique(stuff$row2))) that
actually works :-) Here is the for loop that functions properly for
row3:
stuff$row3 <- 1
for (i in 2:nrow(stuff)) {
  if (stuff$row2[i] == stuff$row2[i - 1]) {
    stuff$row3[i] <- stuff$row3[i - 1] + 1
  }
}
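
A loop-free sketch of the same idea, assuming (as the loop does) that rows belonging together are adjacent: rle() measures each run of identical consecutive values and sequence() counts 1..n within each run. The `run_lengths` and `run_key` names are illustrative only.

```r
row1 <- c(0, 1, 2, 3, 4, 5, 1, 2, 3, 4)
row2 <- c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2)
stuff <- data.frame(row1 = row1, row2 = row2)

# rle() finds runs of identical consecutive values in row2;
# sequence() then counts 1..n within each run.
run_lengths <- rle(stuff$row2)$lengths
stuff$row3 <- sequence(run_lengths)

# For row4, a new run also starts whenever row1 crosses the cutoff of 3,
# so the run key combines row2 with the result of the comparison.
run_key <- paste(stuff$row2, stuff$row1 < 3)
stuff$row4 <- sequence(rle(run_key)$lengths)
print(stuff)
```

Unlike a grouped approach, this restarts the count whenever the value changes from one row to the next, so it gives per-group numbering only when the data are already sorted by group.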
Thanks!
Jason Baucom
Ateb, Inc.
919.882.4992 O
919.872.1645 F
www.ateb.com <http://www.ateb.com/>
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
David Winsemius, MD
Heritage Laboratories
West Hartford, CT