At this point 3 functions have been suggested and I'll add a 4th:

   f1 <- function(x)unlist(lapply(unname(split(rep.int(1L,length(x)), x)), cumsum))
   f2 <- function(x)unlist(sapply(rle(x)$lengths, function(k) 1:k ))
   f3 <- function(x)ave(x,x,FUN=seq)
   f4 <- function(x)ave(seq_along(x), x, FUN=seq_along)

You can compare their results with ftest (as long as their results
have the same lengths):

   ftest <- function(x) {
      data.frame(x, f1=f1(x), f2=f2(x), f3=f3(x), f4=f4(x))
   }

They all return the same result for Steven's sample data, which is
numeric and in sorted order:

   x0 <- c(123.45, 123.45, 123.45, 123.45, 234.56, 234.56, 234.56,
           234.56, 234.56, 234.56, 234.56, 345.67, 345.67, 345.67,
           456.78, 456.78, 456.78, 456.78, 456.78, 456.78, 456.78,
           456.78, 456.78)

However, f1() gives the wrong answer if x is not sorted:

   > ftest(c(30,30,30, 20,20))
      x f1 f2 f3 f4
   1 30  1  1  1  1
   2 30  2  2  2  2
   3 30  1  3  3  3
   4 20  2  1  1  1
   5 20  3  2  2  2
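The f1 column above differs because split() returns its groups in sorted order of the values, and unlist() then concatenates the per-group counters in that order rather than in row order. Here is a minimal sketch of the effect; f1_unsplit is a hypothetical name for one possible repair using unsplit(), not a function from the thread:

```r
x <- c(30, 30, 30, 20, 20)

# split() orders its groups by the sorted unique values of x,
# so the group for 20 comes before the group for 30.
ones <- rep.int(1L, length(x))
groups <- split(ones, x)
names(groups)          # "20" "30"

# f1 concatenates the per-group counters in that (sorted) order ...
wrong <- unlist(lapply(unname(groups), cumsum))
wrong                  # 1 2 1 2 3

# ... whereas unsplit() puts each counter back at its original position.
# f1_unsplit is a hypothetical name, not one of the functions in the thread.
f1_unsplit <- function(x) {
  unsplit(lapply(split(rep.int(1L, length(x)), x), cumsum), x)
}
f1_unsplit(x)          # 1 2 3 1 2
```

This variant still goes through lapply(split()), so it only addresses the ordering, not memory use.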
f1() and f2() give the wrong answer if the groups are split up in the data:

   > ftest(c(10,10, 8,8,8, 10,10,10)) # 10's not contiguous
      x f1 f2 f3 f4
   1 10  1  1  1  1
   2 10  2  2  2  2
   3  8  3  1  1  1
   4  8  1  2  2  2
   5  8  2  3  3  3
   6 10  3  1  3  3
   7 10  4  2  4  4
   8 10  5  3  5  5

(It is not clear what result the OP wants here.)

f3() gives the wrong answer if x is not numeric:

   > f3(c("a","a","a", "b","b"))
   [1] "1" "2" "3" "1" "2"

f3() also gives an ominous warning if there is a singleton in x
(because seq(n) for a single number n returns 1:n, not 1):

   > f3(c(1,1,1, 11))
   [1] 1 2 3 1
   Warning message:
   In `split<-.default`(`*tmp*`, g, value = lapply(split(x, g), FUN)) :
     number of items to replace is not a multiple of replacement length

f2() fails to give an answer if x is a factor:

   > f2(factor(c("x","y","z")))
   Error in rle(x) : 'x' must be an atomic vector

I think f4 gives the correct result for all those cases.

I think f1, f3, and f4 all call lapply(split()) at some point (ave()
does so internally) and that can use a lot of memory when there are
lots of unique values in x. You can use a sort-based algorithm to
avoid that problem.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On Behalf Of arun
> Sent: Friday, October 11, 2013 6:43 AM
> To: Steven Ranney; r-help@r-project.org
> Subject: Re: [R] Create sequential vector for values in another column
>
>
>
> Also,
>
> it might be faster to use ?data.table()
> library(data.table)
> dt1<- data.table(dat1,key='id.name')
> dt1[,x:=seq(.N),by='id.name']
> A.K.
>
>
> On , arun <smartpink...@yahoo.com> wrote:
> Hi,
> Try:
> dat1<- structure(list(id.name = c(123.45, 123.45, 123.45, 123.45, 234.56,
> 234.56, 234.56, 234.56, 234.56, 234.56, 234.56, 345.67, 345.67,
> 345.67, 456.78, 456.78, 456.78, 456.78, 456.78, 456.78, 456.78,
> 456.78, 456.78)), .Names = "id.name", class = "data.frame", row.names = c(NA,
> -23L))
> dat1$x <- with(dat1,ave(id.name,id.name,FUN=seq))
> A.K.
>
>
>
> On Friday, October 11, 2013 9:28 AM, Steven Ranney <steven.ran...@gmail.com>
> wrote:
> Hello all -
>
> I have an example column in a dataFrame
>
> id.name
> 123.45
> 123.45
> 123.45
> 123.45
> 234.56
> 234.56
> 234.56
> 234.56
> 234.56
> 234.56
> 234.56
> 345.67
> 345.67
> 345.67
> 456.78
> 456.78
> 456.78
> 456.78
> 456.78
> 456.78
> 456.78
> 456.78
> 456.78
> ...
> [truncated]
>
> And I'd like to create a second vector of sequential values (i.e., 1:N) for
> each unique id.name value. In other words, I need
>
> id.name  x
> 123.45   1
> 123.45   2
> 123.45   3
> 123.45   4
> 234.56   1
> 234.56   2
> 234.56   3
> 234.56   4
> 234.56   5
> 234.56   6
> 234.56   7
> 345.67   1
> 345.67   2
> 345.67   3
> 456.78   1
> 456.78   2
> 456.78   3
> 456.78   4
> 456.78   5
> 456.78   6
> 456.78   7
> 456.78   8
> 456.78   9
>
> The number of unique id.name values is different; for some values, nrow()
> may be 42 and for others it may be 36, etc.
>
> The only way I could think of to do this is with two nested for loops. I
> tried it but because this data set is so large (nrow = 112,679 with 2,161
> unique values of id.name), it took several hours to run.
>
> Is there an easier way to create this vector? I'd appreciate your thoughts.
>
> Thanks -
>
> SR
> Steven H. Ranney
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
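The reply at the top of this thread mentions that a sort-based algorithm can avoid lapply(split()) but does not show one. The following is a minimal sketch of what such an approach might look like, not necessarily the algorithm the reply had in mind. The name f5 is hypothetical, and the sketch assumes x contains no NA values:

```r
# f5 is a hypothetical name; one possible sort-based within-group counter.
# Assumes x has no NA values.
f5 <- function(x) {
  n <- length(x)
  if (n == 0L) return(integer(0))
  ord <- order(x)                 # ties keep their original relative order
  xs <- x[ord]
  # TRUE at the first element of each run of equal values in the sorted data
  is_start <- c(TRUE, xs[-1L] != xs[-n])
  # index (in sorted order) at which the current run started
  run_start <- which(is_start)[cumsum(is_start)]
  counter_sorted <- seq_len(n) - run_start + 1L
  out <- integer(n)
  out[ord] <- counter_sorted      # put the counters back in original row order
  out
}

f5(c(10, 10, 8, 8, 8, 10, 10, 10))   # 1 2 1 2 3 3 4 5
f5(c("a", "a", "a", "b", "b"))       # 1 2 3 1 2
f5(c(1, 1, 1, 11))                   # 1 2 3 1
```

On these inputs the result agrees with f4, including the case where the 10's are not contiguous. It builds only a few vectors of length(x) instead of one list element per unique value.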