Re: [R] replace id by running number

Gabor Grothendieck Sat, 27 Mar 2010 10:05:58 -0700

Here is a speed comparison of the solutions posted so far, plus one
based on cumsum. cumsum was the fastest, about 26x faster than the
slowest solution (factor).


Speed may not matter much here, and readability may be the priority.
In that case match is a good compromise: it is very simple and was
still pretty fast (1.6x the time of cumsum).
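As a minimal sketch of the match approach, here it is applied to the
example vector from the original question. The variable name x is just
taken from the question; nothing else is assumed beyond base R.

```r
# Example ids from the original question (unsorted, order must be kept)
y <- c(4, 4, 4, 2, 45, 12, 12)

# unique(y) lists the ids in order of first appearance: 4, 2, 45, 12.
# match() then returns, for each element of y, its position in that list,
# which is exactly the running number.
x <- match(y, unique(y))
x
```

Because unique keeps the order of first appearance, the running numbers
follow the order of the cases rather than the sorted order of the ids.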

> library(rbenchmark)
> set.seed(1)
> y <- sample(10000, 10000, replace = TRUE)
> benchmark(
+ match = match(y, unique(y)),
+ cumsum = cumsum(c(FALSE, y[-1] != y[-length(y)])) + 1,
+ rle = with(rle(y), rep(seq_along(values), lengths)),
+ factor = as.numeric(factor(y, levels = unique(y)))
+ )
    test replications elapsed  relative user.self sys.self user.child sys.child
2 cumsum          100    0.21  1.000000      0.22        0         NA        NA
4 factor          100    5.50 26.190476      5.32        0         NA        NA
1  match          100    0.34  1.619048      0.33        0         NA        NA
3    rle          100    0.81  3.857143      0.81        0         NA        NA
>
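One caveat that may be worth checking before picking a solution: cumsum
and rle number runs of equal adjacent values, while match and factor
number distinct values. The sketch below (variable names are only
illustrative) suggests that all four agree when equal ids are adjacent,
as in the original example, but differ once an id reappears later,
which can happen with sampled data like the benchmark vector.

```r
# Ids grouped together, as in the original question
y <- c(4, 4, 4, 2, 45, 12, 12)
by_match  <- match(y, unique(y))
by_cumsum <- cumsum(c(FALSE, y[-1] != y[-length(y)])) + 1
by_rle    <- with(rle(y), rep(seq_along(values), lengths))
by_factor <- as.numeric(factor(y, levels = unique(y)))

# All four give the same running numbers here
stopifnot(all(by_match == by_cumsum),
          all(by_match == by_rle),
          all(by_match == by_factor))

# Illustrative vector where id 4 reappears after a different id
y2 <- c(4, 4, 2, 4)
match2  <- match(y2, unique(y2))                          # 1 1 2 1
cumsum2 <- cumsum(c(FALSE, y2[-1] != y2[-length(y2)])) + 1  # 1 1 2 3
```

So if the same id can occur in non-adjacent rows, match (or factor) is
probably the safer choice; if the data are already grouped by id, the
cumsum version gives the same answer.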


On Sat, Mar 27, 2010 at 11:42 AM, sun <[email protected]> wrote:
> Dear all,
>
> I want to replace an (unsorted) id variable in a large dataset by a running
> number without changing the order of the cases.
>
> E.g.,
>
> y <- c(4,4,4,2,45,12,12)
>
> should be replaced by something like
>
> x <- c(1,1,1,2,3,4,4)
>
> Sorry for this simple question  & thank you very much for your help!
>
> ______________________________________________
> [email protected] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

