Hello,
Would around two orders of magnitude interest you?
f1 <- function(Nodes, Weights){
    drop.index <- duplicated(Nodes)
    n.unique <- Nodes[!drop.index, ]
    w.unique <- numeric(length(n.unique[, 1]))
    lw <- length(Weights)
    for (i in seq_along(w.unique)){
        index <- as.logical(2 == rowSums(
            Nodes == matrix(rep(n.unique[i, ], lw), byrow = TRUE,
                            nrow = lw)))
        w.unique[i] <- sum(Weights[index])
    }
    list(n.unique = n.unique, w.unique = w.unique)
}
f2 <- function(Nodes, Weights){
    rows <- paste(Nodes[, 1], Nodes[, 2], sep = ".")
    w.uniq <- tapply(Weights, rows, sum)
    attributes(w.uniq) <- NULL
    ord <- order(unique(rows))
    # order(ord) gives ranks, mapping tapply's sorted sums back to
    # the first-appearance order used by unique(Nodes)
    list(n.unique = unique(Nodes), w.unique = w.uniq[order(ord)])
}
# Test it
M <- 100 # see text below
n <- 2e5
set.seed(1234)
nd <- matrix(sample(M, n*2, TRUE), n, 2)
ww <- rep(1, n)
t1 <- system.time(r1 <- f1(nd, ww))
t2 <- system.time(r2 <- f2(nd, ww))
identical(r1, r2)
print(rbind(t1=t1, t2=t2, ratio=t1/t2), digits=3)
      user.self sys.self elapsed user.child sys.child
t1       310.41       67  379.07         NA        NA
t2         5.59        0    5.62         NA        NA
ratio     55.53      Inf   67.45         NA        NA
With bigger M the number of distinct coordinate pairs increases, and with
it the number of unique rows. The time taken by f1 grows sharply, while f2
scales up rather slowly, so the ratio above only gets better.
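For completeness, base R's rowsum() can do the grouped sum in one call, along the same lines as f2; with reorder = FALSE the groups come out in order of first appearance, so no reordering step is needed. A sketch (f3 is just an illustrative name, not from the thread, and I have not benchmarked it against f2):

```r
f3 <- function(Nodes, Weights){
    key <- paste(Nodes[, 1], Nodes[, 2], sep = ".")
    # reorder = FALSE keeps groups in order of first appearance,
    # so the sums line up row-for-row with unique(Nodes)
    w <- rowsum(Weights, key, reorder = FALSE)
    list(n.unique = unique(Nodes), w.unique = as.vector(w))
}

nd <- matrix(c(1,2,3,4,5,6,1,2,1,4), ncol = 2, byrow = TRUE)
f3(nd, rep(1, 5))  # the small example from the original question
```
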
Hope this helps,
Rui Barradas
On 28-06-2012 14:06, Weiser, Constantin wrote:
Hi all. I have a - maybe trivial - problem with aggregating a list of
weights.
Here is the problem:
- I start with a set of nodes (X/Y coordinates) and associated weights,
where the set of nodes is typically not unique
- I want a set of unique nodes together with the sums of the associated
weights
I am grateful for any help.
See for example:
# weights:
w <- c(1, 1, 1, 1, 1)
# not unique set of nodes (X/Y-coordinates):
nodes <- matrix(c(1,2,3,4,5,6,1,2,1,4), ncol = 2, byrow = TRUE)
# desired result:
#nodes
[,1] [,2]
[1,] 1 2
[2,] 3 4
[3,] 5 6
[4,] 1 4
#weights
2 1 1 1
This is my solution, but it is very slow (typical size of nodes:
200000 x 2):
weights <- c(1, 1, 1, 1, 1)
nodes <- matrix(c(1,2,3,4,5,6,1,2,1,4), ncol = 2, byrow = TRUE)
## to be replaced by a faster code
drop.index <- duplicated(nodes)
n.unique <- nodes[!drop.index, ]
w.unique <- numeric(length(n.unique[, 1]))
lw <- length(weights)
for (i in seq_along(w.unique)){
    index <- as.logical(2 == rowSums(
        nodes == matrix(rep(n.unique[i, ], lw), byrow = TRUE,
                        nrow = lw)))
    w.unique[i] <- sum(weights[index])
}
##
n.unique
w.unique
Weiser, Constantin
Gutenberg University of Mainz, Germany
Chair of Statistics & Econometrics
Jakob-Welder-Weg 4, 55128 Mainz
House of Law and Economics II, Room 00-116
Tel: 0049 6131 39 22715
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.