Please try this:
## Import data
id1<-c(4,17,9,1,1,1,3,3,6,15,1,1,1,1,3,3,3,3,4,4,4,5,5,12,9,9,10,10)
id2<-c(8,18,10,3,6,7,6,7,7,16,4,5,12,18,4,5,12,18,5,12,18,12,18,18,15,16,15,16)
id<-data.frame(id1 = id1, id2 = id2)
## Create same structure table
id <- id0 <- unique(id)
leng <- nrow(id)
n <- 0
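The setup above stops before the grouping step, and the rest of the poster's loop is not shown. As a rough sketch only, and not the original approach, one way to finish the job in base R is to give every id the smallest id it is linked to and repeat until nothing changes. The names `pairs`, `label` and `changed` are made up for this illustration.

```r
## Sketch (assumption, not the poster's code): group linked ids by
## repeatedly propagating the smallest label along each pair.
id1 <- c(4,17,9,1,1,1,3,3,6,15,1,1,1,1,3,3,3,3,4,4,4,5,5,12,9,9,10,10)
id2 <- c(8,18,10,3,6,7,6,7,7,16,4,5,12,18,4,5,12,18,5,12,18,12,18,18,15,16,15,16)
pairs <- unique(data.frame(id1 = id1, id2 = id2))

## Start with every id in its own group, labelled by its own value.
ids <- sort(unique(c(pairs$id1, pairs$id2)))
label <- setNames(ids, ids)

changed <- TRUE
while (changed) {
  old <- label
  for (i in seq_len(nrow(pairs))) {
    a <- as.character(pairs$id1[i])
    b <- as.character(pairs$id2[i])
    m <- min(label[[a]], label[[b]])  # smaller of the two current labels
    label[[a]] <- m
    label[[b]] <- m
  }
  changed <- !identical(label, old)   # stop once a full pass changes nothing
}

## One element per group; each element lists the ids that belong together.
groups <- split(names(label), label)
```

When the loop stops, ids that share a `label` value are connected through some chain of pairs. The row-by-row loop is easy to follow but will be slow on a million records, so treat it as a way to check results on a small sample.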
Maybe something like the following will get you started:
library("igraph")
g <- graph.data.frame(id, directed=FALSE)
neighborhood(g, +Inf)
There is perhaps a more efficient way, but I hope this helps a little.
Allan.
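A possible variant of the igraph suggestion above, offered as a sketch rather than as what Allan intended: `components()` from current igraph releases returns one group number per vertex, which may be easier to merge back onto the records than a list of neighbourhoods. The names `comp` and `groups` are illustrative.

```r
## Sketch (assumption): connected components as de-duplication groups.
library(igraph)

id <- data.frame(
  id1 = c(4,17,9,1,1,1,3,3,6,15,1,1,1,1,3,3,3,3,4,4,4,5,5,12,9,9,10,10),
  id2 = c(8,18,10,3,6,7,6,7,7,16,4,5,12,18,4,5,12,18,5,12,18,12,18,18,15,16,15,16)
)

## Each row is an undirected link between two record ids.
g <- graph_from_data_frame(id, directed = FALSE)

## membership: named vector mapping each id to its component number.
comp <- components(g)
groups <- split(names(comp$membership), comp$membership)
```

Here `comp$no` gives the number of groups, and `comp$membership` can be joined to the original table by id.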
On 03/06/10 14:14, Epi-schnier wrote:
Colleagues,
I am trying to de-duplicate a large (long) database (approx 1mil records) of
diagnostic tests. Individuals in the database can have up-to 25
observations, but most will have only one. IDs for de-duplication (names,
sex, lab number...) are patchy. In a first step, I am using Andreas B