Hello everyone, I've been advised to use merge to extract information from two data.frames with a number of common columns, but I cannot get a grasp on how it sorts the result. With sort=FALSE, I would expect it to give the result back sorted exactly as the input was but it seems it is not always the case, especially when there are repeats in the input.
For example: > a = data.frame(field1=c(1,1,2,2),field2=c(1:2,1:2),var1=runif(4)) > b = data.frame(field1=c(2,2,1,1),field2=c(1,2,2,1),var2=runif(4)) > a field1 field2 var1 1 1 1 0.8327855 2 1 2 0.4309419 3 2 1 0.5134574 4 2 2 0.8063110 > b field1 field2 var2 1 2 1 0.2739025 2 2 2 0.5147113 3 1 2 0.2958369 4 1 1 0.3703116 So b is in an irregular order, if I then merge: > merge(b,a) field1 field2 var2 var1 1 1 1 0.3703116 0.8327855 2 1 2 0.2958369 0.4309419 3 2 1 0.2739025 0.5134574 4 2 2 0.5147113 0.8063110 in that case the result is sorted, as expected. If i merge it without sorting: > merge(b,a,sort=F) field1 field2 var2 var1 1 2 1 0.2739025 0.5134574 2 2 2 0.5147113 0.8063110 3 1 2 0.2958369 0.4309419 4 1 1 0.3703116 0.8327855 it retains the order in b, which is what I want. However if I now add a repeated row to b > b = rbind(b,b[1,]) > b field1 field2 var2 1 2 1 0.2739025 2 2 2 0.5147113 3 1 2 0.2958369 4 1 1 0.3703116 5 2 1 0.2739025 and merge it, without sorting > merge(b,a,sort=F) field1 field2 var2 var1 1 2 1 0.2739025 0.5134574 2 2 1 0.2739025 0.5134574 3 2 2 0.5147113 0.8063110 4 1 2 0.2958369 0.4309419 5 1 1 0.3703116 0.8327855 the result is still somehow sorted according to the order of b. I would have expected the output to be: merge(b,a,sort=F) field1 field2 var2 var1 1 2 1 0.2739025 0.5134574 2 2 2 0.5147113 0.8063110 3 1 2 0.2958369 0.4309419 4 1 1 0.3703116 0.8327855 5 2 1 0.2739025 0.5134574 Is it possible to get this output (another function similar to merge)? What is the overall reason (if someone knows it) for the current behaviour of merge? Thanks in advance. PS: code a = data.frame(field1=c(1,1,2,2),field2=c(1:2,1:2),var1=runif(4)) b = data.frame(field1=c(2,2,1,1),field2=c(1,2,2,1),var2=runif(4)) a b merge(b,a) merge(b,a,sort=F) b = rbind(b,b[1,]) b merge(b,a,sort=F) JiHO --- http://jo.irisson.free.fr/ ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.