Celine wrote on 11/08/2011 02:28:32 AM:

> Thanks for your help. I still have a small problem with this function
> because my two data frames do not have the same number of rows, so when I
> try to apply the data.frame() function, I get an error saying that the
> numbers of rows differ:
>
> Error in data.frame(train, test[rowz, ]) :
>   arguments imply differing number of rows: 50327, 66592
>
> Do you know how I could solve this problem?
>
> Thanks,
>
> Céline

The knn1() function finds, for each row of test (df1), the closest
coordinates in train (df2).

     rowz <- knn1(train, test, cl)

Thus, rowz should have the same number of elements as test (df1) has
rows, and df1 and df2[rowz, ] should have the same number of rows.

Without the data in hand, I don't know what's going on. I suggest you
look at the rowz variable and see if it makes sense. Is it matching up
the rows the way it should? Try looking at a subset of the df1 and df2
data within a certain narrow range of Xs and Ys.

Also, when replying to the r-help list, it is helpful for other readers
to maintain the history of previous posts.

Jean

--- previous posts ---

Celine wrote on 11/07/2011 02:50:55 PM:

> Hi R users,
>
> I have two data frames with different variables and coordinates:
>
>           X         Y sp bio3 bio5 bio6 bio13 bio14
> 1 -70.91667 -45.08333  0   47  194  -27    47    12
> 2 -86.58333  66.25000  0   16  119 -345    42     3
> 3 -62.58333 -17.91667  0   68  334  152   144    28
> 4 -68.91667 -31.25000  0   54  235  -45    25     7
> 5  55.58333  48.41667  0   23  319 -172    23    14
> 6  66.25000  37.75000  0   34  363  -18    49     0
>
> and this one:
>
>       X  Y LU1 LU2 LU3 LU4 LU5 LU6 LU7 LU8 LU9 LU10 LU11 LU12 LU13       LU14 LU15 LU16 LU17 LU18
> 1 -36.5 84   0   0   0   0   0   0   0   0   0    0    0    0    0   0.000000    0    0    0    0
> 2 -36.0 84   0   0   0   0   0   0   0   0   0    0    0    0    0   0.000000    0    0    0    0
> 3 -35.5 84   0   0   0   0   0   0   0   0   0    0    0    0    0  26.085468    0    0    0    0
> 4 -35.0 84   0   0   0   0   0   0   0   0   0    0    0    0    0   0.000000    0    0    0    0
> 5 -34.5 84   0   0   0   0   0   0   0   0   0    0    0    0    0   5.267761    0    0    0    0
> 6 -34.0 84   0   0   0   0   0   0   0   0   0    0    0    0    0 105.371069    0    0    0    0
>
> I would like to add to my first data frame the values of the LU variables
> at the coordinates of the first data frame. Of course, the coordinates are
> not at the same resolution and are different; this is the problem.
> I would like to decrease the resolution of the first one, because the
> second data frame has a coarser resolution, and obtain something like this:
>
>           X         Y sp bio3 bio5 bio6 bio13 bio14 LU1   LU2  LU3 LU4 ...
> 1 -70.91667 -45.08333  0   47  194  -27    47    12   0 22.08 76.9
> 2 -86.58333  66.25000  0   16  119 -345    42     3   0 22.08 76.9
> 3 -62.58333 -17.91667  0   68  334  152   144    28   0 22.08 76.9
> 4 -68.91667 -31.25000  0   54  235  -45    25     7   0 22.08 76.9
> 5  55.58333  48.41667  0   23  319 -172    23    14   0 22.08 76.9
> 6  66.25000  37.75000  0   34  363  -18    49     0   0 22.08 76.9
>
> Does anyone know a function or a way to obtain that?
>
> Thanks in advance for the help,
> Céline

You could use 1-nearest neighbor classification to find the closest set
of coordinates in the second data frame (df2) to each row of coordinates
in the first data frame (df1). The function knn1() is in the R package
"class". For example:

     library(class)
     train <- df2[, c("X", "Y")]   # coordinates to search in
     test <- df1[, c("X", "Y")]    # coordinates to be matched
     cl <- 1:dim(train)[1]         # row numbers of df2 used as class labels
     rowz <- knn1(train, test, cl) # nearest df2 row for each row of df1
     data.frame(df1, df2[rowz, ])

Jean

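
For readers following the thread, here is a minimal sketch of the knn1()
approach on made-up data. The toy values in df1 and df2 and the names
idx, lu_cols, merged and swapped are assumptions for illustration only;
they are not from the original posts. The sketch also shows one possible
reading of the reported error: the call in the error message combines
train with test[rowz, ], whereas rowz holds row numbers of train, one
per row of test, so those two pieces will generally differ in length.

```r
library(class)  # provides knn1(); a recommended package shipped with R

# Hypothetical toy data: df1 has the finer grid, df2 the coarser one.
df1 <- data.frame(X = c(0.1, 0.9, 2.2, 2.9, 0.4),
                  Y = c(0.2, 1.1, 1.8, 3.1, 0.1),
                  bio3 = c(47, 16, 68, 54, 23))
df2 <- data.frame(X = c(0, 1, 2, 3),
                  Y = c(0, 1, 2, 3),
                  LU1 = c(10, 20, 30, 40))

train <- df2[, c("X", "Y")]       # coordinates to search in
test  <- df1[, c("X", "Y")]       # coordinates to be matched
cl    <- seq_len(nrow(train))     # row numbers of df2 used as class labels

# knn1() returns a factor with one element per row of test.
rowz <- knn1(train, test, cl)

# Convert the factor labels back to integer row numbers of df2.
idx <- as.integer(as.character(rowz))

# Sanity check suggested in the thread: one match per row of df1.
stopifnot(length(idx) == nrow(df1))

# Attach only the LU columns so X and Y are not duplicated.
lu_cols <- setdiff(names(df2), c("X", "Y"))
merged <- data.frame(df1, df2[idx, lu_cols, drop = FALSE], row.names = NULL)
print(merged)

# Assumed cause of the reported error: swapping the roles of the pieces.
# train has nrow(df2) rows, but test[idx, ] has nrow(df1) rows.
swapped <- tryCatch(data.frame(train, test[idx, ]),
                    error = function(e) conditionMessage(e))
print(swapped)
```

If the two objects passed to data.frame() are df1 and df2[idx, ], the row
counts agree by construction. Dropping the duplicate X and Y columns from
df2 is a design choice of this sketch, not something the thread requires.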
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.