I would like to compute the pairwise geographical distances between several addresses
and then take the mean of these distances (the mean distance).
For a data frame with only one row, I have found a solution:
```r
# Load packages
library(readxl)
library(openxlsx)
library(googleway)
#library(sf)
library(tidyverse)
library(geosphere)
library("ggmap")
# Set the API key
set_key("")
api_key <- ""
register_google(key=api_key)
# Data
df <- data.frame(
  V1 = c("80538 München, Germany", "01328 Dresden, Germany", "80538 München, Germany",
         "07745 Jena, Germany", "10117 Berlin, Germany"),
  V2 = c("82152 Planegg, Germany", "01069 Dresden, Germany", "82152 Planegg, Germany",
         "07743 Jena, Germany", "14195 Berlin, Germany"),
  V3 = c("85748 Garching, Germany", "01069 Dresden, Germany", "85748 Garching, Germany",
         NA, "10318 Berlin, Germany"),
  V4 = c("80805 München, Germany", "01187 Dresden, Germany", "80805 München, Germany",
         "07745 Jena, Germany", NA),
  stringsAsFactors = FALSE
)
# Replace NA with empty strings for the geocode function
df[is.na(df)] <- ""
# Select row 5
df1 <- slice(df, 5)
# Geocode the addresses to lon/lat; drop addresses that could not be geocoded
df_2 <- geocode(c(df1$V1, df1$V2, df1$V3, df1$V4)) %>% na.omit()
# Convert to a matrix
mat_df <- as.matrix(df_2)
# Pairwise distance matrix (in metres)
dist_mat <- distm(mat_df)
# Mean distance for row 5 (in km)
mean(dist_mat[lower.tri(dist_mat)]) / 1000
```
Unfortunately, I have not managed to write a function that runs this code for an
entire data set. My current problem is that the function does not calculate
the mean distance row by row; instead it calculates a single mean over all rows
of the data set.
```r
# Function
Mean_Dist <- function(df, w, x, y, z) {
  # for (row in 1:nrow(df)) {
  #   dist_mat <- geocode(c(w, x, y, z))
  #
  # }
  df <- geocode(c(w, x, y, z)) %>% na.omit() # extract lon/lat information from the addresses
  mat_df <- as.matrix(df) # write these into a matrix
  dist_mat <- distm(mat_df)
  dist_mean <- mean(dist_mat[lower.tri(dist_mat)])
  return(dist_mean)
}
df %>% mutate(lon = Mean_Dist(df, df$V1, df$V2, df$V3, df$V4) / 1000)
```
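A quick check that may help narrow this down: when whole columns are passed in, `c()` flattens them into one long vector, so the function would see every address of the data set at once. The sketch below uses a small stand-in data frame (`df_demo`, a name made up for this illustration) and no geocoding, so it runs without an API key.

```r
# Small stand-in for the data frame above (two rows, two address columns)
df_demo <- data.frame(
  V1 = c("01328 Dresden, Germany", "07745 Jena, Germany"),
  V2 = c("01069 Dresden, Germany", "07743 Jena, Germany"),
  stringsAsFactors = FALSE
)

# Passing whole columns: c() flattens them into a single vector,
# with one entry per cell rather than per row
all_at_once <- c(df_demo$V1, df_demo$V2)
length(all_at_once)

# What a row-wise call would need instead: the addresses of one row only
row_1 <- unlist(df_demo[1, ], use.names = FALSE)
length(row_1)
```

If that reading is right, the single value returned by the function is then recycled into every row by `mutate()`.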
Do you have any idea what mistake I made?
To clarify my question: I am trying to create a data frame like this one, with an
added column V5:
```r
  V1                     V2                     V3                      V4                     V5
  <chr>                  <chr>                  <chr>                   <chr>                  <numeric>
1 80538 München, Germany 82152 Planegg, Germany 85748 Garching, Germany 80805 München, Germany Mean_Dist_row1
2 01328 Dresden, Germany 01069 Dresden, Germany 01069 Dresden, Germany  01187 Dresden, Germany Mean_Dist_row2
3 80538 München, Germany 82152 Planegg, Germany 85748 Garching, Germany 80805 München, Germany Mean_Dist_row3
4 07745 Jena, Germany    07743 Jena, Germany    07745 Jena, Germany     07745 Jena, Germany    Mean_Dist_row4
5 10117 Berlin, Germany  14195 Berlin, Germany  10318 Berlin, Germany   14476 Potsdam, Germany Mean_Dist_row5
```
i.e. the mean distance for each row.
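To make the target concrete without needing an API key, here is a minimal sketch of the row-wise result I am after. Everything in it is a stand-in: `fake_geocode()` and its made-up coordinates replace `geocode()`, a base-R haversine helper replaces `distm()`, and `mean_dist_row()` and `df_demo` are names invented for this illustration, so the numbers are not real distances between the addresses above.

```r
# Hypothetical stand-in for geocode(): looks addresses up in a small table.
# The coordinates are made up for illustration and are NOT geocoder output.
fake_coords <- data.frame(
  address = c("A", "B", "C"),
  lon = c(13.0, 13.0, 14.0),
  lat = c(52.0, 53.0, 52.0),
  stringsAsFactors = FALSE
)
fake_geocode <- function(addresses) {
  fake_coords[match(addresses, fake_coords$address), c("lon", "lat")]
}

# Haversine distance in metres; base-R stand-in for geosphere::distm()
haversine_m <- function(lon1, lat1, lon2, lat2, r = 6378137) {
  rad <- pi / 180
  a <- sin((lat2 - lat1) * rad / 2)^2 +
    cos(lat1 * rad) * cos(lat2 * rad) * sin((lon2 - lon1) * rad / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Mean pairwise distance (km) for the addresses of ONE row
mean_dist_row <- function(addresses) {
  # Drop missing or empty addresses before looking them up
  addresses <- addresses[!is.na(addresses) & addresses != ""]
  xy <- fake_geocode(addresses)
  # Drop addresses the lookup could not resolve
  xy <- xy[stats::complete.cases(xy), , drop = FALSE]
  if (nrow(xy) < 2) return(NA_real_)   # no pairs, so no mean
  idx <- utils::combn(nrow(xy), 2)     # every unordered pair of points
  d <- haversine_m(xy$lon[idx[1, ]], xy$lat[idx[1, ]],
                   xy$lon[idx[2, ]], xy$lat[idx[2, ]])
  mean(d) / 1000
}

df_demo <- data.frame(
  V1 = c("A", "A"), V2 = c("B", "C"), V3 = c("C", NA),
  stringsAsFactors = FALSE
)
# apply() hands each row to the function as a character vector
df_demo$V5 <- apply(df_demo, 1, mean_dist_row)
df_demo
```

With the real data, the idea would be to swap `fake_geocode()` for `geocode()` and the haversine helper for `distm()`, keeping the per-row call; I have not verified that against the Google API.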