# [R] Average distance in kilometers between subsets of points with ggmap/geosphere

Malte Hückstädt
Mon Sep 23 09:06:50 CEST 2019

I would like to compute the pairwise geographical distances between a set of addresses and then take the mean of those distances.

For a data frame with a single row, I have found a solution:

```r
library(openxlsx)
#library(sf)
library(tidyverse)
library(geosphere)
library("ggmap")

# Set the API key (ggmap registers Google keys via register_google())
api_key <- ""
register_google(key = api_key)

#  Data
df <- data.frame(
V1 = c("80538 München, Germany", "01328 Dresden, Germany", "80538 München, Germany",
"07745 Jena, Germany",    "10117 Berlin, Germany"),
V2 = c("82152 Planegg, Germany", "01069 Dresden, Germany", "82152 Planegg, Germany",
"07743 Jena, Germany",    "14195 Berlin, Germany"),
V3 = c("85748 Garching, Germany", "01069 Dresden, Germany",  "85748 Garching, Germany",
NA,     "10318 Berlin, Germany"),
V4 = c("80805 München, Germany", "01187 Dresden, Germany", "80805 München, Germany",
"07745 Jena, Germany", NA), stringsAsFactors=FALSE
)

# Replace NA with empty strings for geocode()
df[is.na(df)] <- ""

# Keep only row 5
df1 <- slice(df, 5)

# Longitude/latitude for each address; rows that could not be geocoded are dropped
df_2 <- geocode(c(df1$V1, df1$V2, df1$V3, df1$V4)) %>% na.omit()

# Convert to a matrix
mat_df <- as.matrix(df_2)

# Pairwise distance matrix (meters)
dist_mat <- distm(mat_df)

# Mean distance for row 5, in kilometers
mean(dist_mat[lower.tri(dist_mat)]) / 1000
```
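To make the last step concrete: `distm()` returns a symmetric matrix of pairwise distances in meters with zeros on the diagonal, so `lower.tri()` picks each pair of points exactly once. A minimal sketch with hand-typed, approximate coordinates (illustrative values, not geocoder output), so that it runs without an API key:

```r
library(geosphere)

# Approximate lon/lat pairs, typed in by hand for illustration only
coords <- rbind(
  berlin  = c(13.405, 52.520),
  munich  = c(11.582, 48.135),
  dresden = c(13.737, 51.050)
)

# Symmetric 3 x 3 matrix of pairwise distances in meters
dist_mat <- distm(coords)

# lower.tri() selects each unordered pair once and skips the zero diagonal
pair_dists_km <- dist_mat[lower.tri(dist_mat)] / 1000
mean_km <- mean(pair_dists_km)
mean_km
```

Averaging the full matrix instead would count every pair twice and pull the mean down with the diagonal zeros.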

Unfortunately, I have not managed to write a function that runs this code for the entire data set. My current problem is that the function does not compute the mean distance row by row; instead it returns a single mean over all rows of the data set.

```r
# Function

Mean_Dist <- function(df, w, x, y, z) {

  # for (row in 1:nrow(df)) {
  #   dist_mat <- geocode(c(w, x, y, z))
  #
  # }

  df <- geocode(c(w, x, y, z)) %>% na.omit() # get lon/lat from the addresses

  mat_df <- as.matrix(df) # store them in a matrix

  dist_mat <- distm(mat_df)

  dist_mean <- mean(dist_mat[lower.tri(dist_mat)])

  return(dist_mean)
}

df %>% mutate(lon = Mean_Dist(df, df$V1, df$V2, df$V3, df$V4) / 1000)

```
Do you have any idea what mistake I made?
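
A likely cause, as far as the code shows: `df$V1`, `df$V2` and so on are whole columns, so `c(w, x, y, z)` inside `Mean_Dist()` concatenates every address in the data set, and `mutate()` then recycles the single number that comes back into every row. A small sketch with placeholder strings (not real addresses) illustrates the difference between collecting by column and collecting by row:

```r
# Placeholder data standing in for the address columns
df_demo <- data.frame(
  V1 = c("a1", "a2", "a3"),
  V2 = c("b1", "b2", "b3"),
  stringsAsFactors = FALSE
)

# What the function receives: all rows of every column, flattened
all_addresses <- c(df_demo$V1, df_demo$V2)
length(all_addresses)

# What a per-row mean needs: only the addresses of one row
row_1 <- unlist(df_demo[1, ], use.names = FALSE)
row_1
```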

To clarify my question: I am trying to create a data frame like this one, with a new column V5:

```r
V1                     V2                     V3                      V4                      V5
<chr>                  <chr>                  <chr>                   <chr>                   <numeric>
1 80538 München, Germany 82152 Planegg, Germany 85748 Garching, Germany 80805 München, Germany Mean_Dist_row1
2 01328 Dresden, Germany 01069 Dresden, Germany 01069 Dresden, Germany  01187 Dresden, Germany Mean_Dist_row2
3 80538 München, Germany 82152 Planegg, Germany 85748 Garching, Germany 80805 München, Germany Mean_Dist_row3
4 07745 Jena, Germany    07743 Jena, Germany    07745 Jena, Germany     07745 Jena, Germany Mean_Dist_row4
5 10117 Berlin, Germany  14195 Berlin, Germany  10318 Berlin, Germany   14476 Potsdam, Germany Mean_Dist_row5
```

That is, V5 should contain the mean distance for each row.
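
One possible way to get a per-row value is to hand each row to a helper separately. The sketch below is an assumption-laden illustration, not a tested solution for the real data: `mean_dist_row()` and `fake_geocode()` are hypothetical names, the coordinates in the stub are made up, and the geocoder is passed in as an argument so the example runs without an API key.

```r
library(geosphere)

# Hypothetical helper: mean pairwise distance (km) for one row of addresses.
# `geocode_fun` must return a data frame with columns lon and lat, one row per
# address (as ggmap::geocode() does); it is a parameter so a stub can be used.
mean_dist_row <- function(addresses, geocode_fun = ggmap::geocode) {
  addresses <- addresses[!is.na(addresses) & addresses != ""]
  if (length(addresses) < 2) return(NA_real_)
  coords <- na.omit(as.data.frame(geocode_fun(addresses)))
  if (nrow(coords) < 2) return(NA_real_)
  dist_mat <- distm(as.matrix(coords[, c("lon", "lat")]))
  mean(dist_mat[lower.tri(dist_mat)]) / 1000
}

# Stub geocoder with made-up coordinates so the sketch runs without an API key
fake_geocode <- function(addresses) {
  lookup <- data.frame(
    address = c("A", "B", "C"),
    lon = c(13.40, 13.29, 13.51),
    lat = c(52.52, 52.45, 52.48)
  )
  lookup[match(addresses, lookup$address), c("lon", "lat")]
}

df_demo <- data.frame(
  V1 = c("A", "A"), V2 = c("B", "C"), V3 = c("C", NA),
  stringsAsFactors = FALSE
)

# apply(..., 1, ...) hands each row to the helper as a character vector
df_demo$V5 <- apply(df_demo[, c("V1", "V2", "V3")], 1, mean_dist_row,
                    geocode_fun = fake_geocode)
df_demo
```

With real addresses, leaving out `geocode_fun` would fall back to `ggmap::geocode()`, which needs a registered API key. Geocoding each unique address once and joining the coordinates back would probably reduce the number of API calls.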