In a previous post I demonstrated how one can assign customers to the nearest warehouse using a well-known clustering algorithm. In this post I will demonstrate how a single warehouse location problem can be approached by applying center of mass calculation in R. The idea is to locate the warehouse in right in the average center of demand, taking into account spatial distribution of customer demand.

Below, I define the center of mass function. It takes into account x and y coordinates. I.e. we simplify the problem by neglecting the vertical z-axis of our spatial location problem. w stands for the weights of the respective coordinates. This could e.g. be the demand by location.

```
center_of_mass <- function(x,y,w){
c(crossprod(x,w)/sum(w),crossprod(y,w)/sum(w))
}
```

Next, I create a dataframe representing customer demand. I apply latitude and longitude coordinates. Latitude value range is from -90 to +90. Longitude value range is from -180 to +180. I create 40 customers with random location and demand, somewhere within the coordinate-system.

```
customer_df <- as.data.frame(matrix(nrow=40,ncol=3))
colnames(customer_df) <- c("lat","long","demand")
customer_df$lat <- runif(n=40,min=-90,max=90)
customer_df$long <- runif(n=40,min=-180,max=180)
customer_df$demand <- sample(x=1:100,size=40,replace=TRUE)
```

Below I print the header of that dataframe:

`head(customer_df)`

```
## lat long demand
## 1 -46.40781 43.13533 4
## 2 -29.06806 98.97764 72
## 3 -75.84997 127.47619 44
## 4 -54.37377 55.16857 66
## 5 71.67371 178.98597 21
## 6 -34.03587 -42.88747 100
```

Using that dataset I can calculate the center of mass:

`center_of_mass(customer_df$lat,customer_df$long,customer_df$demand)`

`## [1] -2.80068 12.49750`

To assess this result, I plot the customer demand using ggplot2:

```
library(ggplot2)
joint_df <- rbind(customer_df,c(center_of_mass(customer_df$lat,customer_df$long,customer_df$demand),50))
joint_df$type <- c(rep(x="customer",times=40),"warehouse")
ggplot(data=joint_df) + geom_point(mapping=aes(x=lat,y=long,size=demand,color=type))
```

The center of mass method can be well combined with heatmap visualisation of spatial customer demand distribution. For this, you can read my blog post on how to create heatmaps in R, using the Leaflet package: Leaflet heatmaps in R

I also use other packages to visualize spatial distribution of customer demand, including deckgl: Using deckgl in R for spatial data visualisation

Data scientist focusing on simulation, optimization and modeling in R, SQL, VBA and Python

Hi Linnart,

While this is very useful, I was wondering could you also show how would we calculate the same using optimization (like as Excel Solver) but via R only.

I understand that the difference would be small but I was just curious if it is possible to do that with R.

Again thanks for this.

Modeling this as a mathematical program is pretty straight forward. You need an objective, which is the euclidean distance between all customers or suppliers to the chosen warehouse location. The optimization variables are the coordinates of the warehouse location.

Thanks Linnart. To be honest, I do not know how to run even a fairly simple optimization using R.

However, I did find another useful package in R called “Gmedian” which finds the centre of minimum distance, only drawback is that it does not consider demand.

However, that can be theoretically overcome by creating demand no. of multiples/copies of each customer, not sure how practical it would be though.

Thanks again. Your content is very useful and informative.

Hi Ibrahim.

You can use lpSolve in R. You can send me the data and can develop a template for you that fits your problem and data. Should only take 1-2 consultation hours. If interested let me know and contact me via the contact form or via LinkedIn.

Regarding geometric median, which you have pointed out, it is true that it will result in a different solution compared to center of mass. But, the center of mass can be calculated easily and at very low cost. The geometric median can only be derived numerically. And this means, the cost is higher. Also, the center of mass and the geometric median are both based on strong assumptions and are so to say both “wrong”. They are, however, good enough when used in the right context.

The center of mass is the center of a heatmap, when e.g. visualizing demand or supply distribution spatially. It is easy to understand and can be a good start point for searching the optimal location of e.g. a warehouse. But, it is only a start point. The cost optimum depends on many more complex factors. Examples:

– parcel shipment costs are zone based, not linear as assumed by e.g. center of mass calculation

– sea freight costs FCL and LCL depend mostly on the port, not on the distance

– air freight depends much on the airport too

– intermodal freight depends mostly on the availability of rail connections (rail is cheaper than trucking)

– labour costs depend on the region / country

– availbility of warehouse facilities for rent, and the cost of renting them depends very much on the region as well

This is why focusing too much on e.g. a geometric median vs. center of mass discussion is the wrong approach. Do not worry too much about whether you choose geometric median or center of mass. Pick one of them and get started. Focus more on the additional factors that I listed abov.