# CAGR-based forecasting, using OICA production data (in R)

In this post I provide an example of CAGR-based forecasting, using OICA vehicle production data for the Chinese automotive industry.

CAGR is the compound annual growth rate.

For example, if production output in the year 2000 is 1,000,000 units and a CAGR of 3% is expected, the production output expected after 10 years is:

$$1{,}000{,}000 \times (1 + 0.03)^{10} \approx 1{,}343{,}916 \text{ units}$$

CAGR-based forecasting consists of a two-step workflow:

1. calculate CAGR from historical data
2. calculate future values assuming historical CAGR
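The two steps can be sketched as a pair of small helper functions. The names `calc_cagr` and `project_value` are hypothetical and chosen only for illustration; the numbers are the 1,000,000 units and 3% from the example above.

```r
# Step 1: CAGR from a start value, an end value and the number of periods between them
# (calc_cagr is a hypothetical helper name, not part of any package)
calc_cagr <- function(start_value, end_value, n_periods) {
  (end_value / start_value)^(1 / n_periods) - 1
}

# Step 2: project a value forward, assuming the same growth rate in every period
# (project_value is likewise a hypothetical helper name)
project_value <- function(base_value, rate, n_periods) {
  base_value * (1 + rate)^n_periods
}

# 1,000,000 units growing at 3% per year for 10 years
future_value <- project_value(1e6, 0.03, 10)
round(future_value)

# recovering the rate from the two endpoints gives back 3%
calc_cagr(1e6, future_value, 10)
```

Note that `n_periods` counts the steps between the two observations, not the number of observations: 14 annual values span 13 periods.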

If annual values are predicted then CAGR must be calculated based on annual historical values. If monthly values are predicted then CAGR must be calculated from monthly historical values. And so on.
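To see why the period of the growth rate has to match the period of the forecast, here is a minimal sketch with an assumed monthly growth rate of 1% (an illustrative number, not taken from the OICA data). Compounding it over 12 months does not give 12% per year:

```r
# assumed monthly growth rate, for illustration only
monthly_rate <- 0.01

# equivalent annual rate: compound the monthly rate over 12 months
annual_rate <- (1 + monthly_rate)^12 - 1

# reverse conversion: annual rate back to the equivalent monthly rate
monthly_from_annual <- (1 + annual_rate)^(1 / 12) - 1

annual_rate          # roughly 0.1268, not 0.12
monthly_from_annual  # 0.01, up to floating-point error
```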

Below I use the CAGR methodology to forecast future automotive production output, measured in units produced annually. I use OICA automotive industry production data to calculate the historical annual CAGR and to predict future production output based on the calculated CAGR value.

The first step is to read in the data. This step also includes filtering.

```r
# import packages
library(dplyr)
# import data (read-in call not shown; data_df has columns year, country, total)
# filter out years of interest and Chinese data only
data_df = dplyr::filter(data_df, year >= 2005, country == "China")
# view header of filtered table
head(data_df)
```

```
##   year country    total
## 1 2018   China 27809196
## 2 2017   China 29015434
## 3 2016   China 28118794
## 4 2015   China 24503326
## 5 2014   China 23731600
## 6 2013   China 22116825
```
```r
# view tail of filtered table
tail(data_df)
```

```
##    year country    total
## 9  2010   China 18264761
## 10 2009   China 13790994
## 11 2008   China  9299180
## 12 2007   China  8882456
## 13 2006   China  7188708
## 14 2005   China  5717619
```

Next, the historical CAGR can be calculated, in this case for the years 2005 to 2018:

```r
# calculate historical CAGR
# rows are sorted from latest to earliest year: total[1] is 2018, the last element is 2005
n_years = length(data_df$total)
cagr = (data_df$total[1] / data_df$total[n_years])^(1 / (n_years - 1)) - 1
```

Using the historical annual CAGR value I predict Chinese automotive production output measured in units produced annually, for 2030:

```r
# predict production output in 2030, based on production output in 2018
# and on the CAGR value from 2005 to 2018
data_df$total[1] * (1 + cagr)^(2030 - 2018)
```

```
## [1] 119761592
```
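The same calculation can be extended to the whole path from 2019 to 2030. The following is a minimal, self-contained sketch that hard-codes the 2005 and 2018 totals shown in the tables above instead of reading them from `data_df`; the variable names are my own.

```r
# endpoint values copied from the tables above
total_2005 <- 5717619
total_2018 <- 27809196

# CAGR over the 13 annual steps from 2005 to 2018
cagr_2005_2018 <- (total_2018 / total_2005)^(1 / (2018 - 2005)) - 1

# projected output for each year from 2019 to 2030
forecast_df <- data.frame(year = 2019:2030)
forecast_df$total <- total_2018 * (1 + cagr_2005_2018)^(forecast_df$year - 2018)
forecast_df
```

The last row of `forecast_df` corresponds to the 2030 forecast computed above.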

CAGR-based forecasting is clearly naive. This forecasting methodology only works under strong assumptions, such as the assumption that growth is unlimited. In the case of automotive vehicle production this is not feasible. Hence, this forecasting methodology should only be applied to a limited time horizon. Moreover, CAGR-based forecasts have to be read with some tolerance for variance, since a single constant growth rate ignores year-to-year fluctuations.
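One way to get a feeling for this sensitivity is to vary the historical window. The sketch below is an illustration of my own, not part of the workflow above: it recomputes the CAGR to 2018 from three different start years, using totals copied from the tables shown earlier, and derives the 2030 projection implied by each.

```r
# production totals copied from the tables above
totals <- c(`2005` = 5717619, `2010` = 18264761,
            `2013` = 22116825, `2018` = 27809196)

# CAGR from each start year to 2018
start_years <- c(2005, 2010, 2013)
start_totals <- unname(totals[as.character(start_years)])
end_total <- unname(totals["2018"])
cagr_by_window <- (end_total / start_totals)^(1 / (2018 - start_years)) - 1

# 2030 projection implied by each window
projection_2030 <- end_total * (1 + cagr_by_window)^(2030 - 2018)

data.frame(start_year = start_years,
           cagr = cagr_by_window,
           projection_2030 = projection_2030)
```

The later the start year, the lower the CAGR: it drops from about 13% for the 2005 window to about 5% for the 2010 and 2013 windows, and the 2030 projection shrinks accordingly.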