A short tutorial on how to retrieve and analyze GDP-related data from the OECD database using its interface in R.
To retrieve the data directly in R, I use the OECD package:
# import OECD library (package)
library(OECD)
# search for datasets related to "GDP" figures
as.data.frame(search_dataset("gdp"))
## id
## 30 FIGURE1_E_AEO2013
## 63 SNA_TABLE1
## 114 PPPGDP
## 228 AEO11_OVERVIEW_CHAPTER1_TAB2_EN
## 432 AEO11_COUNTRYNOTES_TAB5_EN
## 469 AEO11_INDEPTH_CHAPTER6_FIG11_EN
## 472 AEO11_INDEPTH_CHAPTER6_TAB11_EN
## 474 AEO11_COUNTRYNOTES_TAB4_EN
## 481 AEO11_COUNTRYNOTES_FIG2_EN
## 627 AEO11_COUNTRYNOTES_TAB2_EN
## 744 AEO2012_CH6_FIG20
## 772 AEO2012_CH1_FIG4
## 781 AEO2012_CH2_FIG1B
## 783 AEO2012_CH2_FIG4
## 784 AEO2012_CH2_FIG5A
## 789 AEO2012_CH2_FIG9
## 876 FIGURE1_W_AEO2013
## 925 SNA_TABLE1_SNA93
## 931 SNA_TABLE1_TRAINING
## 945 FIGURE1_N_AEO2013
## 946 FIGURE1_C_AEO2013
## 948 FIGURE1_S_AEO2013
## 965 TABLE2_AEO2013_V2
## 966 TABLE3_AEO2013_V2
## 967 TABLE4_AEO2013_V2
## 998 PDB_LV
## 999 PDB_GR
## 1133 SNA_TABLE1_ARCHIVE
## title
## 30 Figure 1: Real GDP growth 2013 (East)
## 63 1. Gross domestic product (GDP)
## 114 Purchasing Power Parities for GDP and related indicators
## 228 Table 1.1: Growth by regions (real GDP growth in percentage)
## 432 Table 5: Current account (percentage of GDP)
## 469 Figure 6.11: Africa’s post-HIPC debt (external debt in percentage of GDP, 1995-2009)
## 472 Table 6.11: Ethiopia, public debt in percentage of GDP
## 474 Table 4: Public finances (percentage of GDP)
## 481 Figure 2: Stock of total external debt (percentage of GDP) and debt service (percentage of exports of goods and services)
## 627 Table 2: GDP by sector (in percentage)
## 744 Figure 20: Household enterprises are the fastest growing livelihood sector in low income countries, ordered by GDP per capita
## 772 2012 Figure 1.4: Growth of GDP by countries (%)
## 781 2012 Figure 2.1b: Domestic and external financial resources (% GDP, 2010)
## 783 2012 Figure 2.4: Oil-importing countries attracted more FDI as a share of GDP than oil-exporting countries
## 784 2012 Figure 2.5a: African FDI outflows mainly go from resource-rich countries to OECD nations (% of GDP)
## 789 2012 Figure 2.9: Tax revenues in Africa represent an increasing share of GDP during the last decade
## 876 Figure 1: Real GDP growth 2013 (West)
## 925 1. Gross domestic product (GDP), SNA93
## 931 1. Gross domestic product (GDP) Training2
## 945 Figure 1: Real GDP Growth 2013 (North)
## 946 Figure 1: Real GDP growth 2013 (Central)
## 948 Figure 1: Real GDP growth 2013 (South)
## 965 Table 2: GDP by Sector (percentage of GDP)
## 966 Table 3: Public Finances (percentage of GDP)
## 967 Table 4: Current Account (percentage of GDP)
## 998 Level of GDP per capita and productivity
## 999 Growth in GDP per capita, productivity and ULC
## 1133 1. Gross domestic product (GDP), 2019 archive
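Since the search result is a data frame with an id and a title column, it can be narrowed down further with ordinary subsetting. Here is a minimal sketch of that idea. To keep it runnable without a network connection, `results_df` is a hand-typed stand-in containing three rows copied from the output above, not a live call to the OECD interface:

```r
# Stand-in for as.data.frame(search_dataset("gdp")); rows copied from the output above
results_df <- data.frame(
  id = c("SNA_TABLE1", "PPPGDP", "PDB_LV"),
  title = c("1. Gross domestic product (GDP)",
            "Purchasing Power Parities for GDP and related indicators",
            "Level of GDP per capita and productivity"),
  stringsAsFactors = FALSE
)

# Keep only datasets whose title mentions "gross domestic product" (case-insensitive)
is_gdp_table <- grepl("gross domestic product", results_df$title, ignore.case = TRUE)
gdp_tables <- results_df[is_gdp_table, ]

print(gdp_tables$id)
```

With the full search result, the same pattern would also keep the SNA93, training and archive variants of the table, so a final manual pick is still needed.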
I am interested in gross domestic product, so I will pull the dataset with ID “SNA_TABLE1”. Before doing so, I request the structure of the dataset as a list, so that I can inspect it and understand how the dataset is composed:
# retrieve structure of the data
structure_ls <- get_data_structure("SNA_TABLE1")
# view structure
#structure_ls
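Printing the whole structure can produce a lot of output, so I usually look at it piece by piece. The sketch below assumes the structure is a named list of data frames, with one entry describing the variables and one entry per dimension, each holding `id` and `label` columns. That layout is my assumption here, and `structure_ls` is a hand-built stand-in with a few illustrative rows, so the snippet runs offline:

```r
# Hypothetical stand-in for get_data_structure("SNA_TABLE1"); contents are illustrative
structure_ls <- list(
  VAR_DESC = data.frame(
    id = c("LOCATION", "TRANSACT", "MEASURE"),
    description = c("Country", "Transaction", "Measure"),
    stringsAsFactors = FALSE
  ),
  LOCATION = data.frame(
    id = c("DEU", "FRA"),
    label = c("Germany", "France"),
    stringsAsFactors = FALSE
  ),
  TRANSACT = data.frame(
    id = c("B1_GA", "B1_GE"),
    label = c("Gross domestic product (output approach)",
              "Gross domestic product (expenditure approach)"),
    stringsAsFactors = FALSE
  )
)

# Which components does the structure contain?
print(names(structure_ls))

# Look up the country code for Germany
deu_code <- structure_ls$LOCATION$id[structure_ls$LOCATION$label == "Germany"]

# Find the transaction code(s) whose label mentions the output approach
is_output <- grepl("output approach", structure_ls$TRANSACT$label, fixed = TRUE)
output_codes <- structure_ls$TRANSACT$id[is_output]

print(deu_code)
print(output_codes)
```

The codes found this way are the ones that go into the filter of the data request.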
Using the information from the structure output, I choose to retrieve data for Germany only. Moreover, I want only annual GDP figures measured by the output approach, between 2000 and 2019:
# retrieve dataset for gross domestic product
data_df <- as.data.frame(get_dataset(dataset = "SNA_TABLE1",
                                     filter = list(
                                       c("DEU"),
                                       c("B1_GA")),
                                     start_time = 2000,
                                     end_time = 2019))
# show the header of that dataset
head(data_df)
## LOCATION TRANSACT MEASURE TIME_FORMAT UNIT POWERCODE REFERENCEPERIOD obsTime
## 1 DEU B1_GA C P1Y EUR 6 <NA> 2000
## 2 DEU B1_GA C P1Y EUR 6 <NA> 2001
## 3 DEU B1_GA C P1Y EUR 6 <NA> 2002
## 4 DEU B1_GA C P1Y EUR 6 <NA> 2003
## 5 DEU B1_GA C P1Y EUR 6 <NA> 2004
## 6 DEU B1_GA C P1Y EUR 6 <NA> 2005
## obsValue
## 1 2109090
## 2 2172540
## 3 2198120
## 4 2211570
## 5 2262520
## 6 2288310
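The UNIT and POWERCODE columns describe the scale of obsValue. A small sketch of how the values could be converted, assuming POWERCODE is the power of ten applied to the reported value (so 6 would mean millions of EUR). The data frame below is a stand-in built from the first three rows of the output above:

```r
# Stand-in for the retrieved data; values copied from the header shown above
data_df <- data.frame(
  obsTime = c("2000", "2001", "2002"),
  UNIT = "EUR",
  POWERCODE = "6",
  obsValue = c(2109090, 2172540, 2198120),
  stringsAsFactors = FALSE
)

# Assumption: POWERCODE is the exponent of ten that scales obsValue (6 = millions)
data_df$value_eur <- data_df$obsValue * 10^as.numeric(data_df$POWERCODE)

# Express the same values in billions of EUR for easier reading
data_df$value_bn_eur <- data_df$value_eur / 1e9

print(data_df[, c("obsTime", "value_bn_eur")])
```

Under that assumption, dividing obsValue by 1000 gives billions of EUR directly.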
Now, I use dplyr for post-filtering the data:
library(dplyr)
library(magrittr)
# dplyr select function
data_df <- data_df %>% select(MEASURE, TIME_FORMAT, UNIT, POWERCODE, REFERENCEPERIOD, obsTime, obsValue)
# dplyr filtering
data_df <- data_df %>% filter(MEASURE == "C",
                              UNIT == "EUR",
                              POWERCODE == "6",
                              TIME_FORMAT == "P1Y")
# dplyr second selection step
data_df <- data_df %>% select(obsTime,obsValue)
# view data
head(data_df)
## obsTime obsValue
## 1 2000 2109090
## 2 2001 2172540
## 3 2002 2198120
## 4 2003 2211570
## 5 2004 2262520
## 6 2005 2288310
Now I can construct a path chart of German GDP in EUR by year, from 2000 to 2019:
library(ggplot2)
ggplot(data_df) +
  geom_path(mapping = aes(x = as.numeric(obsTime), y = obsValue / 1000), color = "black") +
  ggtitle("German GDP between 2000 and 2019") +
  xlab("year") +
  ylab("in billions of EUR") + ylim(0, 4000)
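The upper axis limit of 4000 is hard-coded. As an alternative, it could be derived from the data. A minimal sketch in base R, using the six values from the header above as stand-in data and rounding up to the next multiple of 500 (the step size is an arbitrary choice of mine):

```r
# Stand-in values in millions of EUR, copied from the header shown above
obs_value <- c(2109090, 2172540, 2198120, 2211570, 2262520, 2288310)

# Convert to billions of EUR, as in the chart
gdp_bn <- obs_value / 1000

# Round the maximum up to the next multiple of 500 to get a tidy upper limit
step <- 500
y_max <- ceiling(max(gdp_bn) / step) * step

print(y_max)
```

The result could then be passed to the chart as ylim(0, y_max) in place of the fixed value.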
In some of my other posts I use the OECD package in R to run a k-means clustering algorithm on OECD data. I have also demonstrated how to access, for example, inland freight or other transport data from OECD in R. Lastly, I have written posts on how to use public FRED data in R with the fredr package; you might want to check those out too.
Data scientist focusing on simulation, optimization and modeling in R, SQL, VBA and Python