In this post I want to show another public data source related to the automotive industry. OICA, the International Organization of Motor Vehicle Manufacturers, provides a series of statistics on its website, including sales and production statistics. The data covers all manufacturers worldwide and includes passenger as well as commercial vehicles.
OICA production statistics can be accessed here: http://www.oica.net/category/production-statistics/
I scraped OICA’s website for production statistics and compiled all values into a single table containing production output by country from 2005 to 2018.
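The step of compiling yearly tables into one long table can be sketched as follows. This is a minimal illustration, not the scraping code actually used for this post: the list `yearly` and all numbers in it are hypothetical placeholders, and it assumes the per-year tables have already been loaded as data frames with `country` and `total` columns.

```r
library(dplyr)

# Hypothetical per-year tables; the totals are placeholders, not OICA figures.
yearly <- list(
  "2017" = data.frame(country = c("Austria", "Belgium"), total = c(1, 2)),
  "2018" = data.frame(country = c("Austria", "Belgium"), total = c(3, 4))
)

# Stack the tables; the list names end up in a new "year" column.
combined <- bind_rows(yearly, .id = "year") %>%
  mutate(year = as.integer(year)) %>%
  select(year, country, total)

print(combined)
```

The result has one row per country and year, which is the layout the plotting code further down expects.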
In the R code below I read in the data from an .xls file and visualize the statistics for selected countries. The packages used are readxl, dplyr and ggplot2. In combination with dplyr I use the grepl function to keep only certain countries.
library(readxl)
library(dplyr)

# Read the compiled OICA table and keep the years from 2005 onward
data_df = as.data.frame(read_xls("oica.xls"))
data_df = dplyr::filter(data_df, year >= 2005)
head(data_df)
##   year   country    total
## 1 2018 Argentina   466649
## 2 2018   Austria   164900
## 3 2018   Belgium   308493
## 4 2018    Brazil  2879809
## 5 2018    Canada  2020840
## 6 2018     China 27809196
library(ggplot2)

# Keep only the rows whose country name matches one of the four countries
data_df = dplyr::filter(data_df, grepl('Germany|China|Japan|USA', country))

# Stacked columns per year, production output in thousands of units
ggplot(data_df) +
  geom_col(mapping = aes(x = year, y = total/1000, fill = country)) +
  scale_fill_manual(values = c(Germany = "red", USA = "black", China = "orange", Japan = "blue")) +
  labs(title = "annual passenger car and commercial vehicle production output",
       subtitle = "data by OICA, for 2005 - 2018 (Germany, Japan, USA, China)") +
  xlab("year") +
  ylab("production output [thousands of units]")
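One thing to keep in mind with grepl: the pattern is a regular expression matched anywhere in the string, so it also keeps rows whose country label merely contains one of the names. The sketch below compares this with exact matching via `%in%` and with an anchored pattern. The data frame `toy_df`, its labels (including "China (estimate)") and its totals are hypothetical and only serve to show the difference.

```r
library(dplyr)

# Hypothetical toy data; labels and totals are made up for illustration.
toy_df <- data.frame(
  year    = rep(2018, 5),
  country = c("China", "China (estimate)", "USA", "Germany", "Brazil"),
  total   = c(100, 90, 50, 40, 30)
)

wanted <- c("Germany", "China", "Japan", "USA")

# Substring match: also keeps "China (estimate)".
by_pattern <- filter(toy_df, grepl("Germany|China|Japan|USA", country))

# Exact match on the full country name.
by_exact <- filter(toy_df, country %in% wanted)

# Anchored regular expression, equivalent to the exact match here.
by_anchored <- filter(toy_df, grepl("^(Germany|China|Japan|USA)$", country))

print(by_pattern$country)
print(by_exact$country)
```

For a table with clean country names both approaches give the same rows; `%in%` is simply the safer choice when labels might carry suffixes or footnote marks.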
Data scientist focusing on simulation, optimization and modeling in R, SQL, VBA and Python