Automotive industry is a key industry for German economy. As such I keep analyzing it out of private interest. My scope is not limited to Germany. E.g. US automotive sector comprises a huge portion of global market volume. In this article I analyze US earnings in automotive industry
In this article I am analyzing data provided by the US bureau of labor statistics. This data can be found here: https://www.bls.gov/iag/tgs/iagauto.htm#earnings
I analyze average monthly earnings per hour, in US automotive industry. At this point I’d like to refer to one of my former posts on US car production volume development since 2000, based on FRED data. The data provided by FRED show a strong decline in US automotive production output as well as net sales of imported vehicles in the US.
The hourly earnings data provided by US bureau of labor statistics can be obtained in the form of a Excel-spreadsheet. Data in Excel sheets can be read into R using the readxl package.
I start by reading in the data:
# read in readxl, a package in R provided on CRAN
library(readxl)
# read in the data
data_df <- data.frame(read_excel("hourly earnings.xls"))
# print the head of the data frame
head(data_df)
## Year Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
## 1 2010 21.87 21.82 21.68 21.87 21.96 22.05 21.80 21.69 22.51 22.32 22.19 22.34
## 2 2011 21.98 22.10 22.23 22.25 22.19 22.17 21.55 21.52 21.84 21.87 21.68 21.81
## 3 2012 21.72 21.63 21.39 21.58 21.14 21.42 20.82 20.98 20.94 21.05 21.37 21.21
## 4 2013 20.93 20.95 20.82 20.86 20.79 20.84 20.68 21.06 21.30 21.40 21.68 21.57
## 5 2014 21.03 21.22 21.51 21.39 21.37 21.45 21.20 21.36 21.35 21.42 21.66 21.54
## 6 2015 21.30 21.40 21.43 21.48 21.50 21.63 21.42 21.40 21.70 21.42 21.42 21.70
I will now create a new data frame that contains the annual relative changes in average hourly earnings, on a monthly level. Hence, I e.g. compare earnings in Jan 2019 relative to Jan 2018, and earnings in Jan 2018 relative to earnings in Jan 2017 – and so forth. I will structure the new data frame in a ggplot2-friendly way. ggplot2 is another R package. I will use ggplot2 for data visualization. This will comprise the last step of my analysis:
# 12 months in 10 years
rel_df <- data.frame(matrix(nrow=10*12,ncol=4))
# column names added with the colnames-function
colnames(rel_df) <- c("year","month","absoluteValue","annualChange")
# assign month names
rel_df$month <- rep(c("Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec"),times=10)
# looping through the data
for(i in 1:10){ # 10 years ; 2010 - 2019
for(j in 1:12){ # 12 months
# assign absolute values
rel_df$absoluteValue[j+(i-1)*12] = as.numeric(data_df[i,j+1])
# assign year
rel_df$year[j+(i-1)*12] = 2009+i
}
}
# make sure year is character
rel_df$year = as.character(rel_df$year)
# calculate annual relative change, per month
library(dplyr)
for(i in c("Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec")){
rel_df[rel_df$month==i,4] = (filter(rel_df,month == i)[,3] - lag(filter(rel_df,month == i)[,3]))/lag(filter(rel_df,month==i)[,3])
}
I now continue by adding date stamps to my dataframe. This will enable me to plot a graph that shows the time series of mean hourly earnings of employees in US automotive industry (not seasonally adjusted). I produce the graph using, again, ggplot2 in R.
# add date index to rel_df, by adding new column named "date"
rel_df$dateIndex <- 1:length(rel_df$absoluteValue)
# visualise time series with ggplot2
ggplot(mapping=aes(x=dateIndex,y=absoluteValue)) +
geom_line(data=rel_df,color="red",size=2,alpha=0.5) +
labs(titel="Average hourly earnings for employees in US automotive industry",
subtitle="Time series development of earnings in US automotive industry, 2010 - 2019") + xlab("time") +
ylab("average hourly earnings [USD] (not seasonally adjusted)") +
theme(axis.ticks.x = element_blank(),
axis.text.x = element_blank())
In recent years, earnings in US automotive industry have increased considerably – according to the data provided by US bureau of labor statistics. In another post I have already analyzed FRED data, showing that during the same time period both net car sales of imported vehicles as well as domestic production output have sharply declined in most recent years.
Data scientist focusing on simulation, optimization and modeling in R, SQL, VBA and Python
Leave a Reply