You may have noticed that I switched domains for my blog (don’t forget to update any bookmarks to www.tortureddata.com).  I did this because:

1) I thought it would be cool to have my own domain.

2) I wanted more control and functionality than wordpress.com offers, including the capability to embed Shiny apps!

Shiny is a package in R that allows you to create interactive visualizations of your data.  Shiny apps are extremely user friendly to create, and they can be easily uploaded and hosted on shinyapps.io so that you can share them with the world (although there are some limitations in what you can get for free).  You can see my first Shiny app below!

The main data set in the chart above can be downloaded from the Bureau of Labor Statistics (the normalization information comes from the Tax Foundation).

The top bubble plot shows the average salary for an industry in each state along the x-axis.  Along the y-axis is a multiplier that represents the ratio of what the top 10% of earners make to what the bottom 10% of earners make (a value of 2 on the y-axis means that the top 10% of earners are earning twice as much as the bottom 10%).  The size of each bubble represents the percent of the population in each state that is employed in that particular industry.  The color corresponds to the region of the state (you can turn regions on/off by clicking that region in the legend).  You can also zoom in by highlighting a section of the graph that you want to see in detail (zoom out by double clicking).
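
To make the y-axis multiplier concrete, here is a minimal sketch in Python (not the R code behind the app).  It assumes the multiplier is simply the high-end wage divided by the low-end wage; the function name and the dollar figures are made up for illustration and are not real BLS values.

```python
def pay_multiplier(low_wage, high_wage):
    """Return how many times larger the high-end wage is than the low-end wage.

    Assumption: the chart's y-axis is high_wage / low_wage. The parameter
    names are hypothetical and do not come from the BLS data files.
    """
    if low_wage <= 0:
        raise ValueError("low_wage must be positive")
    return high_wage / low_wage


# Made-up illustrative figures, not real salary data.
multiplier = pay_multiplier(low_wage=30_000, high_wage=60_000)
print(multiplier)  # 2.0, i.e. top earners make twice what bottom earners make
```

A value close to 1 would mean top and bottom earners in that industry are paid about the same.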

You can select the state you want to examine in the bottom bar chart and it will show the average salaries in that state for the occupations that fall into the selected industry.

You can also select whether you want to look at actual or normalized salary data.  The actual salary data is the salary reported for that industry in each state.  The normalized data takes into account the cost of living, and normalizes the salaries to account for the value of a dollar in each state (for example, California and D.C. tend to have the highest salaries across the board, but when you normalize for cost of living, a dollar in California will not get you as far as a dollar in Alabama).
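
The normalization can be sketched in a few lines of Python (again, not the app's actual R code).  This assumes the cost-of-living figure is expressed as what $100 is really worth in a given state, and that a normalized salary is the actual salary scaled by that value; the function name and all numbers are hypothetical.

```python
def normalize_salary(actual_salary, value_of_100_dollars):
    """Scale a salary by the local purchasing power of $100.

    Assumption: value_of_100_dollars is what $100 buys in that state
    relative to the national average (e.g. 90 means $100 buys $90 worth).
    """
    return actual_salary * value_of_100_dollars / 100.0


# Made-up figures: a high-cost state versus a low-cost state.
high_cost = normalize_salary(100_000, 90)   # 90000.0
low_cost = normalize_salary(80_000, 115)    # 92000.0
print(high_cost, low_cost)
```

In this made-up case the smaller paycheck in the low-cost state ends up ahead once the value of a dollar is taken into account, which is the kind of reshuffling the normalized view is meant to show.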

Some professions are more affected by the cost of living normalization than others.  For example, in the actual data for the Arts and Media industry, D.C., California, and New York have the largest salaries and the largest percent of their population in that field as compared to other states.  When you normalize for cost of living, those three states maintain their ranks as the highest paying in the field (but North Carolina breaks out from the pack a bit for 4th place in the normalized data).  On the other hand, when you look at Management, you can see that Delaware, New York, New Jersey and D.C. have the highest salaries in the actual data, but when you normalize, only Delaware maintains its lead.  The others fall back into a large cluster, and Rhode Island and North Carolina become 2nd and 3rd.

It’s also interesting to see which industries have the most disparity between the top and bottom earners.  It seems that the Food industry has the least disparity, with the top earners getting between 1.4 and 2.4 times what the bottom earners are receiving.  The Legal and Arts and Media fields seem to have the most disparity, with the top earners receiving 2 to 4 times what the bottom earners are receiving in every state except Hawaii (1.8 in Legal).

A few other interesting observations:

  • A higher percentage of people are employed in Protective Service in the Virgin Islands and Puerto Rico than in any other state/territory.
  • Transportation pays the best in Alaska (both actual and normalized).
  • D.C. is probably the most skewed area in terms of population in each industry, with larger-than-average populations in the Finance, Legal, Science and Arts and Media industries, and lower-than-average populations in other industries.

Feel free to leave other interesting observations in the comments!