The goal of Tortured Data is to make wise data consumers out of our over-eager data-loving culture. Don't believe me when I say we have a sometimes unhealthy reliance on data? How about the last time you wanted to talk about labeling fruit as GMO? Don't tell me nobody brought out the statistics on how much happier blue whales were before we introduced genetically modified zooplankton off the California coast. So when you were shown that chart (and it invariably looked like the chart below), if you are a data ninja (as you will be shortly), you repeated the hypnotic chant: "Association is Not Causation." (I'm thinking of Master Splinter hypnotizing Michelangelo into rejecting pizza. Splinter = Me, Michelangelo = You.)
This data comes from the United Nations Office on Drugs and Crime (UNODC). I took a sampling of countries similar to the U.S. in terms of per capita GDP, political structure, etc. This is essentially Western Europe, North America, Australia and Japan. If you're for gun control, you focus on Japan and the U.S. (circled in blue) and draw a line roughly like the blue line shown (an actual regression looks similar to this). If you're against gun control, you look at Turkey and Iceland (circled in red) and draw a line like the red line shown (this is roughly what you'd get in a regression if you dropped the U.S.).
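To see how much one circled country can matter, here is a minimal sketch in Python. The numbers are invented for illustration (they are not the UNODC figures), and `ols_slope` is just a hand-rolled least-squares helper I'm assuming for the demo. The point is only that a single far-out observation can flip the sign of a fitted line.

```python
def ols_slope(points):
    """Slope of the ordinary least-squares line through (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx


# Made-up (guns, crime) pairs. NOT real data.
cluster = [(5, 3.0), (10, 2.5), (15, 2.0), (20, 1.5), (25, 1.0), (30, 0.5)]
outlier = (90, 5.0)  # hypothetical high-gun, high-crime country

slope_with = ols_slope(cluster + [outlier])
slope_without = ols_slope(cluster)

print(f"slope with the outlier:    {slope_with:+.3f}")
print(f"slope without the outlier: {slope_without:+.3f}")
```

With these toy numbers the slope is positive when the outlier is included and negative when it is dropped, so which line you "see" depends heavily on which points you choose to keep.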
(Disclaimer: I really don't care about gun politics right now. See the forest, not the tree, young padawan.)
Let's ignore the red now. As I said, an actual regression line looks like the blue line shown. Thus the implication is more guns = more crime, and the obvious corollary, fewer guns = less crime. Thus, let us work to remove guns in any way possible. We could restrict the sale of guns in the U.S. so that we have no more than 10 guns per 100k people. That would put us right near Spain and Italy, and according to this infallible formula for crime reduction, we would have roughly 1.25 crimes per 100k people. That's nearly a 75% reduction in crime!
("But Aaron, the 2nd amendment, the right to bear arms-" No grasshopper! Listen to your sensei, and focus!)
I've created a formula that would single-handedly produce the greatest reduction in crime in the history of the U.S.! But compelling as this beautiful graph is, we must remember: association is not causation. If you're not saying that aloud, you can never expect to be Michelangelo! Now repeat! "Association is not causation!" Shout it till you mean it!
There may be a statistical association between firearms per 100k people and crime per 100k people, but that does not mean that firearms cause greater crime. More guns are associated with more crime, but we cannot say from this data alone that more guns cause more crime. Establishing causality is hard but possible in some cases with tricky math and other cool things (technical term).
In this gun scenario, we could just as easily say that crime causes the number of guns per 100k people to go up. Thus if we want to remove guns from a nation, we simply have to lower crime. BAM! *That was logic hitting you square in the face!*
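If you want to watch the data shrug at the question of direction, here is a small sketch. The figures are invented for illustration and `pearson` is a hand-written helper, not anything from the original analysis. The correlation comes out the same whichever variable you call the "cause".

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(sxx * syy)


# Made-up numbers. NOT real data.
guns = [5, 10, 15, 30, 90]
crime = [1.0, 1.2, 0.9, 2.0, 5.0]

r_guns_vs_crime = pearson(guns, crime)
r_crime_vs_guns = pearson(crime, guns)

print(f"corr(guns, crime) = {r_guns_vs_crime:.4f}")
print(f"corr(crime, guns) = {r_crime_vs_guns:.4f}")
```

The two printed values match because the formula treats both variables identically. Nothing in the number itself tells you which way the arrow points.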
To finish this up, another fun chart (real data).
As you can see, perhaps an even more viable way to reduce violent crime in the U.S. would be to eliminate all use of Internet Explorer. Bet you didn't know that Firefox and Google Chrome were the greatest crime deterrents we know of! But before you write your representative and call the White House, let's have a lesson on logic. As you now know, association is not causation. Have a look at my awesome diagram.
When we see an associative relationship, there are three possible general scenarios. One is that A causes B (causation shown by solid arrow). Similarly, another is that B causes A. As you can see, these fall under "Causation". However, the third option is the kicker. There may be a third element C that we can't measure or isn't accounted for, where C causes both A and B. Thus whenever we have C, we will see both A and B, and it may also be the case that whenever we don't have C, we see less of A and B. This would make A and B associated in our data even though neither causes the other. (A fourth possible explanation, which I haven't included, is simply that the association is purely coincidental, as is likely the case with Internet Explorer market share and murders in the U.S., especially if the scale is messed with.)
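The "C causes both A and B" scenario is easy to simulate. The sketch below is a toy model under assumptions I'm making up for the demo: C is random noise, and A and B are each just C plus their own independent noise. A never touches B and B never touches A, yet they come out correlated (with this setup the correlation should land near 0.5).

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(sxx * syy)


random.seed(0)  # fixed seed so the run is repeatable
n = 5000

# Simulated data: C is the hidden common cause of both A and B.
c = [random.gauss(0, 1) for _ in range(n)]
a = [ci + random.gauss(0, 1) for ci in c]  # A depends on C only
b = [ci + random.gauss(0, 1) for ci in c]  # B depends on C only

r_ab = pearson(a, b)

# Strip out C's contribution. This is only possible here because we built
# the data ourselves and know A and B each contain exactly one copy of C.
a_minus_c = [ai - ci for ai, ci in zip(a, c)]
b_minus_c = [bi - ci for bi, ci in zip(b, c)]
r_ab_without_c = pearson(a_minus_c, b_minus_c)

print(f"corr(A, B)               = {r_ab:.3f}")
print(f"corr(A, B) with C removed = {r_ab_without_c:.3f}")
```

Once C is accounted for, the association between A and B all but vanishes. In real data the catch is that C is often exactly the thing you did not or could not measure.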
Less abstractly, in the case of guns and crime rates, it may be the case that more guns cause more crime, or that more crime causes there to be more guns. If either of these causal relationships is established, it's much easier to know what to do if our goal is to lower crime or lower gun ownership. However, there may be another element that we're not accounting for or can't measure. An example would be cultural factors that lead to increased crime (disproportionate income distribution in urban areas, racial conflicts, prevalence of gangs, etc.). Those cultural factors may lead to more U.S. citizens owning guns for protection while also being a cause for more crime. If that is the whole story, it would mean that increased gun ownership is not causing violence.
Another example of an unknown variable could be that we have an extremely happy blue whale population off the coast of California (because we labeled all GMO food for blue whales with a bright yellow sticker and, health conscious as blue whales are, they stopped eating it), and happier blue whales mean less krill but more food for other sealife, which means Americans eat more seafood, and since everybody knows that gangs and mobs eat a ton of seafood, we have a ton of gangs and mobs, and thus more violence. Seafood also causes extreme paranoia (because of the mercury) so Americans feel a stronger need to own guns (breathe) so clearly, happier blue whales have caused more crime and more gun ownership. Damn those yellow labels! Monsanto was right all along!