Did you know that US crude oil imports from Norway correlate almost perfectly with drivers killed in collision with railway train? It’s true! Obviously, Norwegian oil magnates are murdering Americans with trains now.
It’s all from a little site called Things that correlate, which takes any one set of numbers you choose, and then dredges through a database of other numbers to find similar patterns. This is going to be useful next time I have to teach my genetics students a little basic statistics: correlation is not causation, and you can cherry-pick data sets to find all kinds of meaningless patterns.
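For the classroom, the dredging itself is easy to imitate. What follows is only a rough sketch of the idea, not how the site actually works: the "database" is 5,000 made-up random walks, the series length of 10 "years" is arbitrary, and the helper names (`pearson`, `random_walk`, `dredge`) are my own.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def random_walk(rng, length):
    """A made-up 'annual statistic': a random walk with no real meaning."""
    value, series = 0.0, []
    for _ in range(length):
        value += rng.gauss(0, 1)
        series.append(value)
    return series


def dredge(target, candidates):
    """Return (index, r) of the candidate most strongly correlated with target."""
    scored = [(i, pearson(target, c)) for i, c in enumerate(candidates)]
    return max(scored, key=lambda pair: abs(pair[1]))


rng = random.Random(42)   # fixed seed so the run is repeatable
years = 10                # short series, like a decade of annual data
target = random_walk(rng, years)
database = [random_walk(rng, years) for _ in range(5000)]  # unrelated noise

best_index, best_r = dredge(target, database)
print(f"best match: series #{best_index}, r = {best_r:.3f}")
```

Every series here is pure noise, so whatever strong match turns up is a product of searching thousands of candidates over a handful of data points, which is the whole lesson.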
anne mariehovgaard says
Unnskyld!
blf says
Biology/biomedical doctorates awarded (US) correlates with Import price of Uranium (US): 0.970393
Letters in Winning Word of Scripps National Spelling Bee correlates with Occupants of pick-up truck or van killed in collision with stationary object: 0.842844
ilgeo says
The idea that correlation does not necessarily imply causation has always been tough to teach. This site drills the concept in very effectively.
WhiteHatLurker says
Related:
Robert Novy-Marx, Predicting Anomaly Performance with Politics, the Weather, Global Warming, Sunspots, and the Stars, Journal of Financial Economics, forthcoming.
WhiteHatLurker says
Going through the website, there are a few cases where correlation makes sense – consumption of cheese.
René says
Is that a correct English spelling, motzarella?
unclefrogy says
there is something about that which reminds me of technical stock analysis, especially the graphs
uncle frogy
twas brillig (stevem) says
Yes, because it does imply causation. The important thing, to teach, is to then try to prove it!
The problem with the aphorism “Correlation is not causation” is that too often people think a single correlation is absolute proof of causation. Meanwhile, contrarians will point out, “But doesn’t all of science begin with someone seeing a correlation between two events? If correlation did NOT indicate causation, science would never get anywhere.” And the ultimate contrary: “If A causes B, B has to be correlated to A. So how can you say correlation is not due to causation?”
I agree, it is very tough to teach that implication does *not* equal certainty. That correlation just points at something to look at more closely. It’s also important to teach that unexpected non-correlation is worth looking into. There was a great Scientist who is said to have said, when his experiment didn’t give the results he was expecting, “That’s interesting, let me think about why that happened.”, rather than just saying his assistant botched the experiment, or throwing the results in the trash, etc.
mikeyb says
Of course we have to be careful because this works both ways. CO2 correlates with mean global temperature increases, but it is also a major cause. Pseudo-skeptics make these sorts of arguments to deny evolution as well as climate change, age of the earth or how inequality is correlated to low tax rates. I think the difference is that when there are independent measurements and approaches to correlate variables, the case for causation becomes much stronger. For example, there are independent radiometric dating methods which correlate the age of the earth with different radioactive elements, which make the case for an old earth nearly certain.
Correlation does not imply causation, scientific consensus does not equal settled science. But when several independent sets of correlations occur, and the scientific consensus is based on a broad array of scientific facts, then a much stronger case can be made.
I just worry that this is used just as easily to ‘refute’ good science as it is to support pseudoscience.
naturalcynic says
Or, inverse correlations, like: pirates and global warming
Trebuchet says
Not as far as I know, but it does show up in Google.
2001 and 2007 were apparently good years for pizza.
Kagehi says
lol Deaths caused by amputation of limbs vs. money spent on pets
Man, glad people figured out that buying pet food and chew toys was better than ripping off people’s limbs to feed/entertain their pets. ;)
Some of these are just…
Terska says
The problem with “correlation is not causation” is that now every troll on the internet thinks correlation means nothing at all. Correlation can be a pretty good start.
Terska says
I chose precipitation in AL. and it randomly chose to correlate it to precipitation in MS. Not so Spurious.
Inaji says
Males in Wisconsin who slipped or tripped to their death
correlates with
Cost of red delicious apples (unadjusted)
F [i'm not here, i'm gone] says
Huh, apparently nothing on solar activity and the rate of decay of radioisotopes.
ibyea says
Number of Nic Cage movies correlates with people drowning in swimming pool by 0.67.
ck says
So, to solve the problem with honey bees disappearing, we need to promote divorce in South Carolina (correlation: 0.903906), and promote marriage in Vermont (correlation: 0.937689). Getting rid of some ABA lawyers might help, too (correlation: -0.924914).
anuran says
My favorite is Consumption of Cheese and People Dying Entangled in Bedsheets at 0.95. Winsor McCay was right with his “Dream of the Rarebit Fiend”.
Crip Dyke, Right Reverend Feminist FuckToy of Death & Her Handmaiden says
No one has linked to this yet?
Make sure you read the mouseover text.
@Kagehi:
Yep: clearly the best ones are the ones where you can almost picture a plausible causation mechanism: consumption of mozzarella cheese with civil engineering doctorates awarded, for instance. Kentucky marriages and deaths by falling out of a boat is another.
Those are also the best ones for really teaching the scientific method.
chigau (違う) says
re: Crip Dyke’s link
http://m.xkcd.com/552/
for those whose devices don’t show alt-text.
numerobis says
About KY marriages v death by falling out of a boat, someone on my FB commented:
numerobis says
I figure it’s the rational argument against gay marriage that they should have tried, rather than letting some judge overturn the discriminatory law and violate the sacred principle of mob rule.
Paul Brown says
How cool is that site?
And here’s the thing. Buried in that site’s corpus, there are some (a few?) genuine examples of causation, if only in the ‘C’ causes ‘A’ and ‘B’ variety.
For example, Number of people who died by becoming tangled in their bedsheets correlates with Total revenue generated by skiing facilities (US). Why? Because each year there are more people! And ( I think it’s fair to say) population increases cause the number of people who suffered death by bed-sheet entanglement and revenue at skiing facilities to go up. Put that in your Bayesian prior and smoke it!
So yeah – superb teaching tool. Both about the limits and the utility of correlation.
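The "'C' causes 'A' and 'B'" case is easy to play with in a few lines of Python. This is only a toy sketch: the population figures, the scaling factors, and the noise levels are all invented for illustration, and nothing here comes from the site's real data.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(1)   # fixed seed so the run is repeatable
years = 30

# Hypothetical population (arbitrary units) growing steadily each year.
population = [100 + 100 * t / (years - 1) for t in range(years)]

# Two made-up quantities that each scale with population, plus independent noise.
bedsheet_deaths = [2.0 * p + rng.gauss(0, 5) for p in population]
ski_revenue = [3.0 * p + rng.gauss(0, 8) for p in population]

raw_r = pearson(bedsheet_deaths, ski_revenue)

# Dividing by population removes the shared driver; only the noise is left.
deaths_per_capita = [d / p for d, p in zip(bedsheet_deaths, population)]
revenue_per_capita = [s / p for s, p in zip(ski_revenue, population)]
per_capita_r = pearson(deaths_per_capita, revenue_per_capita)

print(f"raw counts:       r = {raw_r:.3f}")
print(f"per-capita rates: r = {per_capita_r:.3f}")
```

The raw counts track each other closely because both follow the population; once you look at per-capita rates, what remains is whatever the two noise terms happen to share, which by construction is nothing systematic.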
ernezabet says
@20, 21 thanks for the link!
jrfdeux, mode d'emploi says
Woohoo! A link I posted made it to the front page! :-D
ChasCPeterson says
This is really fucking stupid. Correlations are properly used to test hypotheses. That is, you need a valid hypothesis of a relationship (not necessarily causation, but a reason for suspecting a relationship) before it’s even worth calculating a correlation coefficient. Then you can calculate the probability associated with an observed correlation coefficient to infer the likelihood of the relationship being meaningful. To just throw random variables together is disdained by statisticians as “fishing expeditions”. It means nothing, proves nothing, and demonstrates nothing. I don’t even see what’s amusing about it.
“Spurious” indeed. Actually, I’d re-name the site “Fucking Stupid Correlations”.
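The contrast between a planned test and a fishing expedition can be sketched roughly as follows. Assumptions, loudly: the data are simulated independent noise, the p-value comes from a simple permutation test rather than the usual t-based formula (swapped in to keep the example dependency-free), and the sample size, 200 shuffles, and 100 fished pairs are arbitrary choices. A permutation test like this also assumes the observations are exchangeable, which trending annual time series generally are not.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def permutation_p_value(xs, ys, rng, n_perm=200):
    """Two-sided p-value: how often shuffled data correlates at least as strongly."""
    observed = abs(pearson(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(xs, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


rng = random.Random(7)   # fixed seed so the run is repeatable
n = 20

# Case 1: one relationship hypothesised in advance (y really depends on x).
x = [rng.gauss(0, 1) for _ in range(n)]
y = [xi + rng.gauss(0, 0.5) for xi in x]
p_planned = permutation_p_value(x, y, rng)

# Case 2: a fishing expedition over 100 pairs of unrelated noise.
alpha = 0.05
false_positives = 0
for _ in range(100):
    a = [rng.gauss(0, 1) for _ in range(n)]
    b = [rng.gauss(0, 1) for _ in range(n)]
    if permutation_p_value(a, b, rng) < alpha:
        false_positives += 1

print(f"planned test: p = {p_planned:.3f}")
print(f"fishing: {false_positives} of 100 unrelated pairs pass p < {alpha}")
```

With a 5% threshold, roughly one unrelated pair in twenty should be expected to pass by chance alone, so a "significant" hit found by trawling carries far less weight than the same p-value from a single test planned in advance.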
chigau (違う) says
Chas
Did you go to the site?
The whole point is that it’s arbitrary and silly.
numerobis says
What is the emoticon to denote the sound of the point flying high overhead?
Amelia Lewis says
My favorite used to be “consumption of ice cream” and “death by drowning”, which are apparently very strongly correlated.
(both rise with the temperature, btw … though this may be less true now than when I were a tad)
More heat == more frozen treats && more jumping into water to splash about and cool off.
Amy!
Kagehi says
Wow man, it’s like... all holistic and stuff dude, chill. Like, just a second, my calculator is telling me that the math I just did is, “A Suffusion of Yellow”. Wow, cosmic!
lol Sorry, just had to. Because, you just know there is someone out there, some place, that might actually “think” some of these things really are connected.
rorschach says
700 (presumably) Americans die each year by becoming tangled in their bedsheets? How does that work?
mykroft says
@rorschach
It might have something to do with forgetting anniversaries….
Crip Dyke, Right Reverend Feminist FuckToy of Death & Her Handmaiden says
Okay, I had to come back b/c I had a few minutes and I went back to the site for more and found this gem:
So then….some of the correlations *aren’t* spurious?
That one had me gasping.
ck says
@32,
Perhaps infants, people with physical disabilities, and the like?
unclefrogy says
I have heard it argued before that if there are things that are easy to find out and that correlate strongly with something else that is hard to predict, you can make money by betting on what the hard thing will be in the future. That seems to be how many stock market indicators are used as a way to predict price movement; that, along with the secret formulas (proprietary methods), is the story of how they beat the market. Of course, sometimes they just use phony numbers to fake it or inside information to cheat.
The story is always better analysis of correlations.
As has been said, it is ridiculous and silly, but sometimes it works and it always sells. It is a real thing.
uncle frogy
knowknot says
What I learned today:
Bicyclists tend to run into things and die because they are easily distracted by clumsy women dying.
(Except in 2007, when bicyclists were, for some reason, less interested in clumsy dying women, even though clumsy women apparently didn’t like being ignored and put some extra effort into dying.)
Project for tomorrow:
Finding ways to incite gratuitous concern in gullible persons.
Or, if I can find some kind of correlation between degrees awarded in fields related to optics and admissions to inpatient mental health facilities, create a whole new field of research (and accompanying thread) for medic0506.
CaitieCat, getaway driver says
I find the aphorism more useful if expressed as “correlation does not necessarily imply causation.”