A noteworthy addition to Sagan’s baloney detection kit

You can lie with numbers as effectively as you can with words, so this collection of rules for critically evaluating Big Data claims is timely. I think they missed at least one, though. It was caricatured in a recent xkcd:

I’m seeing a lot of these lately. For example, here are the most popular porn searches by state. I’m sorry to say that this is mostly garbage data, useless to everyone.

These data are produced by basically subtracting (or dividing) away the mean and amplifying the differences. I suspect that there is a great sea of banal commonality to porn searches, and they’re more or less the same everywhere…but all the similarities are erased to accentuate slight variations that might be minuscule. If everyone in America were searching for “insect porn”, which would be an interesting and weird piece of data, and one guy in New Hampshire slipped and typed in “incest porn” instead, New Hampshire would be lit up in these maps as the freaky state that wasn’t watching mantis copulation videos.
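To make that concrete, here is a minimal sketch of the insect/incest scenario. The counts, the state list, and the function names `most_popular` and `most_distinctive` are all invented for illustration; I have no idea how any particular map was actually built. The sketch assumes the "divide away the mean" step is a simple ratio of local share to national share.

```python
from collections import Counter

# Hypothetical search counts per state; every number here is invented.
searches = {
    "New Hampshire": Counter({"insect porn": 9999, "incest porn": 1}),
    "Vermont": Counter({"insect porn": 10000}),
    "Maine": Counter({"insect porn": 10000}),
}


def most_popular(counts):
    """Return the term searched most often, with no baseline removed."""
    return counts.most_common(1)[0][0]


def most_distinctive(state, searches):
    """Return the term whose local share most exceeds its national share."""
    national = Counter()
    for counts in searches.values():
        national.update(counts)
    national_total = sum(national.values())

    local = searches[state]
    local_total = sum(local.values())

    def lift(term):
        # Ratio of local share to national share: the "divide by the mean" step.
        local_share = local[term] / local_total
        national_share = national[term] / national_total
        return local_share / national_share

    return max(local, key=lift)


raw = {state: most_popular(counts) for state, counts in searches.items()}
relative = {state: most_distinctive(state, searches) for state in searches}

for state in searches:
    print(f"{state}: most popular = {raw[state]!r}, most distinctive = {relative[state]!r}")
```

In the raw counts all three states look identical. After dividing by the national share, a single search out of ten thousand becomes New Hampshire's headline result.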

You might be saying to yourself that this is a trivial example: OK, let’s be cautious in interpreting data techniques that rely on amplifying minor differences, since they can mislead you about the overall state of the system. However, there is a technique that gets published all over the place in the scientific literature that is doing exactly the same thing: it’s called fMRI. This is not to imply that fMRI data are bogus, because the technique is very good at detecting consistent differences in pattern. But it’s also very good at highlighting chance variation, it takes a lot of processing to smooth out the roughness in the raw data, and the whole point of the technique is to erase background activity.

I used to do ratiometric imaging, which has similar potential pitfalls. We used a fluorescent dye that would exhibit subtle wavelength shifts in the presence of calcium, so you would visualize activity in the brain by taking a photo at one wavelength, and then a second photo at a slightly different peak wavelength a fraction of a second later, and then taking the ratio of the two. If there was no shift at all, the images would be identical, so every pixel would have a ratio of 1 — which we’d scale to a displayed color of black. If a pixel was fluorescing a little more at the second wavelength, you’d have a ratio slightly greater than 1, and we’d pseudocolor that to something a little brighter.
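The ratio step is simple enough to sketch in a few lines. The pixel values below are invented, the 2×3 "images" are absurdly small, and the function names and the `max_ratio` display ceiling are assumptions of mine, not the settings of any real imaging rig.

```python
# Two tiny "images" of the same field, one per wavelength; values are invented.
first = [
    [100.0, 100.0, 100.0],
    [100.0, 100.0, 100.0],
]
second = [
    [100.0, 102.0, 100.0],
    [100.0, 110.0, 100.0],
]


def ratio_image(a, b):
    """Divide image b by image a, pixel by pixel."""
    return [
        [pb / pa for pa, pb in zip(row_a, row_b)]
        for row_a, row_b in zip(a, b)
    ]


def to_display(ratios, max_ratio=1.1):
    """Map a ratio of 1.0 to 0 (black) and max_ratio to 255, clipping outside."""
    display = []
    for row in ratios:
        out = []
        for r in row:
            level = (r - 1.0) / (max_ratio - 1.0)
            level = min(max(level, 0.0), 1.0)  # clip to the displayable range
            out.append(round(level * 255))
        display.append(out)
    return display


ratios = ratio_image(first, second)
display = to_display(ratios)

# Halving the brightness of both frames leaves every ratio unchanged.
dim_first = [[p / 2 for p in row] for row in first]
dim_second = [[p / 2 for p in row] for row in second]
dim_ratios = ratio_image(dim_first, dim_second)

for row in display:
    print(row)
```

Every pixel that did not shift comes out black, no matter how brightly it was glowing in both frames, and a 2% change is enough to register as visible signal.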

Again, this is a perfectly legitimate processing technique and the fluctuations we observed were valid and consistent, and you could even calibrate the ratios against known concentrations of calcium and get good estimates of the actual amount of free calcium at each time point in a recording. However, here’s the thing: if you looked at the raw data, or if you looked through the eyepieces at the tissue, you’d see that everything was gently glowing, and that there was actually artifactual fluorescence all over the place, and there was also continuous, low-level calcium flux everywhere, all the time. That information was discarded. Also, when you see the pseudocolored images rendered by these sorts of techniques, there’s an awful lot of point variation that is smoothed away, because we tend not to like our pretty pictures spattered with lots of salt-and-pepper noise. We blur it all out.

This is a special problem for methods like fMRI, which tend to be at a painfully low resolution (each pixel represents thousands upon thousands of cells) and are also grossly indirect: they measure oxygen and blood flow, not actual electrical activity.

To that list of Big Data cautions, I’d also add that you have to be conscious of what is actually being measured, what the limitations of the technique are, and how you can be misled by assumptions about the resolution. Data can be massaged into all kinds of ridiculous conclusions if you’re not aware of every step of its manipulation.

100 penises

As you might guess, this collection of photos is not safe for work, even though there is nothing particularly prurient about it. One hundred men stood in nearly identical poses, and were then photographed between waist and thighs, and there they are, a hundred weird-looking dinguses in an array.

What’s striking is how much variability there is. It looks to me like evolution has not been paying much attention to this feature: they all work well enough so the differences really don’t matter much. “Normal” is a word that covers a surprisingly wide range here.

Wait, what? Who is welcoming exemption from ethical review?

This will not end well. Social scientists are happy to see human studies rules relaxed.

If you took Psychology 101 in college, you probably had to enroll in an experiment to fulfill a course requirement or to get extra credit. Students are the usual subjects in social science research — made to play games, fill out questionnaires, look at pictures and otherwise provide data points for their professors’ investigations into human behavior, cognition and perception.

But who gets to decide whether the experimental protocol — what subjects are asked to do and disclose — is appropriate and ethical? That question has been roiling the academic community since the Department of Health and Human Services’s Office for Human Research Protections revised its rules in January.

The revision exempts from oversight studies involving “benign behavioral interventions.” This was welcome news to economists, psychologists and sociologists who have long complained that they need not receive as much scrutiny as, say, a medical researcher.

I would have expected social scientists to be even more acutely aware of the bias of self-interest than us clueless nerds over in the other sciences. If there’s anything we should have learned from the history of scientific experimentation it’s that scientists do not provide good ethical oversight of their own research. Some do seem to know that.

“Researchers tend to underestimate the risk of activities that they are very comfortable with,” particularly when conducting experiments and publishing the results is critical to the advancement of their careers, said Tracy Arwood, assistant vice president for research compliance at Clemson University.

Yes. Onerous and annoying as they are, we have human research review committees to specifically provide input from outside the blinkered perspective of the researcher. That’s necessary. Not everyone sees it that way.

A vocal proponent of diminishing the role of institutional review boards is Richard Nisbett, professor of psychology at the University of Michigan and co-author of the opinion piece in The Chronicle of Higher Education.

Social science researchers are perfectly capable of making their own determinations about the potential harm of their research protocols, he said. A behavioral intervention is benign, he said, if it’s the sort of thing that goes on in everyday life.

“I can ask you how much money you make or about your sex life, and you can tell me or not tell me. So, too, can a sociologist or psychologist ask you those questions,” Dr. Nisbett said.

He’s a psychology professor, and he thinks that in a study in which a professor is asking personal questions of a student, there are no social pressures on the student, and they are completely free to ignore the question? Jesus. I guess I wouldn’t trust any papers published by that guy, then.

If those questions are really benign, then shouldn’t the study proposal fly through the institutional review board process without a hitch?

I mean, I could propose a whole bunch of experiments that involve having students drink lots of vodka before undergoing various cognitive tests, and drinking to excess is the sort of thing students do in everyday life, so it must be benign, and why should I get an IRB to rubber stamp something that the students are doing anyway, am I right?

You don’t want to know what I could argue biology experimenters ought to be able to do without outside assessment because students are already doing it anyway.

The pig-man gets more press

The media do like their kooks. They’re far more entertaining than the truth. So once again, the ludicrous Eugene McCarthy, the man who believes humans are the product of hybridization between pigs and chimpanzees, gets a long write-up that dwells far too long on McCarthy’s pathetic rationalizations. His justifications are superficial and often wrong: they amount to looking in a mirror, and noticing that we’re kinda mostly hairless, just like pigs, and we have lots of body fat, like pigs, and we have organs, just like pigs, and we’re bipedal, just like pigs, and we have tusks, just like pigs, and we have nipples, just like pigs, and we have 12 of them, just like pigs, and we’re even-toed ungulates, just like pigs…you get the idea. He’s an idiot, but he’s an idiot who makes long pseudoscientific lists with sciencey terms, so he impresses the rubes.

His ill-informed views get another long airing in which he gets to present his self-pitying schtick of being a martyr to intolerant scientists, and how he’s a true revolutionary who’s going to change the modern paradigm. He’s basically full of shit. The whole article could have been truncated to its early statement of the premise:

Since the early ‘80s, he has believed that humans are the result of an errant sexual encounter between our closest relative, the chimpanzee, and the animal with which we seemingly share all aforementioned traits: the pig.

Followed by this one paragraph buried deep in the story:

The most damning refutation of McCarthy’s hypothesis is “the absence of any pig or pig-related genes in the human genome,” according to Roger Butlin, a professor of evolutionary biology at the University of Sheffield in Britain. Instead, the human genome is “entirely consistent” with the explanation that humans are great apes, most recently sharing an ancestor with living chimps and bonobos, he said.

Yep. If we were pig-chimp hybrids, it would jump out at us from the data. The sequences aren’t there. We’re done.

But the article goes on. McCarthy has an excuse — he always has an excuse.

If McCarthy did crave more recognition from mainstream experts (he doesn’t, he insists), his best bet would be to look for a signature unique to pigs in the DNA of humans but not other apes, said John McDonald, a biology professor at the University of Georgia and a former advisor of McCarthy’s.

A few years ago, McCarthy tried to do just that. He and a friend wrote a computer program to search the human genome for traces of pig hybridization. But the task was too computationally intensive. “It would have taken a lifetime to process the data on the small computers we had access to,” he said.

So he tried to reinvent BLAST, a publicly accessible program that you can run on NIH’s computers over the internet, and couldn’t get it to go. You know, ya great goofy loon, you could also pick up any of a number of molecular phylogeny papers and find that other people have done the work for you. That’s what molecular phylogeny is all about: you gather a bunch of DNA sequences from a bunch of different species, and you compare them and weigh the differences, and you throw them into a computer program that churns through all the species and all the genes and spits out a summary of how closely related they are. It’s been done! The pig and chimp lineages separated in the Cretaceous.
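For anyone who wants the gist of "compare them and weigh the differences", here is a toy sketch. The sequences are made-up ten-letter strings, not real genomic data, the species labels are placeholders, and counting mismatches between pre-aligned strings is a stand-in I chose for simplicity. Real molecular phylogeny uses proper alignment, substitution models, and tree-building methods over vastly more data.

```python
from itertools import combinations

# Made-up, pre-aligned toy sequences. NOT real genomic data.
sequences = {
    "species_A": "ACGTACGTAC",
    "species_B": "ACGTACGTAT",
    "species_C": "ACGAACGTTT",
    "species_D": "TCGAAGGTTA",
}


def hamming(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def distance_table(seqs):
    """Return the pairwise mismatch count for every pair of species."""
    return {
        (a, b): hamming(seqs[a], seqs[b])
        for a, b in combinations(sorted(seqs), 2)
    }


def closest_pair(seqs):
    """Return the pair of species with the fewest differences."""
    table = distance_table(seqs)
    return min(table, key=table.get)


table = distance_table(sequences)
for pair, distance in sorted(table.items(), key=lambda item: item[1]):
    print(pair, distance)
print("closest pair:", closest_pair(sequences))
```

The point is only that relatedness falls out of the comparison itself: the pairs with the fewest differences are the closest relatives, and nobody has to write a lifetime's worth of custom software to find that out.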

Can we just be done with this? Media, ignore the clown capering over there — there are good science stories to discuss.

I do have to end with one final quote from McCarthy.

There’s also another reason McCarthy remains so attached to his ideas: He believes altruism, not competition, is the way of the world. With neo-Darwinism and natural selection, competition is a fact of life, and that logic can be used to justify war, conflict, and ethnic cleansing (“Darwin’s biggest fan was Hitler,” he said).

That’s a common creationist claim, but it’s wrong. Hitler was not a fan of Darwin, and even if he were, it would not have the slightest implications for the truth of the theory.

She’s a good dog

Ollie is a good dog, yes she is. Such a good dog! It’s not her fault that she has ended up on the editorial boards of medical journals.

…in one respect, the Staffordshire Terrier differs radically from her canine peers: she has a burgeoning academic career, and sits on the editorial boards of seven medical journals.

As you may have guessed, the journals on whose boards Ollie sits are of the predatory variety. These are shadowy, online publications that mimic legitimate journals, but are prepared to publish anything in exchange for a fee that can run into thousands of dollars. Predatory journals prey on desperate young researchers under huge pressure to get their research published to further their careers.

Ollie’s owner is Mike Daube, Professor of Health Policy at Curtin University in Perth. Ollie likes to watch Mike working on his computer, and Mike gets a lot of emails from predatory journals. Wondering just how low these journals would go, he put together a curriculum vitae for his dog – detailing research interests such as “the benefits of abdominal massage for medium-sized canines” – and sent it off to a number of these journals, asking for a spot on their editorial boards.

She has also been asked to review papers. I suspect she’d be a harsh critic, despite being such a good dog, because usually when you put a paper on the floor, dogs poop on it. And that’s good! Good doggie!