“Science Is Endangered by Statistical Misunderstanding”

He’s baaaaaack. I’ve linked to David Colquhoun’s previous paper on p-values,[1] but I’ve since learned he’s about to publish a sequel.

Despite decades of warnings, many areas of science still insist on labelling a result of P < 0.05 as “significant”. This practice must account for a substantial part of the lack of reproducibility in some areas of science. And this is before you get to the many other well-known problems, like multiple comparisons, lack of randomisation and P-hacking. Science is endangered by statistical misunderstanding, and by university presidents and research funders who impose perverse incentives on scientists.[2]


P-values are Bullshit, 1942 edition

I keep an eye out for old criticisms of null hypothesis significance testing. There’s just something fascinating about reading this…

In this paper, I wish to examine a dogma of inferential procedure which, for psychologists at least, has attained the status of a religious conviction. The dogma to be scrutinized is the “null-hypothesis significance test” orthodoxy that passing statistical judgment on a scientific hypothesis by means of experimental observation is a decision procedure wherein one rejects or accepts a null hypothesis according to whether or not the value of a sample statistic yielded by an experiment falls within a certain predetermined “rejection region” of its possible values. The thesis to be advanced is that despite the awesome pre-eminence this method has attained in our experimental journals and textbooks of applied statistics, it is based upon a fundamental misunderstanding of the nature of rational inference, and is seldom if ever appropriate to the aims of scientific research. This is not a particularly original view—traditional null-hypothesis procedure has already been superceded in modern statistical theory by a variety of more satisfactory inferential techniques. But the perceptual defenses of psychologists are particularly efficient when dealing with matters of methodology, and so the statistical folkways of a more primitive past continue to dominate the local scene.[1]

… then realising it dates from 1960. So far I’ve spotted five waves of criticism: Jerzy Neyman and Egon Pearson head the first, dating from roughly 1928 to 1945; a number of authors such as the above-quoted Rozeboom formed a second wave between roughly 1960 and 1970; Jacob Cohen kicked off a third wave around 1990, which maybe lasted until his death in 1998; John Ioannidis spearheaded another wave in 2005, though this died out even quicker; and finally the “replication crisis” that kicked off in 2011 and is still ongoing as I type this.

I do like to search for papers outside of those waves, however, just to verify the partition. This one doesn’t qualify, but it’s pretty cool nonetheless.

Berkson, Joseph. “Tests of Significance Considered as Evidence.” Journal of the American Statistical Association 1942;37:325-35. International Journal of Epidemiology, vol. 32, no. 5, 2003, pp. 687.

For instance, they point to a specific example drawn from Ronald Fisher himself. The latter delves into a chart of eye facet frequency in Drosophila melanogaster, at various temperatures, and extracts some means. Conducting an ANOVA test, Fisher states “deviations from linear regression are evidently larger than would be expected, if the regression were really linear, from the variations within the arrays,” then concludes “There can therefore be no question of the statistical significance of the deviations from the straight line.”

Berkson’s response is to graph the dataset.

[Figure: eye facets vs. temperature, Drosophila melanogaster, graphed and fit to a line. From Fisher (1938).]

The middle points look like outliers, but it’s pretty obvious we’re dealing with a linear relationship. That Fisher’s tests reject linearity is a blow against using them.

Jacob Cohen made a very strong argument against Fisherian frequentism in 1994, the “permanent illusion,” which he attributes to a paper by Gerd Gigerenzer in 1993.[3][4] I can’t find any evidence Gigerenzer actually named it that, but it doesn’t matter; Berkson scoops both of them by a whopping 51 years, then extends the argument.

Suppose I said, “Albinos are very rare in human populations, only one in fifty thousand. Therefore, if you have taken a random sample of 100 from a population and found in it an albino, the population is not human.” This is a similar argument but if it were given, I believe the rational retort would be, “If the population is not human, what is it?” A question would be asked that demands an affirmative answer. In the null hypothesis schema we are trying only to nullify something: “The null hypothesis is never proved or established but is possibly disproved in the course of experimentation.” But ordinarily evidence does not take this form. With the corpus delicti in front of you, you do not say, “Here is evidence against the hypothesis that no one is dead.” You say, “Evidently someone has been murdered.”[5]

This hints at Berkson’s way out of the p-value mess: ditch falsification and allow evidence in favour of hypotheses. They point to another example or two to shore up their case, but can’t extend this intuition to a mathematical description of how this would work with p-values. A pity, but it was for the best.
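To put a rough number on Berkson’s albino example, here is a minimal Python sketch of the significance test he is parodying. It assumes each person in the sample is independently an albino at the one-in-fifty-thousand rate from the quote, and it uses the conventional 0.05 cutoff; the function and variable names are made up for illustration and come from neither Berkson nor Fisher.

```python
# Berkson's albino example, run as a significance test.
# Assumption: each sampled person is independently an albino with
# probability 1/50,000 (the rate quoted by Berkson).
ALBINO_RATE = 1 / 50_000
SAMPLE_SIZE = 100
ALPHA = 0.05  # conventional cutoff, as in "P < 0.05"


def p_at_least_one(rate: float, n: int) -> float:
    """Probability of one or more albinos in n independent draws."""
    return 1 - (1 - rate) ** n


# Null hypothesis: "the population is human".
p_value = p_at_least_one(ALBINO_RATE, SAMPLE_SIZE)
reject_null = p_value < ALPHA

print(f"P(at least one albino | human population) = {p_value:.4f}")
print(f"Reject 'the population is human' at alpha = {ALPHA}? {reject_null}")
```

Under those assumptions the probability comes out to roughly 0.002, far below 0.05, so the test “rejects” the hypothesis that the population is human. That is exactly the absurdity Berkson is pointing at: the calculation says nothing about what the population would be instead.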


[1] Rozeboom, William W. “The Fallacy of the Null-Hypothesis Significance Test.” Psychological Bulletin, vol. 57, no. 5, 1960, pp. 416.

[2] Berkson, Joseph. “Tests of Significance Considered as Evidence.” Journal of the American Statistical Association 1942;37:325-35. International Journal of Epidemiology, vol. 32, no. 5, 2003, pp. 687.

[3] Cohen, Jacob. “The Earth is Round (p < .05).” American Psychologist, vol. 49, no. 12, 1994, pp. 997-1003.

[4] Gigerenzer, Gerd. “The Superego, the Ego, and the Id in Statistical Reasoning.” A Handbook for Data Analysis in the Behavioral Sciences: Methodological Issues, 1993, pp. 311-339.

[5] Berkson (1942), pg. 326.

Change Of Plans

I’ve had a draft cooking for a while over Laci Green’s view of trans* people. I don’t claim to know why she’s hanging out with MRAs or treating TERFs as if they were feminists, but if she’s going to sit down and attempt to make logical arguments the least I could do is return the favor.

But then this happened.

A Quick Note on So-Called “Bathroom Bills”

[CONTENT WARNING: TERFs]

Gendered restrooms are a relatively recent phenomenon. Before then restrooms were unisex, but not in the way you’re thinking.

… public facilities in Western nations were male-only until the Victorian era, which meant women had to improvise. If they had to be out and about longer than they could hold their bladders, women in the Victorian era would urinate over a gutter (long Victorian skirts allowed for some privacy). Some would even carry a small personal device called a urinette that they could use discretely under their skirts and then pour out, [Sheila] Cavanagh said. […]

This lack of female facilities reflected a notable attitude about women: that they should stay home. This “urinary leash” remains a problem in some developing nations, said Harvey Molotch, a sociologist at New York University and co-editor of “Toilet: The Public Restroom and the Politics of Sharing” (New York University Press, 2010). Women in India today, for example, often have to avoid eating or drinking too much if they have to be out in public, because there is no place for them to go, Molotch told Live Science.

But with the rise of the Industrial Revolution and changing attitudes towards gender, forcing women back into the home wasn’t tenable. Instead, during the last quarter of the 19th century a new philosophy became dominant.

Scientific discoveries at the time showed that working women were “unable to [physically] withstand strains, fatigues, and privations as well as [men],” so sex-separated restrooms provided “a protective haven . . . where a woman could seek comfort and rest when her weak body gave out on the job.” Maintaining separate facilities that were “properly screened” also provided more privacy to both men and women with regard to their bodies and bodily functions, an obsession derived from Victorian society. By providing a separate space for the special needs of women and protecting the privacy of all workers, sex-separated bathrooms upheld “[l]ate nineteenth century concerns about germs and sanitation . . . [and] early nineteenth century ideological concerns of pure womanhood.”

Governments began mandating sex-segregated washrooms in the workplace, starting with Massachusetts in 1887. As attitudes towards women changed, however, the reasons for segregation shifted.

Though modern thinking has certainly progressed and women are not treated as inherently inferior as they once were, the current argument that sex-separated restrooms provide greater safety for women harkens back to the nineteenth century justifications for separate restroom facilities. For example, literature opposing transgender bathroom access focuses heavily on protecting the safety, privacy, and dignity of women and girls, yet rarely mentions any issues men might have with sharing a restroom with a female-to-male transsexual. Even some transsexual women wish to maintain the “safe haven in a male dominated world” of a women’s restroom “where women can have their own space without needing to worry what a man might do (in front of them, to them, or to their daughters and young sons.)” At the very least, these opinions expose an underlying belief that women and girls are more fragile than men, have a deeper need of privacy than men, and are more likely than men to be afraid or offended by the notion of sharing a restroom with a male-born transgender woman.

Faced with this information, you’d think a feminist would tread very carefully. Yes, there’s a gender imbalance in who commits sexual assault, but the historic use of washrooms to control women should give pause about banning someone else from using them.

TERFs don’t pause; they’re fully in favour of “bathroom bills.” Even when a butch lesbian gives a convincing plea against this legislation, they still find a way to justify support.

Those of us who believe that men belong in the men’s washroom come in two major types—conservatives and feminists—but this author doesn’t distinguish between the two groups. Conservatives understand that certain men will use any excuse to prey on women and children and they want to protect them. They are also homophobic and do not accept ordinary lesbians and gays, and they promote traditional gender roles and marriage. Feminists know that men with sexual fetishes like to declare that they have a gender identity and therefore have a right to expose themselves in women’s locker rooms. We differ completely from conservatives because we are against gender roles and sex stereotypes. We want the entire range of women in all our diversity to feel comfortable in women’s spaces, which will be accomplished by eliminating sexism and homophobia. […]

She’s implying here that the reason for sex-segregated facilities is the misguided notion that women need protection from men, and that people only believe that women need protection because of gender roles/stereotypes about women. But in the real world, women do need protection from men, because men abuse women on a regular basis through assault, rape, harassment, stalking, flashing, taking photos without consent, and the list goes on. Unfortunately this writer didn’t check the stats on violence against women before writing her article.

This is evidence that TERFs are not truly feminists: they advocate for the elimination of sex stereotypes, yet push a stereotyped view of sex. They are ignorant of feminist history, and advocate for sexist policies that date back to the Victorian era. The aforementioned division between “conservatives” and “feminists” is rich, especially since the two love to team up to oppose the rights of trans* people.

The real issue behind “bathroom bills” is control over who gets to enjoy the public sphere; security is secondary at best.

Mystery Solved

I’m surprised I don’t read Wonkette more often.

Rachel Maddow did a BIG SCOOP on Thursday night, and we think it’s a pretty big fuckin’ deal. To cut to the chase, somebody (she doesn’t know who YET) used her “Send It To Rachel” tool to send her something that looks like a highly classified document about collusion between Donald Trump and Russia, but is actually a FORGERY. WHOA IF TRUE, right?

It is pretty “whoa.” In fact, I was about to sit down and type something up on it until I saw Wonkette scooped me.

What’s fascinating about this weird forgery is that it appears to have been copied off the highly classified document NSA contractor Reality Winner sent to Glenn Greenwald’s The Intercept. Remember how The Intercept published a bombshell on Monday, June 5, that Russians had specifically targeted voting machine manufacturers and election officials during their 2016 cyberwar against American democracy, and that they got further than anybody ever knew? […]

Maddow found the EXACT SAME MARKINGS and the EXACT SAME CREASE on the document she got. Forgery detected! (Later in the segment she explained that there were several other screwy things about the document, including that it actually named a high-up American citizen/Trump campaign person. According to the intelligence experts Maddow consulted, this type of document, if real, wouldn’t name an American all willy-nilly like that.)

There was one intriguing mystery left: the file received by Maddow was created on June 5th, 2017, at 12:17:15, yet the Intercept’s article went online at 15:44. How could the person who sent the document get access to it before the article was published? I was about to sit down and type about that instead, but…

That’s because time stamps on the documents published by The Intercept designate the creation date included in the PDF we publish on DocumentCloud: In this case, that occurred just over three hours prior to publication of our article. Both versions — the one we published and the one Maddow received — reflect the same time to the second: literally the exact moment when we created and uploaded the document.

In other words, anyone who took the document directly from The Intercept’s site would have a document with exactly the same time stamp as the one Maddow showed. Thus, rather than proving that this document was created before The Intercept’s publication, the time stamp featured by Maddow strongly suggests exactly the opposite: that it was taken from The Intercept’s site.

Ah, thank you Glenn Greenwald. It looks like the Intercept has an automated system to process their documents. Downloading the original for myself, I can tell they use an old-ish copy of ImageMagick to do the grunt work. This probably helps them redact information; the boxes they use to cover information look digitally made, yet are burnt into the source images that make up the PDF. This could have the pleasant side-effect of wiping away the original document’s metadata, if it was digital. On the other hand, I also see the original title was “GRU-final,” which probably didn’t come from the Intercept.

I get something slightly different from Greenwald when I dump the document’s info, though.

File Modification Date/Time : 2017:06:05 13:43:03-06:00
PDF Version : 1.4
Linearized : No
Create Date : 2017:06:05 12:17:15
Modify Date : 2017:06:05 12:17:15
Page Count : 5

In his case, the “File Modification Date/Time” line reads “2017:07:06 21:33:15-04:00,” the exact time he downloaded his copy. My tool is slightly newer than his, however, which could easily explain the discrepancy.
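The “just over three hours” gap is easy to check from the metadata. What follows is a minimal Python sketch, not a description of how the Intercept or anyone else processed the file. `parse_exif_timestamp` is a helper name invented here, the timestamp layout is assumed to match the dump above, and the Create Date is assumed to be in the same time zone as the 15:44 publication time, which the metadata itself doesn’t confirm.

```python
from datetime import datetime

# Layout of the timestamps in the metadata dump: "YYYY:MM:DD HH:MM:SS",
# sometimes followed by a UTC offset such as "-06:00".
EXIF_FORMAT = "%Y:%m:%d %H:%M:%S"


def parse_exif_timestamp(value: str) -> datetime:
    """Parse the date and time, ignoring any trailing UTC offset."""
    return datetime.strptime(value[:19], EXIF_FORMAT)


# "Create Date" embedded in the PDF itself; it travels with the file.
create_date = parse_exif_timestamp("2017:06:05 12:17:15")

# Publication time of the article (assumed to be the same time zone).
published = datetime(2017, 6, 5, 15, 44)

gap = published - create_date
print(f"PDF created {gap} before publication")
```

The embedded Create Date travels with the PDF, so every copy downloaded from the same source carries it. The file modification time belongs to the local file system, which is why it differs from one downloader to the next.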

So, that’s one mystery solved: the person or people who sent the document to Maddow used the Intercept’s document as a base. That still leaves who sent it, though. Was it the Kremlin, someone associated with Trump, or somebody else? That one is in the hands of Maddow’s team.

(A hat tip to Lynna, OM in PZ’s Political Madness thread, for the Wonkette article.)

[HJH 2017-07-08: Damn time zones. And I was even going to mention them in my original post…]

The Ouroboros of Hate

I’ll confess I’ve said that if bigots were smart, they wouldn’t be bigots. Reality is a bit more complicated than that, but there is a way to rescue the sentiment.

  1. Opponents of Social Justice movements generally have a poor grasp of social justice concepts.
  2. As a consequence, some of them think these concepts lack any firm meaning. They instead act either as in-group/out-group signifiers, or as synonyms for “I don’t like you.”
  3. As a consequence, some of them have difficulty telling if these concepts are used in their proper manner.
  4. A few opponents of social justice, motivated either by a desire to show #2 to be true or simply to grief, will stage faux social justice campaigns.
  5. As a consequence, the subset mentioned in #3 will think the opponents from #4 are sincere, and given enough exposure may start thinking social justice concepts lack meaning.

I’ve seen this in action; while one group of bigots were trolling me, I saw another group think the trolling was sincere. Just recently, I spotted another example.

Older members of the crowd carried Confederate flags, while the younger, internet-driven masses wore patches with 4chan’s Kekistan banner. Rally-goers in homemade armor and semi-automatic rifles paced Houston’s Hermann Park, waiting for an enemy to appear.

The crowd, several hundred strong, gathered in the park on Saturday to defend a statue of Sam Houston, a slaveholder. They had gathered in response to reports that leftist protesters had planned a rally to remove the statue, despite Houston Mayor Sylvester Turner publicly stating that removing the statue wasn’t “even on my agenda.” But as sniper rifles and Infowars-branded jackets crowded the park, it became evident that the left protesters were not coming. They had never planned to come. The rumors of an antifa protest were actually a hoax, orchestrated by an anti-left group defending Confederate monuments.

I suspect these scenarios are more common than we realize, if only because the same thing happened again a month later.

A “patriot” who brought a revolver to Gettysburg National Military Park Saturday amid rumors of desecration of memorials accidentally shot himself in the leg Saturday. […]

Dozens of self-described Patriots came to the park about noon Saturday after hearing rumors that Antifa protesters might crash the park’s events and try to desecrate memorials. Members of Antifa caused a ruckus in Harrisburg recently at an Anti-Sharia rally and one member was arrested for swinging a wooden pole with a nail attached at a police horse.

The rumors on Saturday appeared to be just that: rumors, as no Antifa members were seen at Gettysburg park Saturday.

The result of all this is a self-supporting feedback loop, where people opposed to social justice keep getting fooled by false flags into thinking social justice is as loopy as they’ve been told, and some of them graduate to generating those false flag campaigns themselves.

Look Around You

Let’s say the Kremlin was responsible for the DNC hack, and deployed Twitter bots and trolls to drive disinformation during the recent US election. You wouldn’t expect something like this to pop up overnight; instead, it’s likely Russia has practised on its closer neighbours for years. If this were the case, you’d expect those neighbours to have plans and organisations set up to counter Kremlin influence.

Sweden has launched a nationwide school program to teach students to identify Russian propaganda. The Defense Ministry has created new units to seek out and counter Russian attempts to undermine Swedish society.

In Lithuania, 100 citizen cyber-sleuths dubbed “elves” link up digitally to identify and beat back the people employed on social media to spread Russian disinformation. They call the daily skirmishes “Elves vs. Trolls.”

In Brussels, the European Union’s East Stratcom Task Force has 14 staffers and hundreds of volunteer academics, researchers and journalists who have researched and published 2,000 examples of false or twisted stories in 18 languages in a weekly digest that began two years ago. […]

France and Britain have successfully pressured Facebook to disable tens of thousands of automated fake accounts used to sway voters close to election time, and it has doubled to 6,000 the number of monitors empowered to remove defamatory and hate-filled posts.

The German cabinet recently endorsed legislation — now before Parliament — to impose fines of up to $53 million on social-media companies that fail to remove posts deemed to be “hate speech.” Some especially notorious recent examples concerning migrants have been traced to Russian origins.

You’d also expect the Kremlin to brag about their online savvy. It would be a national source of strength and pride, after all.

Last February, a top Russian cyber official told a security conference in Moscow that Russia was working on new strategies for the “information arena” that would be equivalent to testing a nuclear bomb and would “allow us to talk to the Americans as equals.”

Andrey Krutskikh, a senior Kremlin adviser, made the startling comments at the Russian national information security forum, or “Infoforum 2016,” held Feb. 4 and 5. His remarks were transcribed by a Russian who attended the gathering and translated for me by an independent European cyber expert. […]

According to notes of Krutskikh’s speech, he told his Russian audience: “You think we are living in 2016. No, we are living in 1948. And do you know why? Because in 1949, the Soviet Union had its first atomic bomb test. And if until that moment, the Soviet Union was trying to reach agreement with [President Harry] Truman to ban nuclear weapons, and the Americans were not taking us seriously, in 1949 everything changed and they started talking to us on an equal footing.”

Krutskikh continued, “I’m warning you: We are at the verge of having ‘something’ in the information arena, which will allow us to talk to the Americans as equals.”

Putin’s cyber adviser stressed to the Moscow audience the importance for Russia of having a strong hand in this new domain. If Russia is weak, he explained, “it must behave hypocritically and search for compromises. But once it becomes strong, it will dictate to the Western partners [the United States and its allies] from the position of power.”

If you live in the United States and focus on news relevant to it, it isn’t that hard to dismiss evidence of Kremlin hacking. They haven’t done it before, right? The US is a tech leader, anyway, and would spot any attempts coming from a mile away.

If you step outside of that bubble, though, you find many more people convinced of the Kremlin’s hand, if only because they’ve felt it themselves.

When Winning Becomes Everything

Before getting to the point, though, do you mind if I be a little petty? Emphasis mine:

I was asked about my observations on technical details buried in the State Department’s release of Secretary Clinton’s emails (such as noting a hack attempt in 2011, or how Clinton’s emails might have been intercepted by Russia due to lack of encryption). I was also asked about aspects of the DNC hack, such as why I thought the “Guccifer 2” persona really was in all likelihood operated by the Russian government, and how it wasn’t necessary to rely on CrowdStrike’s attribution as blind faith; noting that I had come to the same conclusion independently based on entirely public evidence, having been initially doubtful of CrowdStrike’s conclusions.

MMmmmm.

But on to the main point: the day after Thursday’s revelation that “a GOP operative who presented himself as working with Mike Flynn, … actively solicited Clinton emails from hackers he believed to be Russian and assumed to be affiliated with the Russian government,” one of the anonymous sources became nonymous. Meet Matt Tait, a British cybersecurity researcher who’s covered that angle of American politics. Said GOP operative, Peter Smith, approached him to validate the batch of emails that were claimed to be from Hillary Clinton’s private email server.

In my conversations with Smith and his colleague, I tried to stress this point: if this dark web contact is a front for the Russian government, you really don’t want to play this game. But they were not discouraged. They appeared to be convinced of the need to obtain Clinton’s private emails and make them public, and they had a reckless lack of interest in whether the emails came from a Russian cut-out. Indeed, they made it quite clear to me that it made no difference to them who hacked the emails or why they did so, only that the emails be found and made public before the election.

Ignore the whole attribution angle of the DNC hack. Instead, let’s focus on the actions of the Republicans. They had access to illegally-obtained dirt on a rival party, and didn’t care that this dirt was illegal. All that mattered to them was winning.

This isn’t a one-off, either; yesterday I pointed to an old story about another GOP operative, Aaron Nevins, who struck a deal with “Guccifer 2.0” to use the material they gathered from local DNC chapters in local races. That material wound up being used in attack ads, and may have swayed voters. But there was also a recent report which showed that Republicans had extensively gerrymandered electoral districts, guaranteeing themselves safer seats and greater odds of winning. This lines up with prior reports. Republicans are also notorious for voter suppression, to the point that they openly brag about it and waste taxpayer funds to do it. Voter disenfranchisement? Also a Republican tactic.

This is a party devoted primarily to winning. Their policies and values are secondary, leading to an unending stream of hypocrisy. This explains a lot about why they have so much difficulty governing: the Republicans lack a unified vision to guide policy and rally everyone around. This makes it easy for outside groups to sway Republicans to their side, to the point that they even rely on them to draft some legislation.

This is poisonous for democracy. It must be opposed, no matter your political leanings.